# OW-DETR: Open-world Detection Transformer
# Introduction
Open-world object detection (OWOD) is a challenging computer vision problem, where the task is to detect a known set of object categories while simultaneously identifying unknown objects. Additionally, the model must incrementally learn new classes that become known in the next training episodes. Distinct from standard object detection, the OWOD setting poses significant challenges: generating quality candidate proposals for potentially unknown objects, separating unknown objects from the background, and detecting diverse unknown objects. Here, we introduce a novel end-to-end transformer-based framework, OW-DETR, for open-world object detection. The proposed OW-DETR comprises three dedicated components, namely attention-driven pseudo-labeling, novelty classification, and objectness scoring, to explicitly address the aforementioned OWOD challenges. Our OW-DETR explicitly encodes multi-scale contextual information, possesses less inductive bias, enables knowledge transfer from known classes to the unknown class, and can better discriminate between unknown objects and background. Comprehensive experiments are performed on two benchmarks: MS-COCO and PASCAL VOC. The extensive ablations reveal the merits of our proposed contributions. Further, our model outperforms the recently introduced OWOD approach, ORE, with absolute gains ranging from $1.8\%$ to $3.3\%$ in terms of unknown recall on MS-COCO. In the case of incremental object detection, OW-DETR outperforms the state-of-the-art for all settings on PASCAL VOC.
<br>
<p align="center" ><img width='350' src = "https://imgur.com/KXDXiAB.png"></p>
<br>
<p align="center" ><img width='500' src = "https://imgur.com/cyeMXuh.png"></p>
# Installation
### Requirements
We have trained and tested our models on `Ubuntu 16.04`, `CUDA 10.2`, `GCC 5.4`, and `Python 3.7`.
```bash
conda create -n owdetr python=3.7 pip
conda activate owdetr
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt
```
### Backbone features
Download the self-supervised backbone weights from [here](https://dl.fbaipublicfiles.com/dino/dino_resnet50_pretrain/dino_resnet50_pretrain.pth) and place the file in the `models` folder.
### Compiling CUDA operators
```bash
cd ./models/ops
sh ./make.sh
# unit test (all checks should print True)
python test.py
```
# Dataset & Results
### OWOD proposed splits
<br>
<p align="center" ><img width='500' src = "https://imgur.com/9bzf3DV.png"></p>
<br>
The splits are present inside the `data/VOC2007/OWOD/ImageSets/` folder. The remaining dataset can be downloaded using this [link](https://drive.google.com/drive/folders/1S5L-YmIiFMAKTs6nHMorB0Osz5iWI31k?usp=sharing).
The files should be organized in the following structure:
```
OW-DETR/
└── data/
└── VOC2007/
└── OWOD/
├── JPEGImages
├── ImageSets
└── Annotations
```
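As a quick sanity check before training, a short Python snippet (a sketch, not part of the repository) can verify that the expected sub-folders are in place:

```python
import os

def check_owod_layout(root="data/VOC2007/OWOD"):
    """Return the list of expected sub-folders missing under the dataset root."""
    expected = ["JPEGImages", "ImageSets", "Annotations"]
    return [d for d in expected if not os.path.isdir(os.path.join(root, d))]

missing = check_owod_layout()
if missing:
    print("Missing folders:", ", ".join(missing))
else:
    print("OWOD dataset layout looks good.")
```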
### Results
<table align="center">
<tr>
<th> </th>
<th align="center" colspan=2>Task1</th>
<th align="center" colspan=2>Task2</th>
<th align="center" colspan=2>Task3</th>
<th align="center" colspan=1>Task4</th>
</tr>
<tr>
<td align="left">Method</td>
<td align="center">U-Recall</td>
<td align="center">mAP</td>
<td align="center">U-Recall</td>
<td align="center">mAP</td>
<td align="center">U-Recall</td>
<td align="center">mAP</td>
<td align="center">mAP</td>
</tr>
<tr>
<td align="left">ORE-EBUI</td>
<td align="center">4.9</td>
<td align="center">56.0</td>
<td align="center">2.9</td>
<td align="center">39.4</td>
<td align="center">3.9</td>
<td align="center">29.7</td>
<td align="center">25.3</td>
</tr>
<tr>
<td align="left">OW-DETR</td>
<td align="center">7.5</td>
<td align="center">59.2</td>
<td align="center">6.2</td>
<td align="center">42.9</td>
<td align="center">5.7</td>
<td align="center">30.8</td>
<td align="center">27.8</td>
</tr>
</table>
### Our proposed splits
<br>
<p align="center" ><img width='500' src = "https://imgur.com/RlqbheH.png"></p>
<br>
#### Dataset Preparation
The splits are present inside `data/VOC2007/OWDETR/ImageSets/` folder.
1. Create empty `JPEGImages` and `Annotations` directories.
```
mkdir -p data/VOC2007/OWDETR/JPEGImages/
mkdir -p data/VOC2007/OWDETR/Annotations/
```
2. Download the COCO images and annotations from the [COCO dataset](https://cocodataset.org/#download) page.
3. Unzip the `train2017` and `val2017` archives. The current directory structure should look like:
```
OW-DETR/
└── data/
└── coco/
├── annotations/
├── train2017/
└── val2017/
```
4. Move all images from `train2017/` and `val2017/` to the `JPEGImages` folder.
```
cd OW-DETR/data
mv coco/train2017/*.jpg VOC2007/OWDETR/JPEGImages/
mv coco/val2017/*.jpg VOC2007/OWDETR/JPEGImages/
```
5. Use `coco2voc.py` to convert the JSON annotations to XML files.
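`coco2voc.py` is the conversion script shipped with this repository; purely as an illustration of what such a COCO-to-VOC conversion involves, here is a minimal, self-contained sketch using only the standard library (the input field names follow the standard COCO annotation format):

```python
import xml.etree.ElementTree as ET

def coco_to_voc_xml(image, annotations, categories):
    """Build a VOC-style XML annotation string from COCO-style records.

    image:       dict with 'file_name', 'width', 'height'
    annotations: list of dicts with 'category_id' and 'bbox' ([x, y, w, h])
    categories:  dict mapping category_id -> class name
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = image["file_name"]
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(image["width"])
    ET.SubElement(size, "height").text = str(image["height"])
    ET.SubElement(size, "depth").text = "3"
    for ann in annotations:
        x, y, w, h = ann["bbox"]  # COCO boxes are [x_min, y_min, width, height]
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = categories[ann["category_id"]]
        box = ET.SubElement(obj, "bndbox")
        # VOC boxes are corner coordinates (xmin, ymin, xmax, ymax)
        ET.SubElement(box, "xmin").text = str(int(x))
        ET.SubElement(box, "ymin").text = str(int(y))
        ET.SubElement(box, "xmax").text = str(int(x + w))
        ET.SubElement(box, "ymax").text = str(int(y + h))
    return ET.tostring(root, encoding="unicode")
```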
The files should be organized in the following structure:
```
OW-DETR/
└── data/
└── VOC2007/
└── OWDETR/
├── JPEGImages
├── ImageSets
└── Annotations
```
Currently, the dataloader and evaluator used by OW-DETR follow the VOC format.
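Since the dataloader expects VOC-style XML, a minimal parsing sketch (standard library only; tag names follow the VOC annotation schema, not the repository's actual dataloader code) looks like:

```python
import xml.etree.ElementTree as ET

def parse_voc_annotation(xml_string):
    """Parse a VOC XML annotation into a list of (class_name, box) tuples."""
    root = ET.fromstring(xml_string)
    objects = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        box = obj.find("bndbox")
        coords = tuple(int(box.find(k).text)
                       for k in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((name, coords))
    return objects
```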
### Results
<table align="center">
<tr>
<th> </th>
<th align="center" colspan=2>Task1</th>
<th align="center" colspan=2>Task2</th>
<th align="center" colspan=2>Task3</th>
<th align="center" colspan=1>Task4</th>
</tr>
<tr>
<td align="left">Method</td>
<td align="center">U-Recall</td>
<td align="center">mAP</td>
<td align="center">U-Recall</td>
<td align="center">mAP</td>
<td align="center">U-Recall</td>
<td align="center">mAP</td>
<td align="center">mAP</td>
</tr>
<tr>
<td align="left">ORE-EBUI</td>
<td align="center">1.5</td>
<td align="center">61.4</td>
<td align="center">3.9</td>
<td align="center">40.6</td>
<td align="center">3.6</td>
<td align="center">33.7</td>
<td align="center">31.8</td>
</tr>
<tr>
<td align="left">OW-DETR</td>
<td align="center">5.7</td>
<td align="center">71.5</td>
<td align="center">6.2</td>
<td align="center">43.8</td>
<td align="center">6.9</td>
<td align="center">38.5</td>
<td align="center">33.1</td>
</tr>
</table>
# Training
#### Training on single node
To train OW-DETR on a single node with 8 GPUs, run
```bash
./run.sh
```
#### Training on slurm cluster
To train OW-DETR on a slurm cluster with 2 nodes of 8 GPUs each, run
```bash
sbatch run_slurm.sh
```
# Evaluation
To reproduce any of the results above, run `run_eval.sh` after pointing it to the corresponding pretrained weights.