**DE⫶TR**: End-to-End Object Detection with Transformers
========
[![Support Ukraine](https://img.shields.io/badge/Support-Ukraine-FFD500?style=flat&labelColor=005BBB)](https://opensource.fb.com/support-ukraine)
PyTorch training code and pretrained models for **DETR** (**DE**tection **TR**ansformer).
We replace the full complex hand-crafted object detection pipeline with a Transformer, and match Faster R-CNN with a ResNet-50, obtaining **42 AP** on COCO using half the computation power (FLOPs) and the same number of parameters. Inference in 50 lines of PyTorch.
![DETR](.github/DETR.png)
**What it is**. Unlike traditional computer vision techniques, DETR approaches object detection as a direct set prediction problem. It consists of a set-based global loss, which forces unique predictions via bipartite matching, and a Transformer encoder-decoder architecture.
Given a fixed small set of learned object queries, DETR reasons about the relations of the objects and the global image context to directly output the final set of predictions in parallel. Due to this parallel nature, DETR is very fast and efficient.
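The bipartite matching mentioned above can be sketched with SciPy's Hungarian solver. This is a minimal toy illustration: the cost here is only an L1 distance between box centers, whereas the repo's actual matcher (`models/matcher.py`) also includes classification and generalized-IoU cost terms.

```python
# Toy sketch of DETR-style bipartite matching with the Hungarian algorithm.
import numpy as np
from scipy.optimize import linear_sum_assignment

pred_boxes = np.array([[0.5, 0.5], [0.1, 0.1], [0.9, 0.2]])  # 3 predicted box centers (cx, cy)
gt_boxes = np.array([[0.12, 0.08], [0.88, 0.22]])            # 2 ground-truth box centers

# Cost matrix: L1 distance from every prediction to every ground-truth box.
cost = np.abs(pred_boxes[:, None, :] - gt_boxes[None, :, :]).sum(-1)

# Hungarian matching yields a one-to-one assignment minimizing total cost;
# predictions left unmatched are supervised as "no object".
pred_idx, gt_idx = linear_sum_assignment(cost)
print(list(zip(pred_idx.tolist(), gt_idx.tolist())))
```

Because the assignment is one-to-one, each ground-truth object is claimed by exactly one query, which is what removes the need for non-maximum suppression.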
**About the code**. We believe that object detection should not be more difficult than classification,
and should not require complex libraries for training and inference.
DETR is very simple to implement and experiment with, and we provide a
[standalone Colab Notebook](https://colab.research.google.com/github/facebookresearch/detr/blob/colab/notebooks/detr_demo.ipynb)
showing how to do inference with DETR in only a few lines of PyTorch code.
Training code follows this idea - it is not a library,
but simply a [main.py](main.py) importing model and criterion
definitions with standard training loops.
Additionally, we provide a Detectron2 wrapper in the d2/ folder. See the readme there for more information.
For details see [End-to-End Object Detection with Transformers](https://ai.facebook.com/research/publications/end-to-end-object-detection-with-transformers) by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko.
# Model Zoo
We provide baseline DETR and DETR-DC5 models, and plan to include more in the future.
AP is computed on COCO 2017 val5k, and inference time is measured over the first 100 val5k COCO images,
with a torchscript transformer.
<table>
<thead>
<tr style="text-align: right;">
<th></th>
<th>name</th>
<th>backbone</th>
<th>schedule</th>
<th>inf_time</th>
<th>box AP</th>
<th>url</th>
<th>size</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>DETR</td>
<td>R50</td>
<td>500</td>
<td>0.036</td>
<td>42.0</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth">model</a> | <a href="https://dl.fbaipublicfiles.com/detr/logs/detr-r50_log.txt">logs</a></td>
<td>159MB</td>
</tr>
<tr>
<th>1</th>
<td>DETR-DC5</td>
<td>R50</td>
<td>500</td>
<td>0.083</td>
<td>43.3</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r50-dc5-f0fb7ef5.pth">model</a> | <a href="https://dl.fbaipublicfiles.com/detr/logs/detr-r50-dc5_log.txt">logs</a></td>
<td>159MB</td>
</tr>
<tr>
<th>2</th>
<td>DETR</td>
<td>R101</td>
<td>500</td>
<td>0.050</td>
<td>43.5</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r101-2c7b67e5.pth">model</a> | <a href="https://dl.fbaipublicfiles.com/detr/logs/detr-r101_log.txt">logs</a></td>
<td>232MB</td>
</tr>
<tr>
<th>3</th>
<td>DETR-DC5</td>
<td>R101</td>
<td>500</td>
<td>0.097</td>
<td>44.9</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r101-dc5-a2e86def.pth">model</a> | <a href="https://dl.fbaipublicfiles.com/detr/logs/detr-r101-dc5_log.txt">logs</a></td>
<td>232MB</td>
</tr>
</tbody>
</table>
COCO val5k evaluation results can be found in this [gist](https://gist.github.com/szagoruyko/9c9ebb8455610958f7deaa27845d7918).
The models are also available via torch hub;
to load DETR R50 with pretrained weights, simply run:
```python
model = torch.hub.load('facebookresearch/detr:main', 'detr_resnet50', pretrained=True)
```
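The model returns its box predictions in normalized `(center-x, center-y, width, height)` format. Below is a minimal sketch of converting those to pixel-space corner coordinates for plotting; the repository provides an equivalent `box_cxcywh_to_xyxy` in `util/box_ops.py`, while `rescale_boxes` here is an illustrative helper name.

```python
import torch

def box_cxcywh_to_xyxy(boxes: torch.Tensor) -> torch.Tensor:
    """Convert (cx, cy, w, h) boxes to (x0, y0, x1, y1)."""
    cx, cy, w, h = boxes.unbind(-1)
    return torch.stack([cx - 0.5 * w, cy - 0.5 * h,
                        cx + 0.5 * w, cy + 0.5 * h], dim=-1)

def rescale_boxes(out_bbox: torch.Tensor, img_w: int, img_h: int) -> torch.Tensor:
    # Convert to corner format, then scale normalized coords up to pixels.
    b = box_cxcywh_to_xyxy(out_bbox)
    return b * torch.tensor([img_w, img_h, img_w, img_h], dtype=torch.float32)

boxes = torch.tensor([[0.5, 0.5, 0.25, 0.5]])  # one box in the model's output format
print(rescale_boxes(boxes, 640, 480))           # tensor([[240., 120., 400., 360.]])
```

In practice you would also softmax `outputs['pred_logits']` and keep only the queries whose top class is not the "no object" class, as the Colab demo notebook shows.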
COCO panoptic val5k models:
<table>
<thead>
<tr style="text-align: right;">
<th></th>
<th>name</th>
<th>backbone</th>
<th>box AP</th>
<th>segm AP</th>
<th>PQ</th>
<th>url</th>
<th>size</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>DETR</td>
<td>R50</td>
<td>38.8</td>
<td>31.1</td>
<td>43.4</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r50-panoptic-00ce5173.pth">download</a></td>
<td>165MB</td>
</tr>
<tr>
<th>1</th>
<td>DETR-DC5</td>
<td>R50</td>
<td>40.2</td>
<td>31.9</td>
<td>44.6</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r50-dc5-panoptic-da08f1b1.pth">download</a></td>
<td>165MB</td>
</tr>
<tr>
<th>2</th>
<td>DETR</td>
<td>R101</td>
<td>40.1</td>
<td>33.0</td>
<td>45.1</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r101-panoptic-40021d53.pth">download</a></td>
<td>237MB</td>
</tr>
</tbody>
</table>
Check out our [panoptic colab](https://colab.research.google.com/github/facebookresearch/detr/blob/colab/notebooks/DETR_panoptic.ipynb)
to see how to use and visualize DETR's panoptic segmentation predictions.
# Notebooks
We provide a few Colab notebooks to help you get a grasp on DETR:
* [DETR's hands-on Colab Notebook](https://colab.research.google.com/github/facebookresearch/detr/blob/colab/notebooks/detr_attention.ipynb): Shows how to load a model from hub, generate predictions, and then visualize the model's attention (similar to the figures of the paper)
* [Standalone Colab Notebook](https://colab.research.google.com/github/facebookresearch/detr/blob/colab/notebooks/detr_demo.ipynb): In this notebook, we demonstrate how to implement a simplified version of DETR from the ground up in 50 lines of Python, and then visualize the predictions. It is a good starting point if you want to gain a better understanding of the architecture and poke around before diving into the codebase.
* [Panoptic Colab Notebook](https://colab.research.google.com/github/facebookresearch/detr/blob/colab/notebooks/DETR_panoptic.ipynb): Demonstrates how to use DETR for panoptic segmentation and plot the predictions.
# Usage - Object detection
There are no extra compiled components in DETR and package dependencies are minimal,
so the code is very simple to use. We provide instructions for installing the dependencies via conda.
First, clone the repository locally:
```
git clone https://github.com/facebookresearch/detr.git
```
Then, install PyTorch 1.5+ and torchvision 0.6+:
```
conda install -c pytorch pytorch torchvision
```
Install pycocotools (for evaluation on COCO) and scipy (for training):
```
conda install cython scipy
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
```
That's it, you should now be able to train and evaluate detection models.
(Optional) To work with panoptic segmentation, install panopticapi:
```
pip install git+https://github.com/cocodataset/panopticapi.git
```
## Data preparation
Download and extract COCO 2017 train and val images with annotations from
[http://cocodataset.org](http://cocodataset.org/#download).
We expect the directory structure to be the following:
```
path/to/coco/
annotations/ # annotation json files
train2017/ # train images
val2017/ # val images
```
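As a quick sanity check before launching training, the layout above can be verified with a small stdlib-only snippet. The annotation filenames assumed below (`instances_train2017.json`, `instances_val2017.json`) are the standard COCO 2017 names; `check_coco_layout` is an illustrative helper, not part of the repo.

```python
from pathlib import Path

def check_coco_layout(root: str) -> list:
    """Return the expected COCO subpaths that are missing under `root`."""
    expected = ["annotations/instances_train2017.json",
                "annotations/instances_val2017.json",
                "train2017", "val2017"]
    return [p for p in expected if not (Path(root) / p).exists()]

# Prints the missing pieces; an empty list means the layout looks right.
print(check_coco_layout("/path/to/coco"))
```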
## Training
To train baseline DETR on a single node with 8 GPUs for 300 epochs, run:
```
python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --coco_path /path/to/coco
```
A single epoch takes 28 minutes, so 300 epoch training takes around 6 days on a single machine with 8 V100 cards.