**DE⫶TR**: End-to-End Object Detection with Transformers
========
[![Support Ukraine](https://img.shields.io/badge/Support-Ukraine-FFD500?style=flat&labelColor=005BBB)](https://opensource.fb.com/support-ukraine)
PyTorch training code and pretrained models for **DETR** (**DE**tection **TR**ansformer).
We replace the full complex hand-crafted object detection pipeline with a Transformer, and match Faster R-CNN with a ResNet-50, obtaining **42 AP** on COCO using half the computation power (FLOPs) and the same number of parameters. Inference in 50 lines of PyTorch.
![DETR](.github/DETR.png)
**What it is**. Unlike traditional computer vision techniques, DETR approaches object detection as a direct set prediction problem. It consists of a set-based global loss, which forces unique predictions via bipartite matching, and a Transformer encoder-decoder architecture.
Given a fixed small set of learned object queries, DETR reasons about the relations of the objects and the global image context to directly output the final set of predictions in parallel. Due to this parallel nature, DETR is very fast and efficient.
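To make the bipartite matching step concrete, here is a minimal sketch in plain Python. It is not the repository's matcher: it brute-forces every one-to-one assignment, which only works for tiny inputs, whereas practical implementations use the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment`). The function name `match_predictions` and the cost values are hypothetical, and the sketch assumes there are at least as many predictions as targets.

```python
from itertools import permutations


def match_predictions(cost):
    """Find the one-to-one assignment of predictions to targets with minimal total cost.

    cost[i][j] is the cost of assigning prediction i to ground-truth target j.
    Returns (assignment, total), where assignment[j] is the prediction index
    matched to target j. Brute force: only suitable for tiny examples.
    """
    num_preds, num_targets = len(cost), len(cost[0])
    best_total, best_assignment = float("inf"), None
    # Try every ordered choice of num_targets distinct predictions.
    for chosen in permutations(range(num_preds), num_targets):
        total = sum(cost[pred][target] for target, pred in enumerate(chosen))
        if total < best_total:
            best_total, best_assignment = total, chosen
    return list(best_assignment), best_total


# Hypothetical costs for 3 predictions (rows) and 2 targets (columns).
cost = [
    [0.9, 0.2],
    [0.1, 0.8],
    [0.5, 0.6],
]
assignment, total = match_predictions(cost)
print(assignment)  # prediction index chosen for each target
```

Here prediction 1 is matched to target 0 and prediction 0 to target 1, leaving prediction 2 unmatched. In a set-prediction loss of this kind, unmatched predictions are typically supervised as "no object", which is what pushes the model toward unique predictions.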
**About the code**. We believe that object detection should not be more difficult than classification,
and should not require complex libraries for training and inference.
DETR is very simple to implement and experiment with, and we provide a
[standalone Colab Notebook](https://colab.research.google.com/github/facebookresearch/detr/blob/colab/notebooks/detr_demo.ipynb)
showing how to do inference with DETR in only a few lines of PyTorch code.
Training code follows this idea - it is not a library,
but simply a [main.py](main.py) importing model and criterion
definitions with standard training loops.
Additionally, we provide a Detectron2 wrapper in the d2/ folder. See the readme there for more information.
For details see [End-to-End Object Detection with Transformers](https://ai.facebook.com/research/publications/end-to-end-object-detection-with-transformers) by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko.
See our [blog post](https://ai.facebook.com/blog/end-to-end-object-detection-with-transformers/) to learn more about end-to-end object detection with transformers.
# Model Zoo
We provide baseline DETR and DETR-DC5 models, and plan to include more in future.
AP is computed on COCO 2017 val5k, and inference time is measured over the first 100 val5k COCO images,
with a TorchScript transformer.
<table>
<thead>
<tr style="text-align: right;">
<th></th>
<th>name</th>
<th>backbone</th>
<th>schedule</th>
<th>inf_time (s)</th>
<th>box AP</th>
<th>url</th>
<th>size</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>DETR</td>
<td>R50</td>
<td>500</td>
<td>0.036</td>
<td>42.0</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth">model</a> | <a href="https://dl.fbaipublicfiles.com/detr/logs/detr-r50_log.txt">logs</a></td>
<td>159MB</td>
</tr>
<tr>
<th>1</th>
<td>DETR-DC5</td>
<td>R50</td>
<td>500</td>
<td>0.083</td>
<td>43.3</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r50-dc5-f0fb7ef5.pth">model</a> | <a href="https://dl.fbaipublicfiles.com/detr/logs/detr-r50-dc5_log.txt">logs</a></td>
<td>159MB</td>
</tr>
<tr>
<th>2</th>
<td>DETR</td>
<td>R101</td>
<td>500</td>
<td>0.050</td>
<td>43.5</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r101-2c7b67e5.pth">model</a> | <a href="https://dl.fbaipublicfiles.com/detr/logs/detr-r101_log.txt">logs</a></td>
<td>232MB</td>
</tr>
<tr>
<th>3</th>
<td>DETR-DC5</td>
<td>R101</td>
<td>500</td>
<td>0.097</td>
<td>44.9</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r101-dc5-a2e86def.pth">model</a> | <a href="https://dl.fbaipublicfiles.com/detr/logs/detr-r101-dc5_log.txt">logs</a></td>
<td>232MB</td>
</tr>
</tbody>
</table>
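
As a quick way to compare the models, the inference times can be converted to throughput. This is a small sketch that assumes `inf_time` is reported in seconds per image (the table does not state the unit); the dictionary keys are just labels chosen for this example.

```python
# inf_time values copied from the table above.
# Assumption: they are seconds per image.
inference_times = {
    "DETR R50": 0.036,
    "DETR-DC5 R50": 0.083,
    "DETR R101": 0.050,
    "DETR-DC5 R101": 0.097,
}


def images_per_second(seconds_per_image):
    """Convert a per-image latency into throughput."""
    return 1.0 / seconds_per_image


fps = {name: round(images_per_second(t), 1) for name, t in inference_times.items()}
for name, value in fps.items():
    print(f"{name}: {value} images/s")
```

Under that assumption, the baseline R50 model runs at roughly 28 images per second, and the DC5 variants trade some of that speed for higher AP.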
COCO val5k evaluation results can be found in this [gist](https://gist.github.com/szagoruyko/9c9ebb8455610958f7deaa27845d7918).
The models are also available via torch hub.
To load DETR R50 with pretrained weights, run:
```python
model = torch.hub.load('facebookresearch/detr:main', 'detr_resnet50', pretrained=True)
```
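
Once a model is loaded, its box predictions usually need to be converted before plotting. The sketch below assumes the model returns boxes as normalized `(center_x, center_y, width, height)` values in `[0, 1]` (under a key such as `pred_boxes`); check the demo notebook to confirm this for your checkpoint. The helper names are illustrative, and plain Python tuples are used so the example runs without PyTorch.

```python
def box_cxcywh_to_xyxy(box):
    """Convert (center_x, center_y, width, height) to (x_min, y_min, x_max, y_max)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def rescale_box(box, img_w, img_h):
    """Map a normalized cxcywh box to absolute pixel corner coordinates."""
    x0, y0, x1, y1 = box_cxcywh_to_xyxy(box)
    return (x0 * img_w, y0 * img_h, x1 * img_w, y1 * img_h)


# Hypothetical prediction: a box centered in a 640x480 image,
# covering 20% of its width and 40% of its height.
pixel_box = rescale_box((0.5, 0.5, 0.2, 0.4), img_w=640, img_h=480)
print(pixel_box)
```

With real model outputs, the same arithmetic would be applied to tensors, after keeping only the predictions whose class confidence exceeds a chosen threshold.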
COCO panoptic val5k models:
<table>
<thead>
<tr style="text-align: right;">
<th></th>
<th>name</th>
<th>backbone</th>
<th>box AP</th>
<th>segm AP</th>
<th>PQ</th>
<th>url</th>
<th>size</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>DETR</td>
<td>R50</td>
<td>38.8</td>
<td>31.1</td>
<td>43.4</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r50-panoptic-00ce5173.pth">download</a></td>
<td>165MB</td>
</tr>
<tr>
<th>1</th>
<td>DETR-DC5</td>
<td>R50</td>
<td>40.2</td>
<td>31.9</td>
<td>44.6</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r50-dc5-panoptic-da08f1b1.pth">download</a></td>
<td>165MB</td>
</tr>
<tr>
<th>2</th>
<td>DETR</td>
<td>R101</td>
<td>40.1</td>
<td>33</td>
<td>45.1</td>
<td><a href="https://dl.fbaipublicfiles.com/detr/detr-r101-panoptic-40021d53.pth">download</a></td>
<td>237MB</td>
</tr>
</tbody>
</table>
Check out our [panoptic colab](https://colab.research.google.com/github/facebookresearch/detr/blob/colab/notebooks/DETR_panoptic.ipynb)
to see how to use and visualize DETR's panoptic segmentation predictions.
# Notebooks
We provide a few Colab notebooks to help you get a grasp on DETR:
* [DETR's hands-on Colab Notebook](https://colab.research.google.com/github/facebookresearch/detr/blob/colab/notebooks/detr_attention.ipynb): Shows how to load a model from hub, generate predictions, then visualize the attention of the model (similar to the figures in the paper).
* [Standalone Colab Notebook](https://colab.research.google.com/github/facebookresearch/detr/blob/colab/notebooks/detr_demo.ipynb): In this notebook, we demonstrate how to implement a simplified version of DETR from the ground up in 50 lines of Python, then visualize the predictions. It is a good starting point if you want to gain a better understanding of the architecture and poke around before diving into the codebase.
* [Panoptic Colab Notebook](https://colab.research.google.com/github/facebookresearch/detr/blob/colab/notebooks/DETR_panoptic.ipynb): Demonstrates how to use DETR for panoptic segmentation and plot the predictions.
# Usage - Object detection
There are no extra compiled components in DETR and package dependencies are minimal,
so the code is very simple to use. We provide instructions for installing the dependencies via conda.
First, clone the repository locally:
```
git clone https://github.com/facebookresearch/detr.git
```
Then, install PyTorch 1.5+ and torchvision 0.6+:
```
conda install -c pytorch pytorch torchvision
```
Install pycocotools (for evaluation on COCO) and scipy (for training):
```
conda install cython scipy
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
```
That's it; you should now be able to train and evaluate detection models.
(Optional) To work with panoptic segmentation, install panopticapi:
```
pip install git+https://github.com/cocodataset/panopticapi.git
```
## Data preparation
Download and extract COCO 2017 train and val images with annotations from
[http://cocodataset.org](http://cocodataset.org/#download).
We expect the directory structure to be the following:
```
path/to/coco/
annotations/ # annotation json files
train2017/ # train images
val2017/ # val images
```
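
Before launching a long training run, it can help to confirm that the dataset folder matches this layout. The following is a minimal sketch using only the standard library; the function name `missing_coco_dirs` is hypothetical and is not part of the repository.

```python
from pathlib import Path

# Sub-directories expected under the COCO root, per the layout above.
EXPECTED_SUBDIRS = ("annotations", "train2017", "val2017")


def missing_coco_dirs(coco_root):
    """Return the names of expected sub-directories that are absent under coco_root."""
    root = Path(coco_root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


# Example usage with the placeholder path from the layout above.
missing = missing_coco_dirs("path/to/coco")
if missing:
    print("Missing directories:", ", ".join(missing))
else:
    print("COCO directory layout looks complete.")
```

The check only looks at directory names; it does not verify that the annotation files or images inside them are complete.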
## Training
To train baseline DETR on a single node with 8 GPUs for 300 epochs, run:
```
p
```