# Multispectral-Object-Detection
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/cross-modality-fusion-transformer-for/multispectral-object-detection-on-flir)](https://paperswithcode.com/sota/multispectral-object-detection-on-flir?p=cross-modality-fusion-transformer-for)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/cross-modality-fusion-transformer-for/pedestrian-detection-on-llvip)](https://paperswithcode.com/sota/pedestrian-detection-on-llvip?p=cross-modality-fusion-transformer-for)
[![New](https://img.shields.io/badge/2021-NEW-brightgreen.svg)](https://github.com/DocF/multispectral-object-detection/)
![Visitors](https://visitor-badge.glitch.me/badge?page_id=DocF.multispectral-object-detection)
[![GitHub stars](https://img.shields.io/github/stars/DocF/multispectral-object-detection.svg?style=social&label=Stars)](https://github.com/DocF/multispectral-object-detection)
## Intro
Official Code for [Cross-Modality Fusion Transformer for Multispectral Object Detection](https://arxiv.org/abs/2111.00273).
Multispectral object detection with a Transformer fusion module and YOLOv5.
## Abstract
Multispectral image pairs provide complementary information, making object detection more reliable and robust in open-world applications.
To fully exploit the different modalities, we present a simple yet effective cross-modality feature fusion approach, named Cross-Modality Fusion Transformer (CFT) in this paper.
Unlike prior CNN-based works, our network, guided by the Transformer scheme, learns long-range dependencies and integrates global contextual information during feature extraction.
More importantly, by leveraging the Transformer's self-attention, the network naturally carries out intra-modality and inter-modality fusion simultaneously, and robustly captures the latent interactions between the RGB and thermal domains, thereby significantly improving multispectral object detection performance.
Extensive experiments and ablation studies on multiple datasets demonstrate that our approach is effective and achieves state-of-the-art detection performance.
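
The mechanism can be sketched in a few lines of PyTorch: flatten the RGB and thermal feature maps into token sequences, concatenate them, and run a standard Transformer encoder over the joint sequence, so self-attention mixes information both within and across modalities. This is a minimal illustrative sketch, not the repository's exact CFT module; the `CrossModalityFusion` name and the layer hyperparameters are assumptions.

```python
import torch
import torch.nn as nn

class CrossModalityFusion(nn.Module):
    """Sketch of Transformer-based RGB/thermal feature fusion.

    Self-attention over the concatenated token sequence performs
    intra-modality and inter-modality fusion simultaneously.
    """

    def __init__(self, channels: int, num_heads: int = 8, num_layers: int = 1):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=channels, nhead=num_heads,
            batch_first=True,  # batch_first requires PyTorch>=1.9
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor):
        # rgb, thermal: (B, C, H, W) feature maps from the two backbone streams
        b, c, h, w = rgb.shape
        tokens = torch.cat([rgb.flatten(2), thermal.flatten(2)], dim=2).transpose(1, 2)
        fused = self.encoder(tokens)  # joint self-attention over 2*H*W tokens
        rgb_out, thermal_out = fused.split(h * w, dim=1)
        # reshape token sequences back into feature maps for the detection heads
        return (rgb_out.transpose(1, 2).reshape(b, c, h, w),
                thermal_out.transpose(1, 2).reshape(b, c, h, w))
```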
### Demo
**Night Scene**
<div align="left">
<img src="https://github.com/DocF/multispectral-object-detection/blob/main/video/demo1.gif" width="600">
</div>
**Day Scene**
<div align="left">
<img src="https://github.com/DocF/multispectral-object-detection/blob/main/video/demo.gif" width="600">
</div>
### Overview
<div align="left">
<img src="https://github.com/DocF/multispectral-object-detection/blob/main/cft.png" width="800">
</div>
## Citation
If you use this repo for your research, please cite our paper:
```
@article{fang2021cross,
title={Cross-Modality Fusion Transformer for Multispectral Object Detection},
author={Fang Qingyun and Han Dapeng and Wang Zhaokui},
journal={arXiv preprint arXiv:2111.00273},
year={2021}
}
```
## Installation
Python>=3.6.0 is required, with all dependencies in requirements.txt installed, including PyTorch>=1.7 (the same as YOLOv5: https://github.com/ultralytics/yolov5).
#### Clone the repo
```bash
$ git clone https://github.com/DocF/multispectral-object-detection
```
#### Install requirements
```bash
$ cd multispectral-object-detection
$ pip install -r requirements.txt
```
## Dataset
- FLIR: [[Google Drive]](http://shorturl.at/ahAY4) [[Baidu Drive]](https://pan.baidu.com/s/1z2GHVD3WVlGsVzBR1ajSrQ?pwd=qwer) (extraction code: `qwer`). This is a newly aligned version of the dataset.
- LLVIP: [download](https://github.com/bupt-ai-cz/LLVIP)
- VEDAI: [download](https://downloads.greyc.fr/vedai/)

You need to convert all annotations to YOLOv5 format; see https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data. A conversion sketch follows.
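
YOLOv5 expects one `.txt` label file per image, with one `class x_center y_center width height` row per object, all normalized to [0, 1]. A minimal conversion sketch, assuming Pascal-VOC-style pixel boxes (`xmin, ymin, xmax, ymax`) as input; adapt the parsing to each dataset's own annotation format:

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-coordinate box to YOLO's normalized format."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height

# One label row per object: "<class_id> <x_center> <y_center> <width> <height>"
cls_id = 0  # e.g. person
box = voc_to_yolo(100, 150, 300, 400, img_w=640, img_h=512)
print(cls_id, *("%.6f" % v for v in box))
```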
## Run
#### Download the pretrained weights
Pre-trained YOLOv5 weights:
- [yolov5s](https://drive.google.com/file/d/1UGAsaOvV7jVrk0RvFVYL6Vq0K7NQLD8H/view?usp=sharing)
- [yolov5m](https://drive.google.com/file/d/1qB7L2vtlGppGjHp5xpXCKw14YHhbV4s1/view?usp=sharing)
- [yolov5l](https://drive.google.com/file/d/12OFGLF73CqTgOCMJAycZ8lB4eW19D0nb/view?usp=sharing)
- [yolov5x](https://drive.google.com/file/d/1e9xiQImx84KFQ_a7XXpn608I3rhRmKEn/view?usp=sharing)

CFT weights:
- [LLVIP](https://drive.google.com/file/d/18yLDUOxNXQ17oypQ-fAV9OS9DESOZQtV/view?usp=sharing)
- [FLIR](https://drive.google.com/file/d/1PwEOgT5ZOTjoKT2LpOzvCsxsVgwP8NIJ/view)
#### Change the data cfg
Some examples are in data/multispectral/; a sketch of the expected layout follows.
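
For orientation, a two-stream data cfg pairs RGB and thermal image paths with the class definitions. The sketch below follows the standard YOLOv5 data-YAML fields (`nc`, `names`); the two-stream key names and paths are assumptions, so treat the shipped examples in data/multispectral/ as authoritative.

```yaml
# Hypothetical two-stream data cfg; verify key names against data/multispectral/.
train_rgb: ./FLIR/visible/train   # RGB training images (assumed key name)
val_rgb: ./FLIR/visible/val
train_ir: ./FLIR/infrared/train   # thermal training images (assumed key name)
val_ir: ./FLIR/infrared/val

nc: 3                             # number of classes
names: ['person', 'car', 'bicycle']
```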
#### Change the model cfg
Some examples are in models/transformer/.
**Note:** the `xxxx_transformerx3_dataset.yaml` configs are the ones used in our paper.
### Train, Test, and Detect
Train: `python train.py`
Test: `python test.py`
Detect: `python detect_twostream.py`

An example invocation is sketched below.
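
Since the repo builds on YOLOv5, training accepts the familiar YOLOv5-style flags. The invocation below is a sketch: the flag set is assumed from upstream YOLOv5, and the data/model cfg filenames are placeholders to be replaced with your own configs.

```bash
# Assumed YOLOv5-style flags; substitute your own data/model cfgs.
$ python train.py --data data/multispectral/your_dataset.yaml \
                  --cfg models/transformer/your_model_transformerx3.yaml \
                  --weights yolov5l.pt --batch-size 8 --epochs 60 --device 0
```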
## Results
|Dataset|CFT|mAP50|mAP75|mAP|
|:---------: |------------|:-----:|:-----------------:|:-------------:|
|FLIR||73.0|32.0|37.4|
|FLIR| ✔️ |**78.7 (Δ5.7)**|**35.5 (Δ3.5)**|**40.2 (Δ2.8)**|
|LLVIP||95.8|71.4|62.3|
|LLVIP| ✔️ |**97.5 (Δ1.7)**|**72.9 (Δ1.5)**|**63.6 (Δ1.3)**|
|VEDAI||79.7|47.7|46.8|
|VEDAI| ✔️ |**85.3 (Δ5.6)**|**65.9 (Δ18.2)**|**56.0 (Δ9.2)**|
### LLVIP
Log-average miss rate (lower is better):
|Model| Log Average Miss Rate |
|:---------: |:--------------:|
|YOLOv3-RGB|37.70%|
|YOLOv3-IR|17.73%|
|YOLOv5-RGB|22.59%|
|YOLOv5-IR|10.66%|
|Baseline(Ours)|**6.91%**|
|CFT(Ours)|**5.40%**|
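
For reference, the log-average miss rate follows Dollár et al.'s protocol: sample the miss rate at nine FPPI values evenly log-spaced over [10⁻², 10⁰] and average in the log domain. A minimal sketch, assuming the miss-rate/FPPI curve is already available as arrays sorted by ascending FPPI:

```python
import numpy as np

def log_average_miss_rate(miss_rate, fppi):
    """Average miss rate at 9 FPPI points log-spaced in [1e-2, 1e0]."""
    refs = np.logspace(-2.0, 0.0, num=9)
    # interpolate the MR-FPPI curve (fppi must be sorted ascending)
    mr_at_refs = np.interp(np.log(refs), np.log(fppi), miss_rate)
    # clamp to avoid log(0), then average in the log domain
    return float(np.exp(np.mean(np.log(np.maximum(mr_at_refs, 1e-10)))))
```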
Miss Rate - FPPI curve
<div align="left">
<img src="https://github.com/DocF/multispectral-object-detection/blob/main/MR.png" width="500">
</div>
#### References
https://github.com/ultralytics/yolov5