<div align="left">
## You Only Look at Once for Real-time and Generic Multi-Task
This repository (YOLOv8 multi-task) is the official PyTorch implementation of the paper "You Only Look at Once for Real-time and Generic Multi-Task".
> [**You Only Look at Once for Real-time and Generic Multi-Task**](https://arxiv.org/pdf/2310.01641.pdf)
>
> by Jiayuan Wang, [Q. M. Jonathan Wu](https://scholar.google.com/citations?user=BJSAsE8AAAAJ&hl=zh-CN)<sup> :email:</sup> and [Ning Zhang](https://scholar.google.ca/citations?hl=zh-CN&user=ZcYihtoAAAAJ)
>
> (<sup>:email:</sup>) corresponding author.
>
> *arXiv technical report ([arXiv 2310.01641](https://arxiv.org/pdf/2310.01641.pdf))*
---
### Update:
Nov. 5: We revised the structure of the segmentation neck and will update the manuscript soon. Please refer to this repository's model structure, which matches the code.
### The Illustration of A-YOLOM
![YOLOv8-multi-task](pictures/constructure.jpg)
### Contributions
* We have developed a lightweight model capable of integrating three tasks into a single unified model. This is particularly beneficial for multi-task applications that demand real-time processing.
* We have designed a novel Adaptive Concatenate Module specifically for the neck region of segmentation architectures. This module can adaptively concatenate features without manual design, further enhancing the model's generality.
* We designed a lightweight, simple, and generic segmentation head built only from a series of convolutional layers. Task heads of the same type share a unified loss function, so no task-specific loss design is needed.
* Extensive experiments on publicly accessible autonomous driving datasets demonstrate that our model outperforms existing works, particularly in inference time and visualization quality. We further conducted experiments on real road datasets, where our model also significantly outperformed state-of-the-art approaches.
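To illustrate the idea behind such an adaptive concatenation, here is a minimal NumPy sketch: branch weights are softmax-normalised before the features are fused, so no manual weighting needs to be designed. The function name, the softmax weighting scheme, and the toy shapes are illustrative assumptions, not the repository's actual module:

```python
import numpy as np

def adaptive_concat(feat_a, feat_b, logits):
    # Softmax-normalise two scalars (stand-ins for learnable parameters)
    # so the branch weights sum to 1, then scale each feature map before
    # concatenating along the channel axis.
    w = np.exp(logits) / np.exp(logits).sum()
    return np.concatenate([w[0] * feat_a, w[1] * feat_b], axis=0)

# Two toy feature maps of shape (channels, height, width).
a = np.ones((4, 8, 8))
b = np.ones((4, 8, 8))

# Equal logits give each branch a weight of 0.5.
fused = adaptive_concat(a, b, np.array([0.0, 0.0]))
print(fused.shape)  # (8, 8, 8)
```

In the actual model the logits would be trained end-to-end, letting the network decide how strongly each branch contributes at every fusion point.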
### Results
#### Parameters and speed
| Model | Parameters | FPS (bs=1) | FPS (bs=32) |
|----------------|-------------|------------|-------------|
| YOLOP | 7.9M | 26.0 | 134.8 |
| HybridNet | 12.83M | 11.7 | 26.9 |
| YOLOv8n(det) | 3.16M | 102 | 802.9 |
| YOLOv8n(seg) | 3.26M | 82.55 | 610.49 |
| A-YOLOM(n) | 4.43M | 39.9 | 172.2 |
| A-YOLOM(s) | 13.61M | 39.7 | 96.2 |
#### Traffic Object Detection Result
| Model | Recall (%) | mAP50 (%) |
|-------------|------------|------------|
| MultiNet | 81.3 | 60.2 |
| DLT-Net | **89.4** | 68.4 |
| Faster R-CNN| 81.2 | 64.9 |
| YOLOv5s | 86.8 | 77.2 |
| YOLOv8n(det)| 82.2 | 75.1 |
| YOLOP | 88.6 | 76.5 |
| A-YOLOM(n) | 85.3 | 78.0 |
| A-YOLOM(s) | 86.9 | **81.1** |
#### Drivable Area Segmentation Result
| Model | mIoU (%) |
|----------------|----------|
| MultiNet | 71.6 |
| DLT-Net | 72.1 |
| PSPNet | 89.6 |
| YOLOv8n(seg) | 78.1 |
| YOLOP | **91.6** |
| A-YOLOM(n) | 90.5 |
| A-YOLOM(s) | 91.0 |
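For reference, the mIoU metric reported in the table above averages the per-class intersection-over-union between the predicted and ground-truth masks. A minimal NumPy sketch (function name and toy masks are illustrative, not the repository's evaluation code):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    # Per-class IoU = |pred ∩ target| / |pred ∪ target|, averaged over
    # the classes that appear in either mask.
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x2 masks: class 0 IoU = 1/2, class 1 IoU = 2/3.
pred   = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
print(mean_iou(pred, target, num_classes=2))  # ≈ 0.583
```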
#### Lane Detection Result
| Model | Accuracy (%) | IoU (%) |
|----------------|--------------|---------|
| ENet | N/A | 14.64 |
| SCNN | N/A | 15.84 |
| ENet-SAD | N/A | 16.02 |
| YOLOv8n(seg) | 80.5 | 22.9 |
| YOLOP | 84.8 | 26.5 |
| A-YOLOM(n) | 81.3 | 28.2 |
| A-YOLOM(s) | **84.9** | **28.8** |
#### Ablation Studies 1: Adaptive Concatenation Module
| Training method | Recall (%) | mAP50 (%) | mIoU (%) | Accuracy (%) | IoU (%) |
|-----------------|------------|-----------|----------|--------------|---------|
| YOLOM(n) | 85.2 | 77.7 | 90.6 | 80.8 | 26.7 |
| A-YOLOM(n) | 85.3 | 78.0 | 90.5 | 81.3 | 28.2 |
| YOLOM(s) | 86.9 | 81.1 | 90.9 | 83.9 | 28.2 |
| A-YOLOM(s) | 86.9 | 81.1 | 91.0 | 84.9 | 28.8 |
#### Ablation Studies 2: Results of different multi-task models and segmentation structures
| Model | Parameters | mIoU (%) | Accuracy (%) | IoU (%) |
|----------------|------------|----------|--------------|---------|
| YOLOv8(segda) | 1004275 | 78.1 | - | - |
| YOLOv8(segll) | 1004275 | - | 80.5 | 22.9 |
| YOLOv8(multi) | 2008550 | 84.2 | 81.7 | 24.3 |
| YOLOM(n) | 15880 | 90.6 | 80.8 | 26.7 |
For YOLOv8(multi) and YOLOM(n), the parameter counts cover only the two segmentation heads. Both models actually have three heads; we omit the detection-head parameters because this ablation study focuses on the segmentation structure.
**Notes**:
- The works we referenced include `MultiNet` ([paper](https://arxiv.org/pdf/1612.07695.pdf?utm_campaign=affiliate-ir-Optimise%20media%28%20South%20East%20Asia%29%20Pte.%20ltd._156_-99_national_R_all_ACQ_cpa_en&utm_content=&utm_source=%20388939), [code](https://github.com/MarvinTeichmann/MultiNet)), `DLT-Net` ([paper](https://ieeexplore.ieee.org/abstract/document/8937825)), `Faster R-CNN` ([paper](https://proceedings.neurips.cc/paper/2015/file/14bfa6bb14875e45bba028a21ed38046-Paper.pdf), [code](https://github.com/ShaoqingRen/faster_rcnn)), `YOLOv5s` ([code](https://github.com/ultralytics/yolov5)), `PSPNet` ([paper](https://openaccess.thecvf.com/content_cvpr_2017/papers/Zhao_Pyramid_Scene_Parsing_CVPR_2017_paper.pdf), [code](https://github.com/hszhao/PSPNet)), `ENet` ([paper](https://arxiv.org/pdf/1606.02147.pdf), [code](https://github.com/osmr/imgclsmob)), `SCNN` ([paper](https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/download/16802/16322), [code](https://github.com/XingangPan/SCNN)), `SAD-ENet` ([paper](https://openaccess.thecvf.com/content_ICCV_2019/papers/Hou_Learning_Lightweight_Lane_Detection_CNNs_by_Self_Attention_Distillation_ICCV_2019_paper.pdf), [code](https://github.com/cardwing/Codes-for-Lane-Detection)), `YOLOP` ([paper](https://link.springer.com/article/10.1007/s11633-022-1339-y), [code](https://github.com/hustvl/YOLOP)), `HybridNets` ([paper](https://arxiv.org/abs/2203.09035), [code](https://github.com/datvuthanh/HybridNets)), and `YOLOv8` ([code](https://github.com/ultralytics/ultralytics)). Thanks for their wonderful work.
---
### Visualization
#### Real Road
![Real Road](pictures/real-road.png)
---
### Requirement
This codebase has been developed with [**Python==3.7.16**](https://www.python.org/) and [**PyTorch==1.13.1**](https://pytorch.org/get-started/locally/).
A 1080 Ti GPU with a batch size of 16 works fine; training just takes longer. We recommend a 4090 or a more powerful GPU for faster training.
We strongly recommend creating a clean environment and following our instructions to build yours. Otherwise, you may encounter issues: YOLOv8 has mechanisms that automatically detect the packages in your environment and may change some variable values, which can affect how the code runs.
```setup
cd YOLOv8-multi-task
pip install -e .
```
### Data preparation and Pre-trained model
#### Download
- Download the images from [images](https://bdd-data.berkeley.edu/).
- Pre-trained model: [A-YOLOM](https://uwin365-my.sharepoint.com/:f:/g/personal/wang621_uwindsor_ca/EnoHyXIbTGFDjv1KLccuvrsBWLz6R4_TNVxErMukwCL0mw?e=I8WcKc), which includes two versions: scales "n" and "s".
- Download the annotations of detection from [detection-object](https://uwin365-my.sharepoint.com/:u:/g/personal/wang621_uwindsor_ca/EflGScMT-D1MqBTTYUSMdaEBT1wWm5uB8BausmS7fDLsQQ?e=cb7age).
- Download the annotations of drivable area segmentation from [seg-drivable-10](https: