# TensorRT_Inference_Demo
<div align="center">
<img src="assets/000000005001.jpg" height="160px" width="180px" >
<img src="assets/000000142324.jpg" height="160px" width="180px" >
<img src="assets/000000007816.jpg" height="160px" width="180px" >
<img src="assets/000000017899.jpg" height="160px" width="150px" >
<img src="assets/000000157807.jpg" height="160px" width="180px" >
<img src="assets/000000294695.jpg" height="160px"
width="180px" >
<img src="assets/000000579158.jpg" height="160px"
width="180px" >
<img src="assets/000000007977.jpg" height="160px" width="150px" >
</div>
<div align="center">
[![Cuda](https://img.shields.io/badge/CUDA-11.3-%2376B900?logo=nvidia)](https://developer.nvidia.com/cuda-toolkit-archive)
[![](https://img.shields.io/badge/TensorRT-8.6.0.12-%2376B900.svg?style=flat&logo=tensorrt)](https://developer.nvidia.com/nvidia-tensorrt-8x-download)
[![](https://img.shields.io/badge/ubuntu-20.04-orange.svg?style=flat&logo=ubuntu)](https://releases.ubuntu.com/20.04/)
</div>
## 1.Introduction
This repo uses TensorRT 8.x to deploy well-trained models. Both image preprocessing and postprocessing run on CUDA, which enables high-speed inference.
## 2.Update
<details open>
<summary>update process</summary>
+ 2023.05.01 Create the repo.
+ 2023.05.03 Support yolov5 detection.
+ 2023.05.05 Support yolov7 and yolov5 instance-segmentation.
+ 2023.05.10 Support yolov8 detection and instance-segmentation.
+ 2023.05.12 Support CUDA preprocessing for speedup.
+ 2023.05.16 Support CUDA box postprocessing.
+ 2023.05.19 Support CUDA mask postprocessing and support rtdetr.
+ 2023.05.21 Support yolov6.
+ 2023.05.26 Support dynamic batch inference.
+ 2023.06.07 Support yolox and yolo-nas.
</details>
## 3.Support Models
<details open>
<summary>supported models</summary>
- [x] [YOLOv5](https://github.com/ultralytics/yolov5)<br>
- [x] [YOLOv5-seg](https://github.com/ultralytics/yolov5)<br>
- [x] [YOLOv6](https://github.com/meituan/YOLOv6)<br>
- [x] [YOLOv7](https://github.com/WongKinYiu/yolov7)<br>
- [x] [YOLOv8](https://github.com/ultralytics/ultralytics)<br>
- [x] [YOLOv8-seg](https://github.com/ultralytics/ultralytics)<br>
- [x] [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX)<br>
- [x] [RT-DETR](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rtdetr)<br>
- [x] [YOLO-NAS](https://github.com/Deci-AI/super-gradients)<br>
</details>
All speed tests were performed on an RTX 3090 with the COCO val set. The reported time is the sum of image loading, preprocessing, inference, and postprocessing, so the FPS is lower than the figures reported in the papers.
<div align='center'>
| Models | BatchSize | Mode | Resolution | FPS |
|-|-|:-:|:-:|:-:|
| YOLOv5-s v7.0 | 1 | FP32 | 640x640 | 200 |
| YOLOv5-s v7.0 | 32 | FP32 | 640x640 | 246 |
| YOLOv5-seg-s v7.0 | 1 | FP32 | 640x640 | 155 |
| YOLOv6-s v3 | 1 | FP32 | 640x640 | 163 |
| YOLOv7 | 1 | FP32 | 640x640 | 107 |
| YOLOv8-s | 1 | FP32 | 640x640 | 171 |
| YOLOv8-seg-s | 1 | FP32 | 640x640 | 122 |
| YOLOX-s | 1 | FP32 | 640x640 | 156 |
| YOLO-NAS-s | 1 | FP32 | 640x640 | 165 |
| RT-DETR | 1 | FP32 | 640x640 | 106 |
</div>
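
To make the relation between the FPS column and the per-stage times concrete, here is a minimal sketch that sums the four stages named above and converts the total to frames per second. The function name `end_to_end_fps` and the latency values are hypothetical placeholders chosen for illustration; they are not measurements from this repo.

```python
def end_to_end_fps(stage_ms, batch_size=1):
    """Return images per second given per-batch stage latencies in milliseconds."""
    total_ms = sum(stage_ms.values())
    if total_ms <= 0:
        raise ValueError("total latency must be positive")
    return batch_size * 1000.0 / total_ms


# Hypothetical per-batch latencies in milliseconds (placeholders, not measured).
stages = {"load": 2.0, "preprocess": 0.5, "inference": 2.0, "postprocess": 0.5}

fps = end_to_end_fps(stages)                       # all four stages counted
inference_only_fps = 1000.0 / stages["inference"]  # what papers often report

print(f"end-to-end: {fps:.0f} FPS, inference only: {inference_only_fps:.0f} FPS")
```

With these placeholder numbers the end-to-end figure is well below the inference-only figure, which is the gap the paragraph above describes.
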
## 4.Usage
1. Clone the repo.
```
git clone https://github.com/Li-Hongda/TensorRT_Inference_Demo.git
```
2. Install the dependencies.
### TensorRT
Follow the [NVIDIA official docs](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing) to install TensorRT.
### yaml-cpp
```
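git clone https://github.com/jbeder/yaml-cpp
# Enter the cloned source tree before creating the build directory
cd yaml-cpp
mkdir build && cd build
# Default (static) build
cmake ..
make -j$(nproc)
# Rebuild as a shared library
cmake -DYAML_BUILD_SHARED_LIBS=on ..
make -j$(nproc)
# Return to the directory you started from
cd ../..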
```
3. Change the path [here](https://github.com/Li-Hongda/TensorRT_Inference_Demo/blob/main/object_detection/CMakeLists.txt#L19) to your TensorRT path, and [here](https://github.com/Li-Hongda/TensorRT_Inference_Demo/blob/main/object_detection/CMakeLists.txt#L11) to your CUDA path. Then build the project:
```
cd TensorRT_Inference_Demo/object_detection
mkdir build && cd build
cmake ..
make -j$(nproc)
```
4. Get the ONNX model from the official repository and put it in `weights/MODEL_NAME`. Then modify the configuration file in `configs`. Take yolov5 as an example:
```
python export.py --weights=yolov5s.pt --dynamic --simplify --include=onnx --opset 11
```
5. If compilation succeeds, the executable is generated in `bin` under the repo directory. Then run it with a command like this:
```
cd bin
./object_detection yolov5 /path/to/input/dir
```
> Notes:
> 1. Post-processing expects the model output in the layout num_bboxes (imageHeight x imageWidth) x num_pred (num_cls + coordinates + confidence), whereas the output of YOLOv8 is num_pred x num_bboxes, which means the predicted values of the same box are not contiguous in memory. For convenience, transpose the corresponding dimensions of the original PyTorch output when exporting to the ONNX model.
> 2. The dynamic shape engine is convenient but sacrifices some inference speed compared with a static engine of the same batch size. If you want faster inference, export the ONNX model with a fixed batch size, such as 32.
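
The memory-layout point in note 1 can be sketched in plain Python. This is only an illustration of the index arithmetic, assuming a flat row-major buffer; the helper name `transpose_pred_major` is hypothetical, and in practice the transpose would be applied to the PyTorch output tensor at export time rather than on a Python list.

```python
def transpose_pred_major(flat, num_pred, num_bboxes):
    """Convert a flat [num_pred, num_bboxes] buffer to [num_bboxes, num_pred]."""
    assert len(flat) == num_pred * num_bboxes
    out = [0.0] * len(flat)
    for p in range(num_pred):
        for b in range(num_bboxes):
            # Source index is pred-major, destination index is box-major.
            out[b * num_pred + p] = flat[p * num_bboxes + b]
    return out


num_pred, num_bboxes = 3, 2

# YOLOv8-style layout: the values of box 0 sit at indices 0, 2, 4,
# i.e. they are separated by a stride of num_bboxes.
pred_major = [10, 20, 11, 21, 12, 22]  # box 0 = 10, 11, 12; box 1 = 20, 21, 22

box_major = transpose_pred_major(pred_major, num_pred, num_bboxes)
print(box_major)  # → [10, 11, 12, 20, 21, 22]
```

After the transpose, each box occupies `num_pred` consecutive elements, so a post-processing step can read one box with a single contiguous slice.
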
## 5.Results
Bilibili Demo: [![](https://img.shields.io/badge/bilibili-blue.svg?logo=bilibili)](https://www.bilibili.com/video/BV1Th4y1d7z3/?spm_id_from=333.999.0.0&vd_source=bd091b2fb1789d450ff2736f81a6912a)
## 6.Reference
[0].https://github.com/NVIDIA/TensorRT<br>
[1].https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#c_topics<br>
[2].https://github.com/linghu8812/tensorrt_inference<br>
[3].https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#<br>
[4].https://blog.csdn.net/bobchen1017?type=blog<br>