# torch2trt
torch2trt is a PyTorch to TensorRT converter which utilizes the
TensorRT Python API. The converter is
* Easy to use - Convert modules with a single function call ``torch2trt``
* Easy to extend - Write your own layer converter in Python and register it with ``@tensorrt_converter``
If you find an issue, please [let us know](../../issues)!
> Please note, this converter has limited coverage of TensorRT / PyTorch. We created it primarily
> to easily optimize the models used in the [JetBot](https://github.com/NVIDIA-AI-IOT/jetbot) project. If you find the converter helpful with other models, please [let us know](../../issues).
## Usage
Below are some usage examples; for more, check out the [notebooks](notebooks).
### Convert
```python
import torch
from torch2trt import torch2trt
from torchvision.models.alexnet import alexnet
# create some regular pytorch model...
model = alexnet(pretrained=True).eval().cuda()
# create example data
x = torch.ones((1, 3, 224, 224)).cuda()
# convert to TensorRT feeding sample data as input
model_trt = torch2trt(model, [x])
```
### Execute
We can execute the returned ``TRTModule`` just like the original PyTorch model.
```python
y = model(x)
y_trt = model_trt(x)
# check the output against PyTorch
print(torch.max(torch.abs(y - y_trt)))
```
### Save and load
We can save the model as a ``state_dict``.
```python
torch.save(model_trt.state_dict(), 'alexnet_trt.pth')
```
We can load the saved model into a ``TRTModule``
```python
from torch2trt import TRTModule
model_trt = TRTModule()
model_trt.load_state_dict(torch.load('alexnet_trt.pth'))
```
## Models
We tested the converter against these models using the [test.sh](test.sh) script. You can generate the results by calling
```bash
./test.sh TEST_OUTPUT.md
```
> The results below show the throughput in FPS. You can find the raw output, which includes latency, in the [benchmarks folder](benchmarks).
| Model | Nano (PyTorch) | Nano (TensorRT) | Xavier (PyTorch) | Xavier (TensorRT) |
|-------|:--------------:|:---------------:|:----------------:|:-----------------:|
| alexnet | 46.4 | 69.9 | 250 | 580 |
| squeezenet1_0 | 44 | 137 | 130 | 890 |
| squeezenet1_1 | 76.6 | 248 | 132 | 1390 |
| resnet18 | 29.4 | 90.2 | 140 | 712 |
| resnet34 | 15.5 | 50.7 | 79.2 | 393 |
| resnet50 | 12.4 | 34.2 | 55.5 | 312 |
| resnet101 | 7.18 | 19.9 | 28.5 | 170 |
| resnet152 | 4.96 | 14.1 | 18.9 | 121 |
| densenet121 | 11.5 | 41.9 | 23.0 | 168 |
| densenet169 | 8.25 | 33.2 | 16.3 | 118 |
| densenet201 | 6.84 | 25.4 | 13.3 | 90.9 |
| densenet161 | 4.71 | 15.6 | 17.2 | 82.4 |
| vgg11 | 8.9 | 18.3 | 85.2 | 201 |
| vgg13 | 6.53 | 14.7 | 71.9 | 166 |
| vgg16 | 5.09 | 11.9 | 61.7 | 139 |
| vgg19 | | | 54.1 | 121 |
| vgg11_bn | 8.74 | 18.4 | 81.8 | 201 |
| vgg13_bn | 6.31 | 14.8 | 68.0 | 166 |
| vgg16_bn | 4.96 | 12.0 | 58.5 | 140 |
| vgg19_bn | | | 51.4 | 121 |
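The FPS numbers above come from [test.sh](test.sh), which times repeated forward passes. As a rough illustration, throughput can be measured like this (a minimal pure-Python sketch with a stand-in workload; `measure_fps` is a hypothetical helper, not part of torch2trt, and the real script also reports latency and synchronizes CUDA around the timed region):

```python
import time

def measure_fps(fn, x, warmup=5, iters=50):
    """Time repeated calls to fn(x) and return throughput in calls per second."""
    for _ in range(warmup):      # warm-up calls are excluded from timing
        fn(x)
    start = time.perf_counter()
    for _ in range(iters):
        fn(x)
    elapsed = time.perf_counter() - start
    return iters / elapsed

# Stand-in workload; with torch2trt you would time model_trt(x) on a CUDA
# tensor, calling torch.cuda.synchronize() before reading the clock.
fps = measure_fps(lambda n: sum(i * i for i in range(n)), 10_000)
print(f"{fps:.1f} FPS")
```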
## Setup
> Note: torch2trt depends on the TensorRT Python API. On Jetson, this is included with the latest JetPack. For desktop, please follow the [TensorRT Installation Guide](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html). You may also try installing torch2trt inside one of the NGC PyTorch docker containers for [Desktop](https://ngc.nvidia.com/catalog/containers/nvidia:pytorch) or [Jetson](https://ngc.nvidia.com/catalog/containers/nvidia:l4t-pytorch).
### Step 1 - Install the torch2trt Python library
To install the torch2trt Python library, call the following
```bash
git clone https://github.com/NVIDIA-AI-IOT/torch2trt
cd torch2trt
python setup.py install
```
### Step 2 (optional) - Install the torch2trt plugins library
To install the torch2trt plugins library, call the following
```bash
cmake -B build . && cmake --build build --target install && ldconfig
```
This includes support for some layers which may not be supported natively by TensorRT. Once this library is found in the system, the associated layer converters in torch2trt are implicitly enabled.
> Note: torch2trt now maintains plugins as an independent library compiled with CMake. This makes compiled TensorRT engines more portable. If needed, the deprecated plugins (which depend on PyTorch) may still be installed by calling ``python setup.py install --plugins``.
### Step 3 (optional) - Install experimental community contributed features
To install torch2trt with experimental community contributed features under ``torch2trt.contrib``, such as Quantization Aware Training (QAT) (requires TensorRT >= 7.0), call the following
```bash
git clone https://github.com/NVIDIA-AI-IOT/torch2trt
cd torch2trt/scripts
bash build_contrib.sh
```
This enables you to run the QAT example located [here](examples/contrib/quantization_aware_training).
## How does it work?
This converter works by attaching conversion functions (like ``convert_ReLU``) to the original
PyTorch functional calls (like ``torch.nn.ReLU.forward``). The sample input data is passed
through the network, just as before, except now whenever a registered function (``torch.nn.ReLU.forward``)
is encountered, the corresponding converter (``convert_ReLU``) is also called afterwards. The converter
is passed the arguments and return statement of the original PyTorch function, as well as the TensorRT
network that is being constructed. The input tensors to the original PyTorch function are modified to
have an attribute ``_trt``, which is the TensorRT counterpart to the PyTorch tensor. The conversion function
uses this ``_trt`` to add layers to the TensorRT network, and then sets the ``_trt`` attribute for
relevant output tensors. Once the model is fully executed, the tensors returned by the model are marked as outputs
of the TensorRT network, and the optimized TensorRT engine is built.
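The attach-and-trace mechanism can be illustrated in plain Python (a simplified sketch, not torch2trt's actual implementation; ``Module``, ``attach_converter``, and ``relu_converter`` are hypothetical names):

```python
class Module:
    """Stand-in for a PyTorch module with a forward method to be traced."""
    def forward(self, x):
        return max(x, 0)

def attach_converter(cls, converter):
    """Wrap cls.forward so the converter runs after every original call."""
    original = cls.forward
    def wrapped(self, x):
        out = original(self, x)             # run the real op as before
        converter(args=(self, x), ret=out)  # then let the converter see args/return
        return out
    cls.forward = wrapped

recorded = []
def relu_converter(args, ret):
    # a real converter would add a TensorRT layer here and set ret._trt
    recorded.append(("relu", args[1], ret))

attach_converter(Module, relu_converter)
Module().forward(-3)   # behavior is unchanged; the hook records the call
print(recorded)        # [('relu', -3, 0)]
```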
## How to add (or override) a converter
Here we show how to add a converter for the ``ReLU`` module using the TensorRT
Python API.
```python
import tensorrt as trt
from torch2trt import tensorrt_converter
@tensorrt_converter('torch.nn.ReLU.forward')
def convert_ReLU(ctx):
    input = ctx.method_args[1]
    output = ctx.method_return
    layer = ctx.network.add_activation(input=input._trt, type=trt.ActivationType.RELU)
    output._trt = layer.get_output(0)
```
The converter takes one argument, a ``ConversionContext``, which will contain
the following
* ``ctx.network`` - The TensorRT network that is being constructed.
* ``ctx.method_args`` - Positional arguments that were passed to the specified PyTorch function. The ``_trt`` attribute is set for relevant input tensors.
* ``ctx.method_kwargs`` - Keyword arguments that were passed to the specified PyTorch function.
* ``ctx.method_return`` - The value returned by the specified PyTorch function. The converter must set the ``_trt`` attribute where relevant.
Please see [this folder](torch2trt/converters) for more examples.
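The ``@tensorrt_converter`` decorator itself is essentially a registry mapping a method name to its converter, which is also why registering a converter for an already-covered method overrides the built-in one. A simplified pure-Python sketch of the pattern (``CONVERTERS`` and the decorator body are illustrative, not torch2trt's internals):

```python
CONVERTERS = {}

def tensorrt_converter(method_name):
    """Register the decorated function as the converter for method_name."""
    def register(fn):
        CONVERTERS[method_name] = fn  # later registrations override earlier ones
        return fn
    return register

@tensorrt_converter('torch.nn.ReLU.forward')
def convert_ReLU(ctx):
    pass  # a real converter would add a TensorRT activation layer here

print('torch.nn.ReLU.forward' in CONVERTERS)  # True
```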