# Object Detection With A TensorFlow SSD Network
**Table Of Contents**
- [Description](#description)
- [How does this sample work?](#how-does-this-sample-work)
* [Processing the input graph](#processing-the-input-graph)
* [Preparing the data](#preparing-the-data)
* [sampleUffSSD plugins](#sampleuffssd-plugins)
* [Verifying the output](#verifying-the-output)
* [TensorRT API layers and ops](#tensorrt-api-layers-and-ops)
- [Prerequisites](#prerequisites)
- [Running the sample](#running-the-sample)
* [Sample `--help` options](#sample---help-options)
- [Additional resources](#additional-resources)
- [License](#license)
- [Changelog](#changelog)
- [Known issues](#known-issues)
## Description
This sample, sampleUffSSD, preprocesses a TensorFlow SSD network and performs inference on it in TensorRT, using TensorRT plugins to speed up inference.
This sample is based on the [SSD: Single Shot MultiBox Detector](https://arxiv.org/abs/1512.02325) paper. The SSD network performs the task of object detection and localization in a single forward pass of the network.
The SSD network used in this sample is based on the TensorFlow implementation of SSD, which differs from the original paper in that it uses an InceptionV2 backbone. For more information about the actual model, download [ssd_inception_v2_coco](http://download.tensorflow.org/models/object_detection/ssd_inception_v2_coco_2017_11_17.tar.gz). The TensorFlow SSD network was trained on the InceptionV2 architecture using the [MSCOCO dataset](http://cocodataset.org/#home), which has 91 classes (including the background class). The configuration details of the network can be found [here](https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_inception_v2_coco.config).
## How does this sample work?
The SSD network performs the task of object detection and localization in a single forward pass of the network. The TensorFlow SSD network was trained on the InceptionV2 architecture using the MSCOCO dataset.
The sample makes use of TensorRT plugins to run the SSD network. To use these plugins, the TensorFlow graph needs to be preprocessed, and we use the GraphSurgeon utility to do this.
The main components of this network are the Image Preprocessor, FeatureExtractor, BoxPredictor, GridAnchorGenerator and Postprocessor.
**Image Preprocessor**
The image preprocessor step of the graph is responsible for resizing the image. The image is resized to a 300x300x3 tensor. This step also normalizes the image so that all pixel values lie in the range [-1, 1].
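The normalization can be sketched in plain Python, assuming the common linear mapping `2*x/255 - 1` from 8-bit pixel values (the TensorFlow model applies an equivalent scaling):

```python
def normalize_pixel(value):
    """Map an 8-bit pixel value (0-255) to the range [-1, 1]."""
    return 2.0 * (value / 255.0) - 1.0

# A flattened image is just a sequence of such bytes, normalized one by one.
pixels = [0, 128, 255]
normalized = [normalize_pixel(p) for p in pixels]
```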
**FeatureExtractor**
The FeatureExtractor portion of the graph runs the InceptionV2 network on the preprocessed image. The feature maps generated are used by the anchor generation step to generate default bounding boxes for each feature map.
In this network, the sizes of the feature maps used for anchor generation are [(19x19), (10x10), (5x5), (3x3), (2x2), (1x1)].
**BoxPredictor**
The BoxPredictor step takes in a high level feature map as input and produces a list of box encodings (x-y coordinates) and a list of class scores for each of these encodings per feature map. This information is passed to the postprocessor.
**GridAnchorGenerator**
The goal of this step is to generate a set of default bounding boxes (given the scale and aspect ratios mentioned in the config) for each feature map cell. This is implemented as a plugin layer in TensorRT called the `gridAnchorGenerator` plugin. The registered plugin name is `GridAnchor_TRT`.
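The cell-center placement this step performs can be sketched as follows; this is a simplified illustration, since the actual plugin also applies the scales and aspect ratios from the config to produce multiple default boxes per cell:

```python
def grid_anchor_centers(feature_map_size):
    """Compute normalized (cx, cy) centers of default boxes for a square
    feature map, one center per cell, in [0, 1] image coordinates."""
    step = 1.0 / feature_map_size
    centers = []
    for row in range(feature_map_size):
        for col in range(feature_map_size):
            # Each cell's default boxes are centered in the cell.
            centers.append(((col + 0.5) * step, (row + 0.5) * step))
    return centers

centers = grid_anchor_centers(3)  # 9 centers for the 3x3 feature map
```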
**Postprocessor**
The postprocessor step performs the final steps to generate the network output. The bounding box data and confidence scores for all feature maps are fed to the step along with the pre-computed default bounding boxes (generated in the `GridAnchorGenerator` namespace). It then performs NMS (non-maximum suppression) which prunes away most of the bounding boxes based on a confidence threshold and IoU (Intersection over Union) overlap, thus storing only the top `N` boxes per class. This is implemented as a plugin layer in TensorRT called the NMS plugin. The registered plugin name is `NMS_TRT`.
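The greedy NMS pass over one class can be sketched in plain Python; the thresholds and `top_n` below are illustrative values, not the plugin's defaults:

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, score_threshold=0.3, iou_threshold=0.5, top_n=200):
    """Greedy NMS: drop low-confidence boxes, then keep boxes in descending
    score order, suppressing any box that overlaps a kept box too much."""
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
            if len(keep) == top_n:
                break
    return keep

# Two heavily overlapping boxes and one separate box:
boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # suppresses the overlapping lower-scoring box
```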
**Note:** This sample also implements another plugin called `FlattenConcat` which is used to flatten each input and then concatenate the results. This is applied to the location and confidence data before it is fed to the postprocessor step, since the NMS plugin requires the data to be in this format.
For details on how a plugin is implemented, see the implementation of `FlattenConcat` plugin and `FlattenConcatPluginCreator` in the `sampleUffSSD.cpp` file in the `tensorrt/samples/sampleUffSSD` directory.
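What `FlattenConcat` computes per batch item can be sketched as follows, using nested Python lists in place of tensors:

```python
def flatten_concat(tensors):
    """Flatten each nested-list "tensor" to 1-D, then concatenate the
    flattened results in input order."""
    def flatten(t):
        if isinstance(t, list):
            return [x for sub in t for x in flatten(sub)]
        return [t]
    out = []
    for t in tensors:
        out.extend(flatten(t))
    return out

# A 2x2 tensor and a length-2 tensor flatten and join into one vector.
merged = flatten_concat([[[1, 2], [3, 4]], [5, 6]])
```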
Specifically, this sample performs the following steps:
- [Processing the input graph](#processing-the-input-graph)
- [Preparing the data](#preparing-the-data)
- [sampleUffSSD plugins](#sampleuffssd-plugins)
- [Verifying the output](#verifying-the-output)
### Processing the input graph
The TensorFlow SSD graph has some operations that are currently not supported in TensorRT. Using a preprocessor on the graph, we can combine multiple operations in the graph into a single custom operation which can be implemented as a plugin layer in TensorRT. Currently, the preprocessor provides the ability to stitch all nodes within a namespace into one custom node.
To use the preprocessor, the `convert-to-uff` utility should be called with a `-p` flag and a config file. The config script should also include attributes for all custom plugins which will be embedded in the generated `.uff` file. The sample config script for SSD is located in `/usr/src/tensorrt/samples/sampleUffSSD/config.py`.
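Assuming the frozen TensorFlow graph is named `frozen_inference_graph.pb` and `NMS` is the output node created by the preprocessor, the invocation takes roughly this form (adjust the paths to your setup):

```shell
# Convert the frozen graph to UFF, applying config.py as a graph
# preprocessor (-p) and marking NMS as the output node (-O).
convert-to-uff frozen_inference_graph.pb -O NMS -p config.py
```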
Using the preprocessor on the graph, we were able to remove the `Preprocessor` namespace from the graph, stitch the `GridAnchorGenerator` namespace together to create the `GridAnchorGenerator` plugin, stitch the `postprocessor` namespace together to get the NMS plugin and mark the concat operations in the BoxPredictor as `FlattenConcat` plugins.
The TensorFlow graph has some operations, such as `Assert` and `Identity`, which can be removed for inference. Operations like `Assert` are removed, and leftover nodes (those with no outputs once the `Assert` nodes are deleted) are then recursively removed.
`Identity` operations are deleted and the input is forwarded to all the connected outputs. Additional documentation on the graph preprocessor can be found in the [TensorRT API](https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/python_api/graphsurgeon/graphsurgeon.html).
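The `Assert`/`Identity` cleanup described above can be sketched on a toy graph representation, a dictionary mapping node names to their op and inputs (the real preprocessor operates on TensorFlow GraphDefs via GraphSurgeon):

```python
def prune_graph(nodes, outputs):
    """Simplify an inference graph: drop Assert ops, bypass Identity ops by
    forwarding their input to all consumers, then recursively remove nodes
    that nothing consumes (and that are not marked outputs).

    `nodes` maps node name -> {"op": op_name, "inputs": [input names]}.
    """
    graph = {name: {"op": a["op"], "inputs": list(a["inputs"])}
             for name, a in nodes.items()}
    # Drop Assert nodes outright.
    for name in [n for n, a in graph.items() if a["op"] == "Assert"]:
        del graph[name]
    # Delete Identity nodes and forward their input to all consumers.
    for name in [n for n, a in graph.items() if a["op"] == "Identity"]:
        src = graph[name]["inputs"][0]
        del graph[name]
        for a in graph.values():
            a["inputs"] = [src if i == name else i for i in a["inputs"]]
    # Recursively remove dangling nodes left behind by the deletions.
    changed = True
    while changed:
        changed = False
        consumed = {i for a in graph.values() for i in a["inputs"]}
        for name in list(graph):
            if name not in consumed and name not in outputs:
                del graph[name]
                changed = True
    return graph
```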
### Preparing the data
The generated network has an input node called `Input`, and the output node is given the name `MarkOutput_0` by the UFF converter. These nodes are registered by the UFF Parser in the sample.
```
parser->registerInput("Input", DimsCHW(3, 300, 300), UffInputOrder::kNCHW);
parser->registerOutput("MarkOutput_0");
```
The input to the SSD network in this sample consists of 3-channel 300x300 images. In the sample, we normalize the image so the pixel values lie in the range [-1, 1]. This is equivalent to the image preprocessing stage of the network.
Since TensorRT does not depend on any computer vision libraries, the images are represented as binary `R`, `G`, and `B` values for each pixel. The format is Portable PixMap (PPM), a netpbm color image format. In this format, the `R`, `G`, and `B` values for each pixel are each represented by one byte (an integer in 0-255), and they are stored together, pixel by pixel.
There is a simple PPM reading function called `readPPMFile`.
### sampleUffSSD plugins
Details about how to create TensorRT plugins can be found in [Extending TensorRT With Custom Layers](https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#extending).
The `config.py` defined for the `convert-to-uff` command should have the custom layers mapped to the plugin names in TensorRT by modifying the `op` field. The names of the plugin parameters should also exactly match those expected by the TensorRT plugins.
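For illustration, the GridAnchor mapping in `config.py` takes roughly this form; the parameter names and values shown follow the sample's config for the SSD InceptionV2 model and must match what the plugin creator parses:

```python
import graphsurgeon as gs

# Map the GridAnchorGenerator namespace to the GridAnchor_TRT plugin,
# passing the anchor scales, aspect ratios, and feature map sizes from
# the model config as plugin attributes.
PriorBox = gs.create_plugin_node(
    name="GridAnchor",
    op="GridAnchor_TRT",
    numLayers=6,
    minSize=0.2,
    maxSize=0.95,
    aspectRatios=[1.0, 2.0, 0.5, 3.0, 0.33],
    variance=[0.1, 0.1, 0.2, 0.2],
    featureMapShapes=[19, 10, 5, 3, 2, 1])
```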