<!--
# Copyright 2020-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# * Neither the name of NVIDIA CORPORATION nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-->
[![License](https://img.shields.io/badge/License-BSD3-lightgrey.svg)](https://opensource.org/licenses/BSD-3-Clause)
# Python Backend
The Triton backend for Python. The goal of the Python backend is to let you
serve models written in Python with Triton Inference Server without having to
write any C++ code.
## User Documentation
- [Python Backend](#python-backend)
- [User Documentation](#user-documentation)
- [Quick Start](#quick-start)
- [Building from Source](#building-from-source)
- [Usage](#usage)
- [`auto_complete_config`](#auto_complete_config)
- [`initialize`](#initialize)
- [`execute`](#execute)
- [Default Mode](#default-mode)
- [Decoupled mode](#decoupled-mode)
- [Use Cases](#use-cases)
- [Known Issues](#known-issues)
- [`finalize`](#finalize)
- [Model Config File](#model-config-file)
- [Inference Request Parameters](#inference-request-parameters)
- [Managing Python Runtime and Libraries](#managing-python-runtime-and-libraries)
- [Building Custom Python Backend Stub](#building-custom-python-backend-stub)
- [Creating Custom Execution Environments](#creating-custom-execution-environments)
- [Important Notes](#important-notes)
- [Error Handling](#error-handling)
- [Managing Shared Memory](#managing-shared-memory)
- [Multiple Model Instance Support](#multiple-model-instance-support)
- [Running Multiple Instances of Triton Server](#running-multiple-instances-of-triton-server)
- [Business Logic Scripting](#business-logic-scripting)
- [Using BLS with Stateful Models](#using-bls-with-stateful-models)
- [Limitation](#limitation)
- [Interoperability and GPU Support](#interoperability-and-gpu-support)
- [`pb_utils.Tensor.to_dlpack() -> PyCapsule`](#pb_utilstensorto_dlpack---pycapsule)
- [`pb_utils.Tensor.from_dlpack() -> Tensor`](#pb_utilstensorfrom_dlpack---tensor)
- [`pb_utils.Tensor.is_cpu() -> bool`](#pb_utilstensoris_cpu---bool)
- [Input Tensor Device Placement](#input-tensor-device-placement)
- [Frameworks](#frameworks)
- [PyTorch](#pytorch)
- [TensorFlow](#tensorflow)
- [Examples](#examples)
- [AddSub in NumPy](#addsub-in-numpy)
- [AddSubNet in PyTorch](#addsubnet-in-pytorch)
- [AddSub in JAX](#addsub-in-jax)
- [Business Logic Scripting](#business-logic-scripting-1)
- [Preprocessing](#preprocessing)
- [Decoupled Models](#decoupled-models)
- [Running with Inferentia](#running-with-inferentia)
- [Logging](#logging)
- [Reporting problems, asking questions](#reporting-problems-asking-questions)
## Quick Start
1. Run the Triton Inference Server container.
```
docker run --shm-size=1g --ulimit memlock=-1 -p 8000:8000 -p 8001:8001 -p 8002:8002 --ulimit stack=67108864 -ti nvcr.io/nvidia/tritonserver:<xx.yy>-py3
```
Replace \<xx.yy\> with the Triton version (e.g. 21.05).
2. Inside the container, clone the Python backend repository.
```
git clone https://github.com/triton-inference-server/python_backend -b r<xx.yy>
```
3. Install the example model.
```
cd python_backend
mkdir -p models/add_sub/1/
cp examples/add_sub/model.py models/add_sub/1/model.py
cp examples/add_sub/config.pbtxt models/add_sub/config.pbtxt
```
4. Start the Triton server.
```
tritonserver --model-repository `pwd`/models
```
5. On the host machine, start the client container.
```
docker run -ti --net host nvcr.io/nvidia/tritonserver:<xx.yy>-py3-sdk /bin/bash
```
6. In the client container, clone the Python backend repository.
```
git clone https://github.com/triton-inference-server/python_backend -b r<xx.yy>
```
7. Run the example client.
```
python3 python_backend/examples/add_sub/client.py
```
## Building from Source
1. Requirements
* cmake >= 3.17
* numpy
* rapidjson-dev
* libarchive-dev
* zlib1g-dev
```
pip3 install numpy
```
On Ubuntu or Debian you can use the command below to install `rapidjson`, `libarchive`, and `zlib`:
```
sudo apt-get install rapidjson-dev libarchive-dev zlib1g-dev
```
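If the cmake shipped with your distribution is older than 3.17, one possible
workaround (an option for your environment, not a step the build itself
requires) is to install a newer CMake from PyPI:
```
pip3 install "cmake>=3.17"
```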
2. Build Python backend. Replace \<GIT\_BRANCH\_NAME\> with the GitHub branch
that you want to compile. For release branches it should be r\<xx.yy\> (e.g.
r21.06).
```
mkdir build
cd build
cmake -DTRITON_ENABLE_GPU=ON -DTRITON_BACKEND_REPO_TAG=<GIT_BRANCH_NAME> -DTRITON_COMMON_REPO_TAG=<GIT_BRANCH_NAME> -DTRITON_CORE_REPO_TAG=<GIT_BRANCH_NAME> -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install ..
make install
```
The following required Triton repositories will be pulled and used in
the build. If the CMake variables below are not specified, the "main" branch
of those repositories will be used. \<GIT\_BRANCH\_NAME\> should be the same
as the Python backend repository branch that you are trying to compile.
* triton-inference-server/backend: `-DTRITON_BACKEND_REPO_TAG=<GIT_BRANCH_NAME>`
* triton-inference-server/common: `-DTRITON_COMMON_REPO_TAG=<GIT_BRANCH_NAME>`
* triton-inference-server/core: `-DTRITON_CORE_REPO_TAG=<GIT_BRANCH_NAME>`
Set `-DCMAKE_INSTALL_PREFIX` to the location where the Triton Server is
installed. In the released containers, this location is `/opt/tritonserver`.
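For example, when compiling the r23.04 release branch inside the matching
release container (an illustrative choice of branch), the configure step
would look like:
```
cmake -DTRITON_ENABLE_GPU=ON -DTRITON_BACKEND_REPO_TAG=r23.04 \
      -DTRITON_COMMON_REPO_TAG=r23.04 -DTRITON_CORE_REPO_TAG=r23.04 \
      -DCMAKE_INSTALL_PREFIX:PATH=/opt/tritonserver ..
```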
3. Copy the example model and configuration
```
mkdir -p models/add_sub/1/
cp examples/add_sub/model.py models/add_sub/1/model.py
cp examples/add_sub/config.pbtxt models/add_sub/config.pbtxt
```
4. Start the Triton Server
```
/opt/tritonserver/bin/tritonserver --model-repository=`pwd`/models
```
5. Use the client app to perform inference
```
python3 examples/add_sub/client.py
```
## Usage
In order to use the Python backend, you need to create a Python file with a
structure similar to the one below:
```python
import triton_python_backend_utils as pb_utils

class TritonPythonModel:
    """Your Python model must use the same class name. Every Python model
    that is created must have "TritonPythonModel" as the class name.
    """

    @staticmethod
    def auto_complete_config(auto_complete_model_config):
        """`auto_complete_config` is called only once when loading the model,
        assuming the server was not started with
        `--disable-auto-complete-config`. Implementing this function is
        optional; if it is not implemented, nothing happens. This function can
        be used to set the `max_batch_size`, `input`, and `output` properties
        of the model using `set_max_batch_size`, `add_input`, and
        `add_output`. These properties allow Triton to load the model with a
        minimal model configuration in the absence of a configuration file.
        """
        return auto_complete_model_config

    def execute(self, requests):
        """`execute` must be implemented in every Python model. It receives a
        list of `pb_utils.InferenceRequest` and must return a list of
        `pb_utils.InferenceResponse` of the same length.
        """
        responses = []
        for request in requests:
            # Build one (possibly empty) InferenceResponse per request.
            responses.append(pb_utils.InferenceResponse(output_tensors=[]))
        return responses
```
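As a concrete illustration, the pass-through `auto_complete_config` in the
class above could be fleshed out as follows. This is a sketch: the tensor
names, types, and shapes follow the add_sub example, and the batching choice
(`max_batch_size = 0`, i.e. no batching) is an assumption.
```python
    @staticmethod
    def auto_complete_config(auto_complete_model_config):
        # Declare the model as non-batching (an assumption for this sketch).
        auto_complete_model_config.set_max_batch_size(0)

        # Register the inputs and outputs used by the add_sub example so
        # Triton can load the model without a full config.pbtxt.
        for name in ("INPUT0", "INPUT1"):
            auto_complete_model_config.add_input(
                {"name": name, "data_type": "TYPE_FP32", "dims": [4]})
        for name in ("OUTPUT0", "OUTPUT1"):
            auto_complete_model_config.add_output(
                {"name": name, "data_type": "TYPE_FP32", "dims": [4]})

        return auto_complete_model_config
```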