TensorRT Deployment

  • TensorRT Deployment

    • Introduction

    • List of TensorRT plugins supported in MMCV

    • How to build TensorRT plugins in MMCV

      • Prerequisite

      • Build on Linux

    • Create TensorRT engine and run inference in python

    • How to add a TensorRT plugin for custom op in MMCV

      • Main procedures

      • Reminders

    • Known Issues

    • References

Introduction

NVIDIA TensorRT is a software development kit (SDK) for high-performance inference of deep learning models. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. Please check its developer's website for more information. To ease the deployment of trained models with custom operators from mmcv.ops using TensorRT, a series of TensorRT plugins are included in MMCV.

List of TensorRT plugins supported in MMCV

ONNX Operator               TensorRT Plugin             MMCV Releases
MMCVRoiAlign                MMCVRoiAlign                1.2.6
ScatterND                   ScatterND                   1.2.6
NonMaxSuppression           NonMaxSuppression           1.3.0
MMCVDeformConv2d            MMCVDeformConv2d            1.3.0
grid_sampler                grid_sampler                1.3.1
cummax                      cummax                      1.3.5
cummin                      cummin                      1.3.5
MMCVInstanceNormalization   MMCVInstanceNormalization   1.3.5
MMCVModulatedDeformConv2d   MMCVModulatedDeformConv2d   1.3.8

Notes

  • All plugins listed above are developed on TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0

How to build TensorRT plugins in MMCV

Prerequisite

  • Clone repository

git clone https://github.com/open-mmlab/mmcv.git

  • Install TensorRT

Download the corresponding TensorRT build from NVIDIA Developer Zone.

For example, for Ubuntu 16.04 on x86-64 with cuda-10.2, the downloaded file is TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz.

Then, install as below:

cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz
export TENSORRT_DIR=`pwd`/TensorRT-7.2.1.6
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$TENSORRT_DIR/lib

Install python packages: tensorrt, graphsurgeon, onnx-graphsurgeon

pip install $TENSORRT_DIR/python/tensorrt-7.2.1.6-cp37-none-linux_x86_64.whl
pip install $TENSORRT_DIR/onnx_graphsurgeon/onnx_graphsurgeon-0.2.6-py2.py3-none-any.whl
pip install $TENSORRT_DIR/graphsurgeon/graphsurgeon-0.4.5-py2.py3-none-any.whl

For more detailed instructions on installing TensorRT from the tar archive, please refer to NVIDIA's website.
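
As an optional sanity check (not part of the official instructions), you can verify that the TensorRT Python package is importable and matches the downloaded release:

import tensorrt

## Should print the version of the installed release, e.g. 7.2.1.6 in this example
print(tensorrt.__version__)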

  • Install cuDNN

Install cuDNN 8 following NVIDIA's website.
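
As another optional check, you can confirm that a cuDNN 8.x runtime is visible from Python. Note that the sketch below reports the cuDNN version PyTorch was built against, which may differ from a tarball you installed manually:

import torch

## Prints an integer such as 8003 for cuDNN 8.0.3; None means cuDNN is unavailable
print(torch.backends.cudnn.version())
assert torch.backends.cudnn.is_available()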

Build on Linux

cd mmcv ## to MMCV root directory
MMCV_WITH_OPS=1 MMCV_WITH_TRT=1 pip install -e .
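
After the build finishes, a quick way to confirm that the TensorRT plugin library was compiled and can be loaded is:

## Returns True only if the MMCV TensorRT plugins were built successfully
from mmcv.tensorrt import is_tensorrt_plugin_loaded

assert is_tensorrt_plugin_loaded()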

Create TensorRT engine and run inference in python

Here is an example.

import torch
import onnx

from mmcv.tensorrt import (TRTWrapper, onnx2trt, save_trt_engine,
                           is_tensorrt_plugin_loaded)

assert is_tensorrt_plugin_loaded(), 'The TensorRT plugins in mmcv must be compiled first'

onnx_file = 'sample.onnx'
trt_file = 'sample.trt'
onnx_model = onnx.load(onnx_file)

## Model input
inputs = torch.rand(1, 3, 224, 224).cuda()
## Model input shape info: [min_shape, opt_shape, max_shape] for each input
opt_shape_dict = {
    'input': [list(inputs.shape),
              list(inputs.shape),
              list(inputs.shape)]
}

## Create TensorRT engine
max_workspace_size = 1 << 30
trt_engine = onnx2trt(
    onnx_model,
    opt_shape_dict,
    max_workspace_size=max_workspace_size)

## Save TensorRT engine
save_trt_engine(trt_engine, trt_file)

## Run inference with TensorRT
trt_model = TRTWrapper(trt_file, ['input'], ['output'])

with torch.no_grad():
    trt_outputs = trt_model({'input': inputs})
    output = trt_outputs['output']
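
The example above builds an engine for a single fixed input shape. Since each entry of opt_shape_dict holds the minimum, optimal and maximum shapes of an input, an engine supporting dynamic shapes can be created by passing different values. Below is a sketch, assuming the ONNX model was exported with dynamic spatial axes:

## Sketch: build an engine that accepts inputs from 224x224 up to 1024x1024
opt_shape_dict = {
    'input': [[1, 3, 224, 224],    ## minimum shape
              [1, 3, 512, 512],    ## optimal shape
              [1, 3, 1024, 1024]]  ## maximum shape
}
trt_engine = onnx2trt(
    onnx_model,
    opt_shape_dict,
    max_workspace_size=1 << 30)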

How to add a TensorRT plugin for custom op in MMCV

Main procedures

Below are the main steps:

  1. Add c++ header file

  2. Add c++ source file

  3. Add cuda kernel file

  4. Register plugin in trt_plugin.cpp

  5. Add unit test in tests/test_ops/test_tensorrt.py

Take the RoIAlign plugin roi_align as an example.

  1. Add header trt_roi_align.hpp to TensorRT include directory mmcv/ops/csrc/tensorrt/

  2. Add source trt_roi_align.cpp to TensorRT source directory mmcv/ops/csrc/tensorrt/plugins/

  3. Add cuda kernel trt_roi_align_kernel.cu to TensorRT source directory mmcv/ops/csrc/tensorrt/plugins/

  4. Register roi_align plugin in trt_plugin.cpp

    #include "trt_plugin.hpp"
    
    #include "trt_roi_align.hpp"
    
    REGISTER_TENSORRT_PLUGIN(RoIAlignPluginDynamicCreator);
    
    extern "C" {
    bool initLibMMCVInferPlugins() { return true; }
    }  // extern "C"
    
  5. Add a unit test to tests/test_ops/test_tensorrt.py; the existing tests there can be used as references, and a minimal sketch is given after this list.
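
Below is a minimal sketch of such a unit test for the roi_align plugin. It follows the common pattern of comparing the TensorRT output against the PyTorch implementation; the file names tmp.onnx/tmp.trt are placeholders, and register_extra_symbolics is assumed to register the MMCV symbolic functions needed to export roi_align as MMCVRoiAlign.

import numpy as np
import onnx
import torch

from mmcv.onnx import register_extra_symbolics
from mmcv.ops import RoIAlign
from mmcv.tensorrt import TRTWrapper, onnx2trt, save_trt_engine

def test_roi_align_plugin(onnx_file='tmp.onnx', trt_file='tmp.trt'):
    ## Register MMCV symbolic functions for ONNX export
    register_extra_symbolics(11)

    model = RoIAlign((2, 2), spatial_scale=1.0, sampling_ratio=2).cuda().eval()
    feat = torch.rand(1, 1, 4, 4).cuda()
    rois = torch.tensor([[0., 0., 0., 3., 3.]]).cuda()

    ## Reference output from the PyTorch implementation
    with torch.no_grad():
        pytorch_out = model(feat, rois)

    ## Export to ONNX, build a TensorRT engine and run inference with the plugin
    torch.onnx.export(
        model, (feat, rois), onnx_file, opset_version=11,
        input_names=['feat', 'rois'], output_names=['roi_feat'])
    opt_shape_dict = {
        'feat': [list(feat.shape)] * 3,
        'rois': [list(rois.shape)] * 3
    }
    trt_engine = onnx2trt(
        onnx.load(onnx_file), opt_shape_dict, max_workspace_size=1 << 30)
    save_trt_engine(trt_engine, trt_file)
    trt_model = TRTWrapper(trt_file, ['feat', 'rois'], ['roi_feat'])
    with torch.no_grad():
        trt_out = trt_model({'feat': feat, 'rois': rois})['roi_feat']

    ## The plugin output should closely match the PyTorch result
    np.testing.assert_allclose(
        pytorch_out.cpu().numpy(), trt_out.cpu().numpy(), rtol=1e-3, atol=1e-5)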

Reminders

  • Please note that this feature is experimental and may change in the future. We strongly suggest that users always try the latest master branch first.

  • Some of the custom ops in mmcv already have CUDA implementations, which can be used as references when writing the TensorRT kernels.

Known Issues

  • None
