
TensorRT Deployment

Introduction

NVIDIA TensorRT is a software development kit (SDK) for high-performance inference of deep learning models. It includes a deep learning inference optimizer and a runtime that together deliver low latency and high throughput for inference applications. See the NVIDIA developer website for more information. To ease the deployment of trained models that use custom operators from mmcv.ops, MMCV includes a series of TensorRT plugins.

List of TensorRT plugins supported in MMCV

ONNX Operator               TensorRT Plugin             MMCV Releases
MMCVRoiAlign                MMCVRoiAlign                1.2.6
ScatterND                   ScatterND                   1.2.6
NonMaxSuppression           NonMaxSuppression           1.3.0
MMCVDeformConv2d            MMCVDeformConv2d            1.3.0
grid_sampler                grid_sampler                1.3.1
cummax                      cummax                      1.3.5
cummin                      cummin                      1.3.5
MMCVInstanceNormalization   MMCVInstanceNormalization   1.3.5
MMCVModulatedDeformConv2d   MMCVModulatedDeformConv2d   1.3.8

Notes

  • All plugins listed above are developed on TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0
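The table above can also be consulted programmatically. The following is a minimal sketch, not part of MMCV: the names PLUGIN_MIN_MMCV, parse_version and plugin_supported are hypothetical, and the helper assumes that a plugin remains available in every release after the one listed, which the table itself does not guarantee.

```python
# Minimum MMCV release for each TensorRT plugin, copied from the table above.
PLUGIN_MIN_MMCV = {
    "MMCVRoiAlign": "1.2.6",
    "ScatterND": "1.2.6",
    "NonMaxSuppression": "1.3.0",
    "MMCVDeformConv2d": "1.3.0",
    "grid_sampler": "1.3.1",
    "cummax": "1.3.5",
    "cummin": "1.3.5",
    "MMCVInstanceNormalization": "1.3.5",
    "MMCVModulatedDeformConv2d": "1.3.8",
}


def parse_version(version):
    """Turn '1.3.5' into (1, 3, 5); assumes plain numeric release strings."""
    return tuple(int(part) for part in version.split("."))


def plugin_supported(plugin, mmcv_version):
    """Return True if the table lists `plugin` at or before `mmcv_version`.

    Assumption: a plugin is not removed again in a later release.
    """
    minimum = PLUGIN_MIN_MMCV.get(plugin)
    if minimum is None:
        return False  # plugin is not in the table at all
    return parse_version(mmcv_version) >= parse_version(minimum)


if __name__ == "__main__":
    print(plugin_supported("cummax", "1.3.4"))
    print(plugin_supported("cummax", "1.3.5"))
```

Comparing tuples of integers rather than raw strings keeps the ordering correct for releases such as 1.3.10 and 1.3.9.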

How to build TensorRT plugins in MMCV

Prerequisite

  • Clone repository

git clone https://github.com/open-mmlab/mmcv.git

  • Install TensorRT

Download the corresponding TensorRT build from NVIDIA Developer Zone.

For example, for Ubuntu 16.04 on x86-64 with cuda-10.2, the downloaded file is TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz.

Then, install as below:

cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz
export TENSORRT_DIR=`pwd`/TensorRT-7.2.1.6
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$TENSORRT_DIR/lib

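Install the Python packages tensorrt, graphsurgeon, and onnx-graphsurgeon. The tensorrt wheel below is built for Python 3.7 (cp37); pick the wheel that matches your Python version: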

pip install $TENSORRT_DIR/python/tensorrt-7.2.1.6-cp37-none-linux_x86_64.whl
pip install $TENSORRT_DIR/onnx_graphsurgeon/onnx_graphsurgeon-0.2.6-py2.py3-none-any.whl
pip install $TENSORRT_DIR/graphsurgeon/graphsurgeon-0.4.5-py2.py3-none-any.whl

For more details on installing TensorRT from the tar package, please refer to NVIDIA's website.

  • Install cuDNN

Install cuDNN 8 following the instructions on NVIDIA's website.
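Before building, it can help to confirm that the two environment variables exported in the TensorRT step are consistent. The sketch below is illustrative only: check_tensorrt_env is a hypothetical helper, not an MMCV or TensorRT API, and it only inspects the variables, without checking that the directory exists on disk.

```python
import os


def check_tensorrt_env(env=None):
    """Return a list of problems found; an empty list means the variables look consistent.

    `env` defaults to the current process environment; passing a dict
    makes the check easy to try out without touching the real shell.
    """
    env = os.environ if env is None else env
    problems = []

    trt_dir = env.get("TENSORRT_DIR")
    if not trt_dir:
        problems.append("TENSORRT_DIR is not set")
        return problems

    # LD_LIBRARY_PATH is a colon-separated list of directories on Linux.
    lib_dir = os.path.normpath(os.path.join(trt_dir, "lib"))
    entries = [
        os.path.normpath(p)
        for p in env.get("LD_LIBRARY_PATH", "").split(":")
        if p
    ]
    if lib_dir not in entries:
        problems.append(f"{lib_dir} is not on LD_LIBRARY_PATH")
    return problems


if __name__ == "__main__":
    for problem in check_tensorrt_env():
        print("warning:", problem)
```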

Build on Linux

cd mmcv  # to MMCV root directory
MMCV_WITH_OPS=1 MMCV_WITH_TRT=1 pip install -e .
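After the build finishes, one way to confirm that the plugins were compiled is to call is_tensorrt_plugin_loaded from mmcv.tensorrt, the same function used in the inference example in this document. The wrapper below is a small sketch; the name tensorrt_plugins_available is hypothetical, and treating an import failure as "not available" is a choice made here for convenience.

```python
def tensorrt_plugins_available():
    """Return True only if mmcv.tensorrt imports and reports its plugins as loaded."""
    try:
        # Fails with ImportError if mmcv, its tensorrt module, or the
        # tensorrt Python package is missing in this environment.
        from mmcv.tensorrt import is_tensorrt_plugin_loaded
    except ImportError:
        return False
    return bool(is_tensorrt_plugin_loaded())


if __name__ == "__main__":
    if tensorrt_plugins_available():
        print("MMCV TensorRT plugins are loaded")
    else:
        print("MMCV TensorRT plugins are not available; check the build step")
```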

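Create TensorRT engine and run inference in Python

The following example converts an ONNX model to a TensorRT engine, saves the engine to disk, and runs inference with it.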

import torch
import onnx

from mmcv.tensorrt import (TRTWrapper, onnx2trt, save_trt_engine,
                           is_tensorrt_plugin_loaded)

assert is_tensorrt_plugin_loaded(), 'Requires to compile TensorRT plugins in mmcv'

onnx_file = 'sample.onnx'
trt_file = 'sample.trt'
onnx_model = onnx.load(onnx_file)

# Model input
inputs = torch.rand(1, 3, 224, 224).cuda()
# Model input shape info: [min, opt, max] shape for each input
opt_shape_dict = {
    'input': [list(inputs.shape),
              list(inputs.shape),
              list(inputs.shape)]
}

# Create TensorRT engine
max_workspace_size = 1 << 30
trt_engine = onnx2trt(
    onnx_model,
    opt_shape_dict,
    max_workspace_size=max_workspace_size)

# Save TensorRT engine
save_trt_engine(trt_engine, trt_file)

# Run inference with TensorRT
trt_model = TRTWrapper(trt_file, ['input'], ['output'])

with torch.no_grad():
    trt_outputs = trt_model({'input': inputs})
    output = trt_outputs['output']
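The example above passes the same shape three times, which fixes the input size. Reading the three entries of each list as the minimum, optimal, and maximum shapes of a TensorRT optimization profile (an assumption worth verifying against your MMCV version), a dictionary for a variable input size could be built as sketched below. The helper make_opt_shape_dict is hypothetical and not part of mmcv.tensorrt.

```python
def make_opt_shape_dict(input_name, min_shape, opt_shape, max_shape):
    """Build {input_name: [min, opt, max]} after checking min <= opt <= max per axis."""
    shapes = [list(min_shape), list(opt_shape), list(max_shape)]

    # All three shapes must describe tensors of the same rank.
    if len({len(shape) for shape in shapes}) != 1:
        raise ValueError("min, opt and max shapes must have the same rank")

    # Each dimension must be ordered min <= opt <= max.
    for axis, (lo, mid, hi) in enumerate(zip(*shapes)):
        if not lo <= mid <= hi:
            raise ValueError(
                f"axis {axis}: expected min <= opt <= max, got {lo}, {mid}, {hi}")

    return {input_name: shapes}


if __name__ == "__main__":
    # Batch size 1 to 8, spatial size 224 to 448; values chosen for illustration.
    print(make_opt_shape_dict(
        "input", (1, 3, 224, 224), (4, 3, 320, 320), (8, 3, 448, 448)))
```

The resulting dictionary has the same layout as opt_shape_dict in the example, so it could be passed to onnx2trt in its place, provided the ONNX model was exported with dynamic axes.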

How to add a TensorRT plugin for custom op in MMCV

Main procedures

Below are the main steps:

  1. Add a C++ header file

  2. Add a C++ source file

  3. Add a CUDA kernel file

  4. Register the plugin in trt_plugin.cpp

  5. Add a unit test in tests/test_ops/test_tensorrt.py

Take the RoIAlign plugin roi_align as an example.

  1. Add header trt_roi_align.hpp to TensorRT include directory mmcv/ops/csrc/tensorrt/

  2. Add source trt_roi_align.cpp to TensorRT source directory mmcv/ops/csrc/tensorrt/plugins/

  3. Add CUDA kernel trt_roi_align_kernel.cu to TensorRT source directory mmcv/ops/csrc/tensorrt/plugins/

  4. Register roi_align plugin in trt_plugin.cpp

    #include "trt_plugin.hpp"
    
    #include "trt_roi_align.hpp"
    
    REGISTER_TENSORRT_PLUGIN(RoIAlignPluginDynamicCreator);
    
    extern "C" {
    bool initLibMMCVInferPlugins() { return true; }
    }  // extern "C"
    
  5. Add a unit test to tests/test_ops/test_tensorrt.py. Check the existing tests in that file for examples.
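The file layout used by the roi_align steps above can be summarized in a short helper. This is only a sketch: plugin_files is a hypothetical function, and the trt_<op>.hpp / trt_<op>.cpp / trt_<op>_kernel.cu naming pattern is generalized from the single roi_align example, so other plugins may deviate from it.

```python
from pathlib import PurePosixPath

# Directories named in the roi_align walkthrough above.
TRT_INCLUDE_DIR = PurePosixPath("mmcv/ops/csrc/tensorrt")
TRT_SOURCE_DIR = TRT_INCLUDE_DIR / "plugins"


def plugin_files(op_name):
    """List the files to add or edit for a plugin named `op_name`.

    Assumption: file names follow the pattern of the roi_align example.
    """
    return {
        "header": str(TRT_INCLUDE_DIR / f"trt_{op_name}.hpp"),
        "source": str(TRT_SOURCE_DIR / f"trt_{op_name}.cpp"),
        "kernel": str(TRT_SOURCE_DIR / f"trt_{op_name}_kernel.cu"),
        # Existing files that need an edit rather than a new file.
        "registration": "trt_plugin.cpp",
        "unit_test": "tests/test_ops/test_tensorrt.py",
    }


if __name__ == "__main__":
    for role, path in plugin_files("roi_align").items():
        print(f"{role}: {path}")
```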

Reminders

  • Please note that this feature is experimental and may change in the future. We strongly suggest always using the latest master branch.

  • Some of the custom ops in mmcv already have CUDA implementations, which can serve as references.

Known Issues

  • None
