Project author: csvance

Project description:
Example of loading a Keras model into TensorRT C++ API
Primary language: Jupyter Notebook
Project URL: git://github.com/csvance/keras-tensorrt-jetson.git
Created: 2018-05-23T02:09:28Z
Project community: https://github.com/csvance/keras-tensorrt-jetson

License: MIT License


keras-tensorrt-jetson

nVidia’s Jetson platform is arguably the most powerful family of devices for deep learning at the edge. To get the full benefit of the platform, you can use TensorRT, a framework that drastically reduces inference time for supported network architectures and layers. However, nVidia does not currently make it easy to take existing Keras/TensorFlow models and deploy them on the Jetson with TensorRT. One reason is that the Python API for TensorRT only supports x86-based architectures, which leaves no easy way to take advantage of TensorRT on the Jetson. There is a harder way that does work: export the model, convert it to .uff format, and then load it with TensorRT’s C++ API.

1. Training and exporting to .pb

  • Train your model
  • If using Jupyter, restart the kernel you trained your model in to remove training layers from the graph
  • Reload the model's weights
  • Use an export function like the one in this notebook to export the graph to a .pb file
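
The export step can be sketched roughly as follows. This is not the notebook's function, only a minimal sketch that assumes TensorFlow 1.x with standalone Keras (the versions current around TensorRT 3.0); `node_names` and `export_frozen_pb` are hypothetical names introduced for illustration.

```python
def node_names(tensor_names):
    """Turn tensor names such as 'dense_1/BiasAdd:0' into graph node names."""
    return [name.split(":")[0] for name in tensor_names]


def export_frozen_pb(model, pb_path):
    """Freeze the Keras session graph and write it to a .pb file.

    Assumes TensorFlow 1.x and standalone Keras; the imports live inside
    the function so node_names() can be used without TensorFlow installed.
    """
    import tensorflow as tf
    from keras import backend as K

    sess = K.get_session()
    outputs = node_names([t.name for t in model.outputs])

    # Replace variables with constants so the weights are stored in the graph.
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), outputs)

    with tf.gfile.GFile(pb_path, "wb") as f:
        f.write(frozen.SerializeToString())
    return outputs
```

The returned node names are the model's final output tensors, which may be activation nodes; compare them with the layer listing from the conversion step before choosing the output layer.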

2. Converting .pb to .uff

I suggest using the chybhao666/cuda9_cudnn7_tensorrt3.0:latest Docker container, which includes the script needed to convert a .pb export from Keras/TensorFlow to the .uff format that TensorRT imports.

    cd /usr/lib/python2.7/dist-packages/uff/bin
    # List layers and manually pick out the output layer
    # For most networks it will be dense_x/BiasAdd, the last one that isn't a placeholder or activation layer
    python convert_to_uff.py tensorflow --input-file /path/to/graph.pb -l
    # Convert to .uff, replace dense_1/BiasAdd with the name of your output layer
    python convert_to_uff.py tensorflow -o /path/to/graph.uff --input-file /path/to/graph.pb -O dense_1/BiasAdd
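
If you convert models often, the two commands above could be wrapped in a small script. The following is a minimal sketch that only reuses the path and flags shown above; the helper names (`list_layers_cmd`, `convert_cmd`, `run_in_uff_dir`) are hypothetical, and it assumes the script is run inside the container.

```python
import subprocess

# Directory containing convert_to_uff.py, taken from the steps above.
UFF_BIN_DIR = "/usr/lib/python2.7/dist-packages/uff/bin"


def list_layers_cmd(pb_path):
    """Command that lists the layers of a frozen graph."""
    return ["python", "convert_to_uff.py", "tensorflow",
            "--input-file", pb_path, "-l"]


def convert_cmd(pb_path, uff_path, output_node):
    """Command that converts a frozen graph to .uff for one output node."""
    return ["python", "convert_to_uff.py", "tensorflow",
            "-o", uff_path, "--input-file", pb_path, "-O", output_node]


def run_in_uff_dir(cmd):
    """Run a command from the converter's directory; raises on failure."""
    return subprocess.check_call(cmd, cwd=UFF_BIN_DIR)
```

For example, `run_in_uff_dir(list_layers_cmd("/path/to/graph.pb"))` prints the layer list, and `run_in_uff_dir(convert_cmd("/path/to/graph.pb", "/path/to/graph.uff", "dense_1/BiasAdd"))` performs the conversion.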

More information on the .pb export and .uff conversion is available from nVidia.

3. Loading the .uff into TensorRT C++ Inference API

I have created a generic class that loads the graph from a .uff file and sets up TensorRT for inference. It supports any number of inputs and outputs and is available on my GitHub. It can be built with nVidia nSight Eclipse Edition using a remote toolchain (instructions here).

Caveats

  • Keep in mind that many layers are not supported by TensorRT 3.0. The most obvious omission is Keras's particular implementation of BatchNorm.
  • Concatenate only works on the channel axis, and only if all other dimensions are the same. If your network has multiple convolution paths, you can concatenate them only where their outputs have matching non-channel dimensions.
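
The concatenation rule can be checked before export with a few lines of Python. This is a minimal sketch of the rule as stated above, not a TensorRT API; it assumes shapes are given as (channels, height, width) tuples without the batch dimension, and both function names are hypothetical.

```python
def can_concatenate(shapes):
    """True if all shapes differ only in the channel (first) dimension.

    Each shape is assumed to be a (channels, height, width) tuple.
    """
    if not shapes:
        return False
    reference = tuple(shapes[0][1:])
    return all(len(s) == 3 and tuple(s[1:]) == reference for s in shapes)


def concatenated_shape(shapes):
    """Shape after concatenating on the channel axis, or ValueError."""
    if not can_concatenate(shapes):
        raise ValueError("non-channel dimensions differ: %r" % (shapes,))
    channels = sum(s[0] for s in shapes)
    return (channels,) + tuple(shapes[0][1:])
```

Two branches with shapes (32, 28, 28) and (64, 28, 28) pass the check, while (32, 28, 28) and (64, 14, 14) do not, because their height and width differ.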

nVidia DIGITS TensorRT Inference Nodes for ROS

  • Support for DetectNet, ImageNet, and soon SegNet!
  • Includes a flexible abstraction layer for TensorRT
  • Github