Project author: canhld94

Project description:
Serving object detection models on different hardware.
Primary language: C++
Project address: git://github.com/canhld94/HeteroServing.git
Created: 2020-07-29T09:06:30Z
Project community: https://github.com/canhld94/HeteroServing

License:


HeteroServing

Serving object detection models on different hardware.

Related links:

Training models on DOTA dataset with Tensorflow object detection

Pre-trained models

Model converter guide

Introduction

TL;DR This project (1) implements object detection models with OpenVINO and TensorRT, and (2) implements servers with REST and gRPC endpoints to serve the object detection service.

This project builds an inference server with Intel CPU, Intel FPGA, and NVIDIA GPU backends. The inference engine supports object detection models (SSD, YOLOv3*, and the Faster R-CNN family; please find the supported models in the release notes), and the server supports both REST and gRPC. One can use this project as a back-end in a complete serving framework, or as a standalone inference server in small applications.

At a glance:

Request

  curl --location --request POST 'xxx.xxx.xxx.xxx:8080/inference' \
  --header 'Content-Type: image/jpeg' \
  --data-binary '@/C:/Users/CanhLD/Desktop/Demo/AirbusDrone.jpg'

Return

  {
    "predictions": [
      {
        "label_id": "1",
        "label": "plane",
        "confidences": "0.998418033",
        "detection_box": ["182", "806", "291", "919"]
      },
      {
        "label_id": "1",
        "label": "plane",
        "confidences": "0.997635841",
        "detection_box": ["26", "182", "137", "309"]
      }
    ]
  }

NOTE: I do not implement YOLO for the GPU backend.

Requirements

The server object and protocol object depend on the following packages. I strongly recommend installing them with Conan so that you do not need to modify the CMake files; a minimal conanfile.txt is sketched after the list.

  1. boost==1.73.0: sockets and IPC, networking, HTTP parsing and serializing, JSON parsing and serializing
  2. spdlog==1.7.0: logging
  3. gflags==2.2.2: argv parsing
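
For reference, a minimal conanfile.txt along the following lines would pull in all three packages (an illustrative sketch; package names follow ConanCenter conventions, and the repository's own Conan configuration is the one to use):

  [requires]
  boost/1.73.0
  spdlog/1.7.0
  gflags/2.2.2

  [generators]
  cmake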

For the inference engine, I implemented CPU and FPGA inference with Intel OpenVINO, and GPU inference with NVIDIA TensorRT. In order to use gRPC, you should install gRPC for C++. Please refer to each framework's documentation for installation.

  1. grpc==1.32.0
  2. openvino==2019R1.1
  3. opencv==4.1 (should come with openvino)
  4. tensorrt==7.1.3.4
  5. cuda==10.2 (should come with tensorrt)

NOTE: As these packages are quite big with lots of dependencies, make sure you install them correctly without conflicts and can successfully compile their hello-world examples; a quick gRPC sanity check is sketched below.
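
For gRPC in particular, building the C++ hello-world example bundled with the gRPC sources (paths as in the gRPC quickstart; adjust to your checkout) should succeed:

  # Build gRPC's bundled C++ helloworld example as an installation check
  cd grpc/examples/cpp/helloworld
  mkdir -p cmake/build && cd cmake/build
  cmake ../.. && make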

In general, if you install these packages in their default locations, CMake will find them automatically. Otherwise, add your install directories to CMAKE_PREFIX_PATH. For example, I defined the following variables in my environment and let CMake retrieve them with $ENV{}.

  # FPGA path
  export ALTERAOCLSDKROOT="/opt/altera/aocl-pro-rte/aclrte-linux64"
  export INTELFPGAOCLSDKROOT=$ALTERAOCLSDKROOT
  # Intel OpenCL FPGA runtime
  export AOCL_BOARD_PACKAGE_ROOT="$INTELFPGAOCLSDKROOT/board/a10_ref"
  source "$INTELFPGAOCLSDKROOT/init_opencl.sh"
  # Intel OpenVINO
  export INTEL_OPENVINO_DIR="/opt/intel/openvino_2019.1.144"
  source "$INTEL_OPENVINO_DIR/bin/setupvars.sh"
  # CUDA and TensorRT
  export CUDA_INSTALL_DIR=/usr/local/cuda-10.2/
  export TensorRT_ROOT=/home/canhld/softwares/TensorRT-7.1.3.4/
  export TRT_LIB_DIR=$TensorRT_ROOT/targets/x86_64-linux-gnu/lib/
  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$TRT_LIB_DIR:$CUDA_INSTALL_DIR/lib64/
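
Inside the CMake files, these variables can then be picked up along the following lines (an illustrative sketch rather than the project's exact CMake code; NVINFER_LIB is a hypothetical variable name):

  # Illustrative sketch: resolve TensorRT from the environment variables above
  set(TENSORRT_ROOT "$ENV{TensorRT_ROOT}")
  include_directories("${TENSORRT_ROOT}/include")
  find_library(NVINFER_LIB nvinfer HINTS "$ENV{TRT_LIB_DIR}")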

The following tools are required to build the project. If you don't want to use Conan, manually add the CMake modules to the CMake files.

  1. GCC>=5
  2. CMake>=3.13
  3. Conan

Directory structure

  .
  ├── client          >> Client code
  ├── cmakes          >> CMake modules
  ├── docs            >> Documents
  ├── server          >> Server code
  ├── _experimental   >> Junk code; experimental ideas that have not yet panned out
  ├── config          >> Server configuration
  ├── libs            >> Libraries that implement components of the project
  └── apps            >> The serving application

How to build the project

Make sure you have CMake, Conan, and the required frameworks (TensorRT, OpenVINO, gRPC) installed.

  git clone https://github.com/canhld94/HeteroServing.git
  cd HeteroServing
  mkdir bin
  mkdir build && cd build
  conan install ..
  cmake -DCMAKE_BUILD_TYPE=Release ..
  cmake --build .
  cmake --build . --target install

Every binary file, including Conan package binaries, will be installed in the bin folder.

How to run the server

  1. Understand the server architecture

  2. Download the pre-trained models here

  3. Configure your server file with the proper model and the number of inference engines on each device

  4. Go to the bin folder and run the server:

  cd bin
  ./serving -f [path to your server config file]
  # For example: ./serving -f ../server/config/config_ssd.json
  # This will start the server; the endpoint for inference with REST is `/inference`
  # Send any image to the endpoint and the server will return detection results in JSON format.

  5. On another terminal, go to the client folder and run the client; the result will be written to the file "testing.jpg":

  # go to the correct http or grpc folder
  python simple_client.py
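
If you only need a quick REST smoke test, a minimal client along these lines also works (a sketch using the third-party requests library, mirroring the curl example above; the host and image path are placeholders):

  import json

  import requests  # third-party: pip install requests

  # POST a JPEG to the server's /inference endpoint, as in the curl example above
  with open("AirbusDrone.jpg", "rb") as f:
      resp = requests.post(
          "http://xxx.xxx.xxx.xxx:8080/inference",
          headers={"Content-Type": "image/jpeg"},
          data=f.read(),
      )

  # Pretty-print the JSON detection results returned by the server
  print(json.dumps(resp.json(), indent=2))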