Project author: innovationgarage

Project description:
semi-automatic video annotation tool
Language: Python
Repository: git://github.com/innovationgarage/label-V.git
Created: 2017-10-27T12:38:09Z
Project page: https://github.com/innovationgarage/label-V

License: Apache License 2.0

LabelV is a semi-automatic video annotation tool for generating computer vision training data.

Installation

    sudo apt install ffmpeg
    pip install .

Quick start

  • clone this repository, install it from the root directory, and then run

    labelv-service

  • go to localhost:4711

More detailed explanation and showcase

There is a blog post describing how this is implemented using OpenCV and how it can be used to generate training data for object detection algorithms.

Data format

Concepts:

  • Session - a set of keyframes generated by a certain user for a certain video
  • Frame - a video is theoretically made up of consecutive, numbered images
  • Keyframe - a frame annotation created by a user containing labels
  • Label - an object label for an object in the video, such as a chair, a lamp or a bike
  • Bbox - a bounding box around an object in the video
  • Title - a string describing a label
  • Group - a label that contains other groups and labels. The bbox of a
    group always exactly contains all the bboxes of its children.
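
The containment rule for groups can be illustrated with a small sketch. This is not LabelV's own code: `union_bbox` is a hypothetical helper, and the sketch assumes that a bbox is laid out as `[x, y, width, height]`, which the text here does not state.

```python
def union_bbox(bboxes):
    """Return the smallest [x, y, w, h] box that contains every box in bboxes.

    Assumption: each bbox is [x, y, width, height]. The layout is not
    specified in the README, so treat this as an illustration only.
    """
    if not bboxes:
        raise ValueError("a group needs at least one child bbox")
    left = min(x for x, y, w, h in bboxes)
    top = min(y for x, y, w, h in bboxes)
    right = max(x + w for x, y, w, h in bboxes)
    bottom = max(y + h for x, y, w, h in bboxes)
    return [left, top, right - left, bottom - top]


# Two hypothetical children of a group
chair = [208, 214, 69, 84]
table = [250, 200, 120, 90]
group_bbox = union_bbox([chair, table])
```

Under that assumption, a group's bbox would need to be recomputed whenever any child bbox changes.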

Whenever a video is uploaded it is saved under upload/video/VIDEO_ID.EXT
where VIDEO_ID is a unique random string and EXT is the file format
extension of your video.

Every time a user starts working with a video, adding keyframes and
labels, a session is created. The session is stored under
upload/session/VIDEO_ID.EXT-SESSION_ID where SESSION_ID is a unique
random string. This file contains a JSON object.
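
The two storage locations described above can be built as in the following sketch. The helper names `video_path` and `session_path` are hypothetical, and `UPLOAD_ROOT` assumes the upload directory is resolved relative to the service's working directory, which the text does not confirm.

```python
from pathlib import PurePosixPath

# Assumption: "upload" is relative to wherever the service runs.
UPLOAD_ROOT = PurePosixPath("upload")


def video_path(video_id, ext):
    """Path of an uploaded video: upload/video/VIDEO_ID.EXT"""
    return UPLOAD_ROOT / "video" / f"{video_id}.{ext}"


def session_path(video_id, ext, session_id):
    """Path of a session file: upload/session/VIDEO_ID.EXT-SESSION_ID"""
    return UPLOAD_ROOT / "session" / f"{video_id}.{ext}-{session_id}"


# Made-up ids, for illustration only
example_video = video_path("abc123", "mp4")
example_session = session_path("abc123", "mp4", "s9")
```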

The session object contains a “keyframes” member whose keys are
keyframe frame numbers (as strings, because JSON object keys must be
strings), and whose values are keyframe objects:

    {"keyframes": {"14": KEYFRAME_OBJECT,
                   "26": KEYFRAME_OBJECT,
                   "200": KEYFRAME_OBJECT}}
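
Because the frame numbers are stored as strings, they need to be converted to integers before being ordered. The sketch below shows one way to do that. The session content and the `keyframe_numbers` helper are made up for illustration.

```python
import json

# Made-up session content; real keyframe objects carry more data.
session_text = '{"keyframes": {"14": {"key": "a"}, "200": {"key": "c"}, "26": {"key": "b"}}}'
session = json.loads(session_text)


def keyframe_numbers(session):
    """Return the keyframe frame numbers as integers, in ascending order."""
    return sorted(int(number) for number in session["keyframes"])


numbers = keyframe_numbers(session)

# Sorting the raw string keys would order "200" before "26".
string_order = sorted(session["keyframes"])
```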

Each keyframe object has a set of labels and a KEYFRAME_KEY. The
KEYFRAME_KEY is a unique id used to identify this particular set of
labels for this particular frame. If the user were to change the
keyframe, a new key would be generated.

    KEYFRAME_OBJECT = {"key": "KEYFRAME_KEY",
                       "data": {"label": ITEM}}

The keyframe labels reside under the key “labels” under the key “data”
and form a recursively defined structure. At each level one of two
possible objects can be present:

A label

    ITEM = {"type": "Label",
            "args": {"bbox": [208,214,69,84],
                     "title": "The chair"}}

or a group

    ITEM = {"type": "Group",
            "args": {"bbox": [208,214,69,84],
                     "children": [ITEM,ITEM,...],
                     "title": "Dining group"}}
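
Since an ITEM is either a label or a group of further ITEMs, it can be walked recursively. The following is a minimal sketch, not code from LabelV: `iter_labels` is a hypothetical helper, and the example tree (including "The table" and its bbox values) is invented for illustration.

```python
def iter_labels(item):
    """Yield (title, bbox) for every Label below item, depth first."""
    args = item["args"]
    if item["type"] == "Label":
        yield args["title"], args["bbox"]
    elif item["type"] == "Group":
        # A group's children are themselves ITEMs, so recurse into each one.
        for child in args["children"]:
            yield from iter_labels(child)
    else:
        raise ValueError(f"unknown item type: {item['type']!r}")


# Invented example tree: one group holding two labels
dining_group = {
    "type": "Group",
    "args": {
        "bbox": [208, 200, 162, 98],
        "title": "Dining group",
        "children": [
            {"type": "Label",
             "args": {"bbox": [208, 214, 69, 84], "title": "The chair"}},
            {"type": "Label",
             "args": {"bbox": [250, 200, 120, 90], "title": "The table"}},
        ],
    },
}

labels = list(iter_labels(dining_group))
```

Flattening the tree this way is convenient when exporting training data, where each labelled box is usually needed on its own.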

When a user navigates to a non-keyframe, the tracker tracks the bboxes
from the last keyframe before the current frame and generates updated
bboxes for all frames in between. These are stored under
upload/tracker/VIDEO_ID.EXT/KEYFRAME_NUMBER/KEYFRAME_KEY/FRAME_NUMBER.json
where FRAME_NUMBER is the frame number minus the keyframe frame number
(so it starts from zero). Each such file contains an ITEM, as defined
above, encoded as JSON.
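
The path layout above can be sketched as follows. This only illustrates the naming scheme and is not the tracker itself: `tracker_path` is a hypothetical helper, the session content is made up, and the sketch assumes the relevant keyframe is the nearest one at or before the requested frame.

```python
import bisect
from pathlib import PurePosixPath


def tracker_path(video_file, session, frame_number):
    """Return the tracker file path for frame_number.

    video_file is VIDEO_ID.EXT; session is a parsed session object.
    """
    numbers = sorted(int(n) for n in session["keyframes"])
    # Index of the last keyframe at or before frame_number
    index = bisect.bisect_right(numbers, frame_number) - 1
    if index < 0:
        raise ValueError("no keyframe at or before this frame")
    keyframe_number = numbers[index]
    keyframe_key = session["keyframes"][str(keyframe_number)]["key"]
    # FRAME_NUMBER in the path is relative to the keyframe
    offset = frame_number - keyframe_number
    return (PurePosixPath("upload/tracker") / video_file
            / str(keyframe_number) / keyframe_key / f"{offset}.json")


# Made-up session with two keyframes
session = {"keyframes": {"14": {"key": "k14"}, "26": {"key": "k26"}}}
path = tracker_path("abc123.mp4", session, 20)
```

Because KEYFRAME_KEY is part of the path, tracker output produced from an older version of a keyframe lands in a different directory from output produced after the keyframe was changed.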