Project author: UniversalDataTool

Project description:
Convert Universal Data Tool files to the YOLO format
Language: TypeScript
Repository: git://github.com/UniversalDataTool/udt-to-yolo.git
Created: 2021-02-04T18:14:04Z
Project page: https://github.com/UniversalDataTool/udt-to-yolo

UDT to YOLO

Converts files in the Universal Data Tool format to the
YOLOv1.1 format.

Note: We use a variation on the YOLO format. Each sample gets its own subdirectory, which is a valid YOLO
dataset. We may change this in the future.

Usage

You’ll need npm and ffmpeg installed. Then run the command below to
convert the UDT file into a YOLO directory.

    npx udt-to-yolo ./dataset.udt.json -o yolo-dir

YOLOv1.1 Format

There doesn’t seem to be a formal spec for the YOLOv1.1 format, but the directory
structure is simple enough that we can describe it based on the output of programs
like CVAT.

The YOLOv1.1 format is a directory organized like this:

    .
    ├── obj.data                  ini-like file with dataset stats and paths
    ├── obj.names                 labels, one per line
    ├── train.txt                 lists all the images
    └── obj_train_data            directory containing images and bounding box txt files
        ├── frame_000000.PNG      image
        ├── frame_000000.txt      bounding boxes for image with same name
        ├── frame_000001.PNG      etc.
        ├── frame_000001.txt
        ├── frame_000002.PNG
        ├── frame_000002.txt
        ├── frame_000003.PNG
        └── frame_000003.txt

obj.data

Contains key-value pairs.

Key       Description
classes   Number of labels
train     Path to train.txt (relative to “data”, the main directory)
names     Path to label names (relative to “data”, the main directory)
backup    Not documented here; in Darknet, the directory where training weights are saved

    classes = 3
    train = data/train.txt
    names = data/obj.names
    backup = backup/
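
As a rough illustration of the key-value layout above, here is a minimal TypeScript sketch that renders and parses an obj.data file. The names `renderObjData`, `parseObjData`, and `ObjDataOptions` are hypothetical helpers invented for this example; they are not part of udt-to-yolo's API, and the defaults simply mirror the sample values shown above.

```typescript
// Hypothetical helpers for illustration only; not udt-to-yolo's actual code.
interface ObjDataOptions {
  labels: string[];   // label names, in the order they appear in obj.names
  rootDir?: string;   // assumed path prefix, "data" in the sample above
  backupDir?: string; // assumed value for the backup key
}

// Render the four key-value lines in the same order as the sample file.
function renderObjData(options: ObjDataOptions): string {
  const rootDir = options.rootDir === undefined ? "data" : options.rootDir;
  const backupDir = options.backupDir === undefined ? "backup/" : options.backupDir;
  return [
    "classes = " + options.labels.length,
    "train = " + rootDir + "/train.txt",
    "names = " + rootDir + "/obj.names",
    "backup = " + backupDir,
  ].join("\n") + "\n";
}

// Parse "key = value" lines into a plain object; lines without "=" are skipped.
function parseObjData(text: string): Record<string, string> {
  const result: Record<string, string> = {};
  const lines = text.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const eq = lines[i].indexOf("=");
    if (eq === -1) continue;
    result[lines[i].slice(0, eq).trim()] = lines[i].slice(eq + 1).trim();
  }
  return result;
}
```

Calling `renderObjData({ labels: ["label1", "label2", "label3"] })` reproduces the four sample lines above, and `parseObjData` reads them back as string values.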

obj.names

Lists each label that appears in the dataset, one per line.

    label1
    label2
    label3

train.txt

Paths to every image frame relative to “data” (main directory).

    data/obj_train_data/frame_000000.PNG
    data/obj_train_data/frame_000001.PNG
    data/obj_train_data/frame_000002.PNG
    data/obj_train_data/frame_000003.PNG
    data/obj_train_data/frame_000004.PNG
    data/obj_train_data/frame_000005.PNG
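
The listing above follows a regular pattern, which can be sketched in TypeScript as shown here. This is only an illustration: `frameName` and `renderTrainTxt` are hypothetical names, and the six-digit zero padding, the `.PNG` extension, and the `data` prefix are assumptions read off the sample paths rather than confirmed behavior of the tool.

```typescript
// Hypothetical helpers for illustration only; not udt-to-yolo's actual code.

// Build a frame file name such as "frame_000003.PNG" (six-digit zero padding assumed).
function frameName(index: number, ext: string = "PNG"): string {
  const padded = ("000000" + index).slice(-6);
  return "frame_" + padded + "." + ext;
}

// Build the contents of train.txt: one image path per line, relative to rootDir.
function renderTrainTxt(frameCount: number, rootDir: string = "data"): string {
  const lines: string[] = [];
  for (let i = 0; i < frameCount; i++) {
    lines.push(rootDir + "/obj_train_data/" + frameName(i));
  }
  return lines.join("\n") + "\n";
}
```

For example, `renderTrainTxt(6)` yields the six paths listed above, from `frame_000000.PNG` through `frame_000005.PNG`.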

obj_train_data/frame_XXXXXX.PNG

Each frame of the video, or each image of the dataset.

obj_train_data/frame_XXXXXX.txt

Lists all the bounding boxes of the image, one per line. Each line has the form
<label index (starting at 1)> <leftmost X position> <topmost Y position> <width> <height>.

X, Y, width, and height are all expressed as fractions of the image dimensions. For example, take an
image that is 1000 pixels wide and 800 pixels high, with a bounding box that starts 100 px from the left and 200 px from the top and has a width of 250 pixels and a height of 300 pixels. If the box uses the second label, you would have the following YOLO line:

2 0.1 0.25 0.25 0.375

Here’s the intermediate step: 2 100/1000 200/800 250/1000 300/800

    1 0.813552 0.562875 0.033875 0.035104
    2 0.813552 0.562875 0.033875 0.035104
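
The pixel-to-fraction arithmetic described in this section can be sketched in TypeScript as follows. `PixelBox` and `toYoloLine` are hypothetical names made up for this example, and rounding to six decimal places is an assumption based on the sample lines. The sketch follows the field order as described here (top-left corner, label index starting at 1); Darknet-style YOLO files commonly use box centers and zero-based class indices instead, so it is worth checking against the tool's actual output before relying on it.

```typescript
// Hypothetical types and helper for illustration only; not udt-to-yolo's actual code.
interface PixelBox {
  labelIndex: number; // index into obj.names, starting at 1 as described in this section
  left: number;       // leftmost X position, in pixels
  top: number;        // topmost Y position, in pixels
  width: number;      // box width, in pixels
  height: number;     // box height, in pixels
}

// Round to at most six decimal places (assumed precision, matching the sample lines).
function roundFraction(value: number): number {
  return Number(value.toFixed(6));
}

// Convert a pixel-space box into one line of a frame_XXXXXX.txt file.
// Horizontal values are divided by the image width, vertical values by the image height.
function toYoloLine(box: PixelBox, imageWidth: number, imageHeight: number): string {
  return [
    box.labelIndex,
    roundFraction(box.left / imageWidth),
    roundFraction(box.top / imageHeight),
    roundFraction(box.width / imageWidth),
    roundFraction(box.height / imageHeight),
  ].join(" ");
}

// The worked example from this section: a 1000x800 image, second label.
const exampleLine = toYoloLine(
  { labelIndex: 2, left: 100, top: 200, width: 250, height: 300 },
  1000,
  800
);
console.log(exampleLine); // prints "2 0.1 0.25 0.25 0.375"
```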