Author: JinhuaLiang

Description:
A repo for the CS4347 group project on sound event detection.
Language: Python
Repository: git://github.com/JinhuaLiang/CS4347_SED_GroupProject.git
Created: 2021-02-16T14:26:34Z
Project page: https://github.com/JinhuaLiang/CS4347_SED_GroupProject

License: Apache License 2.0


CS4347_SED_GroupProject

This repo is designed for the CS4347 Group Project. It aims to provide a quick start for beginners who are interested in Sound Event Detection (SED). The goal of SED is to classify the events in a recording into a set of provided classes and to identify their corresponding time boundaries (onset and offset).
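To make the "time boundaries" part concrete, here is a minimal sketch of how frame-wise probabilities for a single class could be turned into (onset, offset) pairs. The function name `frames_to_events`, the threshold, and the hop size are illustrative assumptions and are not taken from this repo's code.

```python
from typing import List, Tuple


def frames_to_events(
    probs: List[float],
    threshold: float = 0.5,    # assumed decision threshold
    hop_seconds: float = 0.02  # assumed time step between frames
) -> List[Tuple[float, float]]:
    """Turn frame-wise probabilities for one class into (onset, offset) pairs in seconds."""
    events = []
    onset = None  # index of the first frame of the currently open event
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and onset is None:
            onset = i  # event starts at this frame
        elif not active and onset is not None:
            events.append((onset * hop_seconds, i * hop_seconds))
            onset = None  # event ended just before this frame
    if onset is not None:
        # The event is still active at the end of the recording.
        events.append((onset * hop_seconds, len(probs) * hop_seconds))
    return events


# Toy example with one frame per second, so indices map directly to seconds.
speech_probs = [0.1, 0.2, 0.8, 0.9, 0.7, 0.3, 0.1, 0.6, 0.9, 0.9]
print(frames_to_events(speech_probs, threshold=0.5, hop_seconds=1.0))
# → [(2.0, 5.0), (7.0, 10.0)]
```

In a full SED system this is applied per class, usually after some smoothing of the probabilities.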

Data preparation

The dataset can be downloaded directly from https://goo.gl/PJUVAd (training set) and https://goo.gl/ip8JXW (testing set). After downloading, arrange the data as follows:

  dataset_root
  ├── training (51172 audios)
  │   └── ...
  ├── testing (488 audios)
  │   └── ...
  └── metadata
      ├── testing_set.csv
      ├── groundtruth_strong_label_testing_set.csv
      ├── groundtruth_weak_label_testing_set.csv
      ├── groundtruth_weak_label_training_set.csv
      └── training_set.csv
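
Before running anything, it can help to confirm that the folder matches the layout above. The following is a small sketch using only the standard library; the helper name `missing_entries` is hypothetical and not part of this repo, and it checks only that the folders and metadata files exist, not the audio counts.

```python
from pathlib import Path
from typing import List

# Folder and file names copied from the layout above.
EXPECTED_DIRS = ["training", "testing", "metadata"]
EXPECTED_METADATA = [
    "testing_set.csv",
    "groundtruth_strong_label_testing_set.csv",
    "groundtruth_weak_label_testing_set.csv",
    "groundtruth_weak_label_training_set.csv",
    "training_set.csv",
]


def missing_entries(dataset_root: str) -> List[str]:
    """Return the expected paths (relative to dataset_root) that do not exist."""
    root = Path(dataset_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [
        f"metadata/{name}"
        for name in EXPECTED_METADATA
        if not (root / "metadata" / name).is_file()
    ]
    return missing


# Replace "dataset_root" with the path to your own copy of the data.
problems = missing_entries("dataset_root")
print("Layout looks complete." if not problems else f"Missing: {problems}")
```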

Getting started with the provided code

1. Install the required packages

  pip install -r requirements.txt

2. Run runme.sh

  sh ./runme.sh

runme.sh covers the following steps:

  • Modify the paths to your dataset and workplace.
  • Select the model you want to develop.
  • Pack the waveforms and targets to hdf5 files.
  • Train a specific model.
  • Calculate metrics on the test set.
  • (Optional) Run inference on the evaluation dataset.
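
As a rough illustration of the "targets" and "metrics" steps, the sketch below encodes clip-level (weak) labels as multi-hot vectors and scores thresholded predictions with a micro-averaged F1. This is an assumption-laden toy: the class list, the function names `to_multi_hot` and `micro_f1`, and the choice of F1 as the metric are not taken from runme.sh, which may store targets and compute metrics differently.

```python
from typing import List, Sequence

# Hypothetical class list for illustration; the project defines its own label set.
CLASSES = ["Car", "Train", "Screaming"]


def to_multi_hot(labels: Sequence[str], classes: Sequence[str] = CLASSES) -> List[int]:
    """Encode the labels present in one clip as a 0/1 vector over `classes`."""
    present = set(labels)
    return [1 if c in present else 0 for c in classes]


def micro_f1(
    targets: Sequence[Sequence[int]],
    probs: Sequence[Sequence[float]],
    threshold: float = 0.5,  # assumed fixed threshold for every class
) -> float:
    """Micro-averaged F1 over all (clip, class) pairs."""
    tp = fp = fn = 0
    for t_row, p_row in zip(targets, probs):
        for t, p in zip(t_row, p_row):
            pred = 1 if p >= threshold else 0
            if pred and t:
                tp += 1
            elif pred and not t:
                fp += 1
            elif t and not pred:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Two toy clips with made-up model outputs.
targets = [to_multi_hot(["Car"]), to_multi_hot(["Train", "Screaming"])]
probs = [[0.9, 0.2, 0.6], [0.1, 0.8, 0.3]]
print(targets)                            # → [[1, 0, 0], [0, 1, 1]]
print(round(micro_f1(targets, probs), 3))  # → 0.667
```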

Contact

Jinhua Liang (liangjh0903@gmail.com)

Reference

[1] Mesaros, A., Heittola, T., Diment, A., Elizalde, B., & Virtanen, T. (2017). DCASE 2017 Challenge setup: Tasks, datasets and baseline system. Detection & Classification of Acoustic Scenes & Events.

Code: https://github.com/ankitshah009/Task-4-Large-scale-weakly-supervised-sound-event-detection-for-smart-cars#1-direct-download-for-the-audio-of-the-development-and-evaluation-sets

[2] Kong, Q., Xu, Y., Wang, W., & Plumbley, M. D. (2019). Sound event detection of weakly labelled data with CNN-Transformer and automatic threshold optimization. arXiv preprint arXiv:1912.04761.

Code: https://github.com/qiuqiangkong/sound_event_detection_dcase2017_task4

[3] Xu, Y., Kong, Q., Wang, W., & Plumbley, M. D. (2018). Large-scale weakly supervised audio classification using gated convolutional neural network.

Code: https://github.com/yongxuUSTC/dcase2017_task4_cvssp