A repo for the CS4347 group project on sound event detection.
This repo is designed for the CS4347 Group Project. It provides a quick start for beginners who are interested in Sound Event Detection (SED). The goal of SED is to classify the events in a recording into a set of provided classes and to identify their corresponding time boundaries (onset and offset).
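To make the "time boundaries" part of the task concrete, here is a minimal sketch of how frame-wise class probabilities could be turned into (onset, offset) pairs by simple thresholding. This is an illustration only, not the method used in this repo: the function name `frames_to_events`, the fixed `threshold`, and the `hop_seconds` frame step are all assumptions.

```python
from typing import List, Tuple


def frames_to_events(
    probs: List[float],
    threshold: float = 0.5,   # assumed decision threshold, not taken from this repo
    hop_seconds: float = 0.02,  # assumed time step between consecutive frames
) -> List[Tuple[float, float]]:
    """Convert frame-wise probabilities for ONE class into (onset, offset) pairs in seconds."""
    events: List[Tuple[float, float]] = []
    onset = None  # index of the frame where the current active run started
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and onset is None:
            # An event starts at the first frame that crosses the threshold.
            onset = i
        elif not active and onset is not None:
            # The event ends at the first frame that drops below the threshold.
            events.append((onset * hop_seconds, i * hop_seconds))
            onset = None
    if onset is not None:
        # The event is still active when the recording ends.
        events.append((onset * hop_seconds, len(probs) * hop_seconds))
    return events
```

Running this once per class gives a list of detected events with their boundaries; a real system would typically add smoothing or per-class thresholds on top.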
The dataset can be downloaded directly from https://goo.gl/PJUVAd (training set) and https://goo.gl/ip8JXW (testing set). After downloading, arrange the data as follows:
dataset_root
├── training (51172 audios)
│   └── ...
├── testing (488 audios)
│   └── ...
└── metadata
    ├── testing_set.csv
    ├── groundtruth_strong_label_testing_set.csv
    ├── groundtruth_weak_label_testing_set.csv
    ├── groundtruth_weak_label_training_set.csv
    └── training_set.csv
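Before running anything, it can help to check that the folder matches the layout above. The following is a small sketch of such a check; the helper names `find_missing` and `count_files` are hypothetical and not part of this repo, and the file count makes no assumption about the audio file extension.

```python
from pathlib import Path
from typing import List

# Folder and file names taken from the layout shown above.
EXPECTED_DIRS = ["training", "testing", "metadata"]
EXPECTED_METADATA = [
    "testing_set.csv",
    "groundtruth_strong_label_testing_set.csv",
    "groundtruth_weak_label_testing_set.csv",
    "groundtruth_weak_label_training_set.csv",
    "training_set.csv",
]


def find_missing(dataset_root: str) -> List[str]:
    """Return the expected folders and metadata files that are absent under dataset_root."""
    root = Path(dataset_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [
        f"metadata/{name}"
        for name in EXPECTED_METADATA
        if not (root / "metadata" / name).is_file()
    ]
    return missing


def count_files(folder: Path) -> int:
    """Count regular files directly inside folder (any extension)."""
    return sum(1 for p in folder.iterdir() if p.is_file())
```

`find_missing("dataset_root")` returns an empty list when everything is in place, and `count_files(Path("dataset_root") / "training")` can be compared against the 51172 training and 488 testing audios listed above.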
pip install -r requirements.txt
sh ./runme.sh
runme.sh includes:
Jinhua Liang (liangjh0903@gmail.com)
[1] Mesaros, A., Heittola, T., Diment, A., Elizalde, B., & Virtanen, T. (2017). DCASE 2017 Challenge setup: Tasks, datasets and baseline system. Detection & Classification of Acoustic Scenes & Events.
Code: https://github.com/ankitshah009/Task-4-Large-scale-weakly-supervised-sound-event-detection-for-smart-cars#1-direct-download-for-the-audio-of-the-development-and-evaluation-sets
[2] Kong, Q., Xu, Y., Wang, W., & Plumbley, M. D. (2019). Sound event detection of weakly labelled data with CNN-Transformer and automatic threshold optimization. arXiv preprint arXiv:1912.04761.
Code: https://github.com/qiuqiangkong/sound_event_detection_dcase2017_task4
[3] Xu, Y., Kong, Q., Wang, W., & Plumbley, M. D. (2018). Large-scale weakly supervised audio classification using gated convolutional neural network.
Code: https://github.com/yongxuUSTC/dcase2017_task4_cvssp