Project author: Nannigalaxy

Project description:
Audio preprocessing tool for signal processing and machine learning applications.
Language: Python
Repository: git://github.com/Nannigalaxy/audio-preprocessing-tool.git
Created: 2021-02-21T05:27:02Z
Project page: https://github.com/Nannigalaxy/audio-preprocessing-tool

License: Apache License 2.0

audio-preprocessing-tool

Audio preprocessing tool for signal processing and machine learning applications.

Features

  • MFCC
  • Audio data split
  • Audio augmentation (Random pitch, speed, shift and background overlay)
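
The augmentations listed above can be illustrated with a minimal, standard-library sketch. It is not the tool's own implementation: the function names `random_shift`, `change_speed` and `overlay_background`, the `gain` default, and the use of plain Python lists of samples are all assumptions made for illustration. Pitch shifting is left out because it needs a proper resampling or phase-vocoder routine, and the naive speed change shown here also alters pitch.

```python
import random


def random_shift(samples, max_shift, rng=random):
    """Circularly shift the clip by a random offset in [-max_shift, max_shift]."""
    if not samples:
        return []
    offset = rng.randint(-max_shift, max_shift) % len(samples)
    if offset == 0:
        return list(samples)
    # The last `offset` samples wrap around to the front of the clip.
    return list(samples[-offset:]) + list(samples[:-offset])


def change_speed(samples, rate):
    """Resample by linear interpolation; rate > 1 shortens (speeds up) the clip."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    if len(samples) < 2:
        return list(samples)
    new_len = max(1, int(len(samples) / rate))
    out = []
    for i in range(new_len):
        pos = min(i * rate, len(samples) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def overlay_background(samples, background, gain=0.1):
    """Mix a background clip into the signal, looping it if it is too short."""
    if not background:
        return list(samples)
    return [s + gain * background[i % len(background)]
            for i, s in enumerate(samples)]
```

In practice these steps would usually run on NumPy arrays loaded from the wav files, with the background clip drawn from the `.background` directory.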

For ML dataset loading

Dataset directory structure

    wav_dataset
    |-- yes
    |   |-- y1.wav
    |   |-- y2.wav
    |   .
    |   .
    |-- no
    |   |-- n1.wav
    |   |-- n2.wav
    |   .
    |   .
    |-- .background
        |-- bg1.wav
        |-- bg2.wav

Category: yes, no, … (one subdirectory per category)
Background: .background (the background directory must use exactly this name because it is hardcoded)
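
To check that a dataset follows this layout before loading it, a small helper along the following lines may be useful. It is only a sketch: `scan_dataset` and `BACKGROUND_DIR` are hypothetical names, not part of the tool, and the sketch assumes that every non-hidden subdirectory is a category and that all clips use the `.wav` extension.

```python
from pathlib import Path

# Directory name taken from the README, which states that it is hardcoded.
BACKGROUND_DIR = ".background"


def scan_dataset(root):
    """Return ({category: [wav paths]}, [background wav paths]) for a dataset root."""
    root = Path(root)
    categories = {}
    for entry in sorted(root.iterdir()):
        # Hidden directories such as .background are not treated as categories.
        if entry.is_dir() and not entry.name.startswith("."):
            categories[entry.name] = sorted(entry.glob("*.wav"))
    bg_dir = root / BACKGROUND_DIR
    background = sorted(bg_dir.glob("*.wav")) if bg_dir.is_dir() else []
    return categories, background
```

Calling `scan_dataset('../input/wav_dataset/')` on the tree above would be expected to report the `yes` and `no` categories plus the background clips; an empty background list is a hint that the directory is missing or misnamed.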

Example:

    from audio_preprocess import get_dataset

    path = '../input/wav_dataset/'
    sampling_rate = 16000
    sample_limit = None  # None to use all samples in each category
    seconds = 1  # Audio length in seconds to consider
    mfcc_num = 30
    mfcc_max_length = 35
    X, Y, dataframe = get_dataset(path, sampling_rate, mfcc_num, mfcc_max_length, seconds, sample_limit)
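
The `mfcc_max_length` parameter suggests that each clip's MFCC matrix is brought to a fixed number of frames so that all samples share one shape. The README does not say how this is done, so the following is only a sketch of one common approach: truncate long clips and zero-pad short ones. The helper name `fix_length`, the `pad_value` default, and the row-per-coefficient list layout are assumptions, not the tool's API.

```python
def fix_length(mfcc, max_length, pad_value=0.0):
    """Pad or truncate each coefficient row to exactly max_length frames."""
    fixed = []
    for row in mfcc:
        row = list(row[:max_length])                       # drop extra frames
        row.extend([pad_value] * (max_length - len(row)))  # pad short rows
        fixed.append(row)
    return fixed


# Hypothetical input: 30 coefficient rows (mfcc_num) with 32 frames each.
example = [[1.0] * 32 for _ in range(30)]
padded = fix_length(example, 35)  # 35 matches mfcc_max_length in the example
```

Under these assumptions every clip yields a 30 x 35 matrix, which is the kind of fixed-size input most ML models expect.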