Project author: diver-j

Project description:
MelGAN Multi GPU Implementation.
Language: Python
Repository: git://github.com/diver-j/melgan-multi.git
Created: 2019-10-20T14:33:54Z
Project page: https://github.com/diver-j/melgan-multi


MelGAN Multi GPU Implementation.

PyTorch implementation of MelGAN: Generative Adversarial Networks for
Conditional Waveform Synthesis.

This implementation includes distributed support and uses the
LJSpeech dataset.

Pre-requisites

  1. NVIDIA GPU + CUDA cuDNN
  2. Python >= 3.5
  3. PyTorch == 1.2
  4. Clone this repository.
  5. Install the Python requirements listed in requirements.txt.
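
The prerequisites above can be checked with a short script before training. This is a minimal sketch, not part of the repository: the helper name `check_environment` is hypothetical, and it only reports what it finds rather than enforcing the exact PyTorch version.

```python
import importlib.util
import sys


def check_environment():
    """Report whether the listed prerequisites appear to be met.

    Hypothetical helper for illustration; not part of this repository.
    """
    report = {
        "python_ok": sys.version_info >= (3, 5),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        # Imported lazily so the check still runs without PyTorch.
        import torch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print("{}: {}".format(key, value))
```

Compare the reported `torch_version` against the version pinned in the list above by hand.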

Dataset

  1. Download and extract the LJ Speech dataset
  2. Move all wav files to data/LJSpeech-1.1/wavs
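
The move in step 2 can be scripted. The sketch below assumes the archive was extracted to some local directory and that every `.wav` file under it belongs in `data/LJSpeech-1.1/wavs`; the function name `collect_wavs` is hypothetical and not part of the repository.

```python
import shutil
from pathlib import Path


def collect_wavs(extracted_dir, target_dir="data/LJSpeech-1.1/wavs"):
    """Move every .wav file found under extracted_dir into target_dir.

    Returns the number of files moved. Hypothetical helper for illustration.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    moved = 0
    # sorted() materializes the listing before any file is moved.
    for wav in sorted(Path(extracted_dir).rglob("*.wav")):
        destination = target / wav.name
        if wav.resolve() == destination.resolve():
            continue  # already in place
        shutil.move(str(wav), str(destination))
        moved += 1
    return moved
```

For example, `collect_wavs("LJSpeech-1.1")` would move the files from a dataset extracted into the current directory.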

Training

  1. python train.py --config=config.json --cps=cp_melgan

The default checkpoint directory is cp_melgan.

TensorBoard logs are saved in cp_melgan/logs.

Multi-GPU (distributed) Training

  1. python distributed.py --config=config.json --args_str="--cps=cp_melgan"

The training code detects all available GPUs and uses them automatically.
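
One common way for a launcher to use several GPUs is to start one training process per GPU. The sketch below illustrates that pattern only; it is not taken from distributed.py. The `--n_gpus` and `--rank` flags and both function names are hypothetical, so check the repository's scripts for the options they actually accept.

```python
import subprocess
import sys


def build_worker_commands(n_gpus, config, args_str, script="train.py"):
    """Build one command line per GPU.

    --n_gpus and --rank are hypothetical placeholder flags used only to
    show how each worker could learn its position in the group.
    """
    commands = []
    for rank in range(n_gpus):
        command = [sys.executable, script, "--config={}".format(config)]
        command += args_str.split()  # extra arguments passed through as-is
        command += ["--n_gpus={}".format(n_gpus), "--rank={}".format(rank)]
        commands.append(command)
    return commands


def launch(commands):
    """Start every worker process, then wait for all of them to exit."""
    workers = [subprocess.Popen(command) for command in commands]
    return [worker.wait() for worker in workers]
```

With PyTorch installed, `n_gpus` would typically come from `torch.cuda.device_count()`, for example `launch(build_worker_commands(torch.cuda.device_count(), "config.json", "--cps=cp_melgan"))`.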

Generated Sample Audio

  1. Current Samples (489K Steps)

    Samples

  2. Sample audio can also be heard in TensorBoard.
    validation_audio

  3. Generated spectrograms can be seen in TensorBoard.
    validation_audio

Pre-trained model

Coming soon..

Inference

I will commit full inference code soon.

Acknowledgements

I referred to waveglow to implement audio preprocessing.