Author: diver-j

Description:
MelGAN Multi GPU Implementation.

Language: Python

Homepage:

Repository: git://github.com/diver-j/melgan-multi.git

Created: 2019-10-20T14:33:54Z
Community: https://github.com/diver-j/melgan-multi
License:

MelGAN Multi GPU Implementation.

PyTorch implementation of MelGAN: Generative Adversarial Networks for
Conditional Waveform Synthesis.

This implementation supports distributed (multi-GPU) training and uses the
LJ Speech dataset.

Pre-requisites

NVIDIA GPU + CUDA cuDNN
Python >= 3.5
PyTorch == 1.2
Clone this repository.
Install the Python requirements listed in requirements.txt.
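
Before training, it can help to confirm that the environment matches the list above. The following is a minimal sketch, not part of the repository: the function name check_environment and the report keys are made up for illustration, and the PyTorch import is wrapped so the check still runs when PyTorch is missing.

```python
import sys


def check_environment(min_python=(3, 5), expected_torch="1.2"):
    """Return a dict describing whether the listed prerequisites are met."""
    report = {"python_ok": sys.version_info[:2] >= min_python}
    try:
        import torch  # imported lazily so the check still runs without PyTorch
    except ImportError:
        report["torch_version"] = None
        report["torch_ok"] = False
        report["cuda_available"] = False
        return report
    version = str(torch.__version__)
    report["torch_version"] = version
    # Compare only the major.minor prefix, e.g. "1.2.0" matches "1.2".
    report["torch_ok"] = version == expected_torch or version.startswith(expected_torch + ".")
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(key, value)
```

A False value for torch_ok only means the installed version differs from the pinned 1.2; whether other versions work is not stated in this README.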

Dataset

Download and extract the LJ Speech dataset
Move all wav files to data/LJSpeech-1.1/wavs
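
The move step can also be scripted. This is a small sketch under the assumption that the archive was extracted somewhere else first; collect_wavs is a hypothetical helper, not something the repository provides.

```python
import shutil
from pathlib import Path


def collect_wavs(extracted_dir, target_dir="data/LJSpeech-1.1/wavs"):
    """Move every .wav file found under extracted_dir into target_dir."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    moved = 0
    # sorted() materializes the file list before anything is moved.
    for wav in sorted(Path(extracted_dir).rglob("*.wav")):
        destination = target / wav.name
        if wav.resolve() == destination.resolve():
            continue  # file is already in place
        shutil.move(str(wav), str(destination))
        moved += 1
    return moved
```

For example, collect_wavs("downloads/LJSpeech-1.1") would move the clips into the default target directory and return how many files were moved; the downloads path here is only a placeholder.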

Training

python train.py --config=config.json --cps=cp_melgan

The default checkpoint directory is cp_melgan.

TensorBoard logs are saved in cp_melgan/logs.
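
To make the relationship between the two flags and the log directory concrete, here is a rough sketch of how such flags could be parsed. It assumes an argparse-based entry point; the real train.py may be organized differently, and the config.json default is an assumption (the README only states the cp_melgan default).

```python
import argparse
import os


def parse_args(argv=None):
    # Hypothetical parser: only --config and --cps appear in this README.
    parser = argparse.ArgumentParser(description="MelGAN training (sketch)")
    parser.add_argument("--config", default="config.json", help="path to the JSON config")
    parser.add_argument("--cps", default="cp_melgan", help="checkpoint directory")
    return parser.parse_args(argv)


def log_dir(args):
    # TensorBoard logs go into a "logs" subfolder of the checkpoint directory.
    return os.path.join(args.cps, "logs")
```

With this layout, pointing --cps at a different directory also moves the TensorBoard logs, since they are derived from the checkpoint path.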

Multi-GPU (distributed) Training

python distributed.py --config=config.json --args_str="--cps=cp_melgan"

The training code detects all available GPUs and uses them automatically.
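
The launcher pattern can be sketched roughly as follows: one train.py process is started per GPU, each with its own rank. This is an illustration only. The flag names --num_gpus and --rank are assumptions, not confirmed for this repository, and the GPU count is passed in as a parameter here; a real launcher would typically obtain it from torch.cuda.device_count().

```python
import subprocess
import sys


def build_commands(config, args_str, num_gpus):
    """Build one train.py command line per GPU rank (hypothetical flags)."""
    base = [sys.executable, "train.py", "--config={}".format(config)]
    base += args_str.split()  # e.g. "--cps=cp_melgan"
    commands = []
    for rank in range(num_gpus):
        # --num_gpus and --rank are assumed flag names, not taken from the README.
        commands.append(base + ["--num_gpus={}".format(num_gpus), "--rank={}".format(rank)])
    return commands


def launch(config, args_str, num_gpus):
    """Start all workers, then wait for each and return their exit codes."""
    workers = [subprocess.Popen(cmd) for cmd in build_commands(config, args_str, num_gpus)]
    return [worker.wait() for worker in workers]
```

Keeping command construction separate from process spawning makes it possible to inspect the per-rank command lines without starting any training.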

Generated Sample Audio

Current Samples (489K Steps)

Samples
Sample audio can also be heard in TensorBoard.

Generated spectrograms can be viewed in TensorBoard.

Pre-trained model

Coming soon.

Inference

Full inference code will be committed soon.

Acknowledgements

distributed.py is taken from NVIDIA's waveglow.

The audio preprocessing is based on the waveglow implementation.


