Author: vliu15

Description:
End-to-end Text-to-Speech with Generative Adversarial Networks
Language: Python
Repository: git://github.com/vliu15/tts-gan.git
Created: 2021-01-06T20:55:33Z
Project page: https://github.com/vliu15/tts-gan

License: MIT License

End-to-end Text-to-Speech with Generative Adversarial Networks

This repository contains an implementation and end-to-end training scripts for text-to-speech models, based on
End-to-End Adversarial Text-to-Speech (Donahue et al. 2020).

Usage

To set up the Python environment, run

  python -m venv ttsgan
  source ttsgan/bin/activate
  python -m pip install --upgrade pip
  python -m pip install -r requirements.txt

Aggregate audio files from the LJ-Speech dataset by running

  ls LJSpeech-1.1/wavs/*.wav | tail -n+11 > train_files.txt
  ls LJSpeech-1.1/wavs/*.wav | head -n10 > test_files.txt
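
A held-out split of this kind can also be produced from Python, which may be convenient on systems without `head` and `tail`. The following is a minimal sketch, not part of the repository: `split_wavs` and `write_file_lists` are hypothetical helper names, and it assumes that sorting the paths as strings matches the order `ls` would give for the LJ-Speech file names.

```python
from pathlib import Path


def split_wavs(wav_paths, n_test=10):
    """Return (train, test) path lists; the first n_test sorted paths are held out."""
    ordered = sorted(str(p) for p in wav_paths)
    return ordered[n_test:], ordered[:n_test]


def write_file_lists(wav_dir="LJSpeech-1.1/wavs", n_test=10):
    """Write train_files.txt and test_files.txt, one path per line."""
    train, test = split_wavs(Path(wav_dir).glob("*.wav"), n_test)
    Path("train_files.txt").write_text("\n".join(train) + "\n")
    Path("test_files.txt").write_text("\n".join(test) + "\n")
    return train, test
```

Calling `write_file_lists()` from the directory that contains `LJSpeech-1.1` writes both lists. Because the two lists are slices of one sorted list, no file can appear in both.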

Specify the path to metadata.csv via the --metadata_file flag. Download the CMU phonemizer dictionary and specify its path via the --cmudict_file flag.
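
For orientation, the two input files have simple line-based formats: each row of the LJ-Speech metadata.csv is `id|transcription|normalized transcription`, and each CMU dictionary entry is a word followed by its phonemes, with comment lines starting with `;;;` and alternate pronunciations marked by a suffix such as `(1)`. The sketch below shows one way to read them. It is an illustration only: `parse_metadata_line` and `parse_cmudict` are hypothetical names, and how the repository's own loaders handle these files is not shown here.

```python
import re


def parse_metadata_line(line):
    """Split one LJ-Speech metadata row into (clip id, raw text, normalized text)."""
    clip_id, raw_text, normalized_text = line.rstrip("\n").split("|", 2)
    return clip_id, raw_text, normalized_text


def parse_cmudict(lines):
    """Map each word to its first listed pronunciation (a list of phonemes)."""
    pronunciations = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comment lines
        word, *phonemes = line.split()
        word = re.sub(r"\(\d+\)$", "", word)  # drop alternate-pronunciation suffix
        pronunciations.setdefault(word, phonemes)  # keep the first variant only
    return pronunciations
```

Keeping only the first pronunciation is a simplification chosen for this sketch; a training pipeline might instead keep all variants or pick one at random.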

To train, simply run

  python train.py -c config.yml