Project author: VisionBrain
Project description:
Open Source Implementation of Neural Voice Cloning with Few Audio Samples (Baidu Research)
Language: Python
Project URL: git://github.com/VisionBrain/Neural_Voice_Cloning.git
Neural_Voice_Cloning
- Baidu Research Link
- Tested Speaker Audio Link
Abstract :
- Voice cloning is a highly desired feature for personalized speech interfaces. We introduce a neural voice cloning system that learns to synthesize a person’s voice from only a few audio samples. We study two approaches: speaker adaptation and speaker encoding.
- Speaker adaptation is based on fine-tuning a multi-speaker generative model. Speaker encoding is based on training a separate model to directly infer a new speaker embedding, which is then applied to a multi-speaker generative model. While speaker adaptation can achieve slightly better naturalness and similarity, the cloning time and required memory for the speaker encoding approach are significantly less, making it more favorable for low-resource deployment.
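The difference between the two approaches can be sketched with a toy example. This is not the repository's code or the Baidu model: `generate`, `adapt_embedding`, `encode_speaker`, and `DIM` are hypothetical names, the "generative model" is a simple additive function, adaptation here fine-tunes only the embedding (an assumption made to keep the sketch small), and mean pooling stands in for a trained encoder network.

```python
import random

DIM = 4  # size of the toy speaker embedding (hypothetical value)


def generate(content, embedding):
    """Toy stand-in for a multi-speaker generative model:
    each output frame is the content frame shifted by the speaker embedding."""
    return [[c + e for c, e in zip(frame, embedding)] for frame in content]


def adapt_embedding(content, target, steps=200, lr=0.1):
    """Speaker adaptation (embedding-only, by assumption): fit a new speaker
    embedding by gradient descent on the mean squared error."""
    emb = [0.0] * DIM
    n = len(content)
    for _ in range(steps):
        out = generate(content, emb)
        # Gradient of the mean squared error w.r.t. each embedding entry.
        grad = [
            sum(2 * (out[i][d] - target[i][d]) for i in range(n)) / n
            for d in range(DIM)
        ]
        emb = [e - lr * g for e, g in zip(emb, grad)]
    return emb


def encode_speaker(audio):
    """Speaker encoding: infer the embedding in one forward pass.
    Mean pooling over frames stands in for a trained encoder."""
    n = len(audio)
    return [sum(frame[d] for frame in audio) / n for d in range(DIM)]


# Build a few synthetic "audio" frames for an unseen speaker.
rng = random.Random(0)
true_emb = [rng.uniform(-1, 1) for _ in range(DIM)]
raw = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(8)]
mean = [sum(f[d] for f in raw) / len(raw) for d in range(DIM)]
# Zero-mean content, so that mean pooling can recover the embedding exactly.
content = [[f[d] - mean[d] for d in range(DIM)] for f in raw]
audio = generate(content, true_emb)

adapted = adapt_embedding(content, audio)  # many gradient steps
encoded = encode_speaker(audio)            # a single pass, no optimization
```

In this toy setting both routes recover the same embedding, but adaptation needs an optimization loop per new speaker while encoding needs only one pass, which mirrors the cloning-time trade-off the abstract describes.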
Steps :

Audio :
Tested Speaker Audio Link
- But don’t expect anything right.
- I won’t make an official complaint.
- They make a selective perception process.
Made By-