DEEP NEURAL NETWORKS EMPLOYING MULTI-TASK LEARNING AND STACKED
BOTTLENECK FEATURES FOR SPEECH SYNTHESIS
Zhizheng Wu, Cassia Valentini-Botinhao, Oliver Watts, Simon King
Centre for Speech Technology Research, University of Edinburgh, United Kingdom
ABSTRACT
Deep neural networks (DNNs) use a cascade of hidden representations to enable the learning of complex mappings from input to output features. They are able to learn the complex mapping from text-based linguistic features to speech acoustic features, and so perform text-to-speech synthesis. Recent results suggest that DNNs can produce more natural synthetic speech than conventional HMM-based statistical parametric systems. In this paper, we show that the hidden representation used within a DNN can be improved through the use of Multi-Task Learning, and that stacking multiple frames of hidden layer activations (stacked bottleneck features) also leads to improvements. Experimental results confirmed the effectiveness of the proposed methods.
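The abstract names two ideas: multi-task learning with shared hidden layers, and stacking bottleneck-layer activations from neighbouring frames. The sketch below is only an illustration of those two ideas in PyTorch; the layer sizes, the tanh activations, the choice of auxiliary target, the +/- 4 frame context and all identifiers are assumptions made for the example and are not taken from the paper.

```python
import torch
import torch.nn as nn

class MultiTaskBottleneckDNN(nn.Module):
    """Toy multi-task DNN: shared hidden layers with a narrow bottleneck and
    two output heads (a primary acoustic-feature head plus an auxiliary head
    for multi-task learning). All sizes below are illustrative assumptions."""

    def __init__(self, n_linguistic=300, n_acoustic=60, n_aux=30,
                 n_hidden=512, n_bottleneck=64):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Linear(n_linguistic, n_hidden), nn.Tanh(),
            nn.Linear(n_hidden, n_bottleneck), nn.Tanh(),   # narrow bottleneck layer
            nn.Linear(n_bottleneck, n_hidden), nn.Tanh(),
        )
        self.main_head = nn.Linear(n_hidden, n_acoustic)    # primary task
        self.aux_head = nn.Linear(n_hidden, n_aux)          # auxiliary task

    def forward(self, x):
        h = self.shared(x)
        return self.main_head(h), self.aux_head(h)

    def bottleneck(self, x):
        # Activations of the bottleneck layer (first four modules of `shared`).
        return self.shared[:4](x)


def stack_bottleneck_features(bn, context=4):
    """Concatenate bottleneck activations of the +/- `context` neighbouring
    frames around each frame; edge frames are padded by repetition."""
    T = bn.shape[0]
    frames = []
    for offset in range(-context, context + 1):
        idx = torch.clamp(torch.arange(T) + offset, 0, T - 1)
        frames.append(bn[idx])
    return torch.cat(frames, dim=1)


# Example usage with random data standing in for one utterance:
net = MultiTaskBottleneckDNN()
x = torch.randn(100, 300)                 # 100 frames of linguistic features
bn = net.bottleneck(x)                    # [100, 64] bottleneck activations
stacked = stack_bottleneck_features(bn)   # [100, 9 * 64] stacked features
```

In a stacked-bottleneck setup, the stacked activations returned by stack_bottleneck_features would typically be combined with the linguistic input of a second network that predicts the acoustic features; the exact configuration used in the experiments is given in the paper itself.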