Access Restriction

Author Yamagishi, Junichi ♦ Watts, Oliver ♦ King, Simon ♦ Usabaev, Bela
Source CiteSeerX
Content type Text
File Format PDF
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Description In speaker-adaptive HMM-based speech synthesis, there are a few speakers whose synthetic speech sounds worse than that of other speakers, despite having the same amount of adaptation data from within the same corpus. This paper investigates these fluctuations in quality and finds that as the mel-cepstral distance from the average voice becomes larger, the MOS scores generally become worse. Although the negative correlation obtained is not strong enough, this finding helps us improve the training and adaptation strategies for average voice models. Furthermore, we remark that this correlation is strongly linked to “vocal attractiveness.” Index Terms: speech synthesis, HMM, average voice, speaker adaptation
Educational Role Student ♦ Teacher
Age Range above 22 years
Educational Use Research
Education Level UG and PG ♦ Career/Technical Study
Learning Resource Type Article
Publisher Date 2010-01-01
Publisher Institution in Proc. Interspeech