Speaker adapted speech synthesis with deep neural networks

dc.contributor Graduate Program in Computer Engineering.
dc.contributor.advisor Özgür, Arzucan.
dc.contributor.advisor Demiroğlu, Cenk.
dc.contributor.author Öztürk, Miraç Göksu.
dc.date.accessioned 2023-03-16T10:03:44Z
dc.date.available 2023-03-16T10:03:44Z
dc.date.issued 2018.
dc.identifier.other CMPE 2018 O97
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/12369
dc.description.abstract Text-to-speech (TTS) systems have served as an assistive technology since the 1970s. Although commercial use began decades ago, synthetic speech quality is still not as good as recorded speech. This study focuses on one particular subject in this field: speaker adaptation in TTS systems. Speaker adaptation is the task of modifying a given TTS model so that the modified model synthesizes speech samples with the voice characteristics of a desired speaker. In this study, novel deep neural network (DNN) based speaker adaptation techniques incorporating transfer learning methods are presented. We replaced the high-dimensional speaker embeddings with low-dimensional vectors using clustering methods. Objective results indicate a significant improvement in adaptation performance compared to baseline techniques, in addition to a significant drop in the number of parameters. The second aspect of this study is speaker adaptation performed on DNN-based postfiltering methods. The subjective results show that the adaptation of postfiltering increases the similarity of synthetic speech to the desired speaker’s voice, although no significant improvement in quality is observed. The techniques proposed in this study are independent of the choice of DNN architecture and speaker embedding, and can therefore be extended in the future to experiments in related fields such as speech recognition.
dc.format.extent 30 cm.
dc.publisher Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2018.
dc.subject.lcsh Reading -- Remedial teaching -- Aids and devices.
dc.subject.lcsh Text-to-speech software.
dc.title Speaker adapted speech synthesis with deep neural networks
dc.format.pages xvii, 82 leaves ;

