Cross-lingual voice conversion

Türk, Oytun.

dc.contributor	Ph.D. Program in Electrical and Electronic Engineering.
dc.contributor.advisor	Arslan, Levent M.
dc.contributor.author	Türk, Oytun.
dc.date.accessioned	2023-03-16T10:24:59Z
dc.date.available	2023-03-16T10:24:59Z
dc.date.issued	2007.
dc.identifier.other	EE 2007 T87 PhD
dc.identifier.uri	http://digitalarchive.boun.edu.tr/handle/123456789/13082
dc.description.abstract	Cross-lingual voice conversion refers to the automatic transformation of a source speaker’s voice to a target speaker’s voice in a language that the target speaker cannot speak. It involves a set of statistical analysis, pattern recognition, machine learning, and signal processing techniques. This study focuses on the problems related to cross-lingual voice conversion by discussing open research questions, presenting new methods, and performing comparisons with state-of-the-art techniques. In the training stage, a phonetic Hidden Markov Model based automatic segmentation and alignment method is developed for cross-lingual applications which support text-independent and text-dependent modes. The vocal tract transformation function is estimated using weighted speech frame mapping. By adjusting the weights, similarity to the target voice and output quality can be balanced depending on the requirements of the cross-lingual voice conversion application. A context-matching algorithm is developed to reduce one-to-many mapping problems and enable nonparallel training. Another set of improvements is proposed for prosody transformation, including stylistic modeling and transformation of pitch and speaking rate. A high-quality cross-lingual voice conversion database is designed for the evaluation of the proposed methods. The database consists of recordings from bilingual speakers of American English and Turkish. It is employed in objective and subjective evaluations, and in case studies for testing new ideas in cross-lingual voice conversion.
dc.format.extent	30cm.
dc.publisher	Thesis (Ph.D.)-Bogazici University. Institute for Graduate Studies in Science and Engineering, 2007.
dc.relation	Includes appendices.
dc.subject.lcsh	Conversion.
dc.subject.lcsh	Speech processing systems.
dc.title	Cross-lingual voice conversion
dc.format.pages	xvii, 152 leaves;