Archive and Documentation Center
Digital Archive

Investigation of automatically derived subword units for Turkish LVCSR


dc.contributor Graduate Program in Electrical and Electronic Engineering.
dc.contributor.advisor Saraçlar, Murat.
dc.contributor.author Aksungurlu, Tuncay.
dc.date.accessioned 2023-03-16T10:17:10Z
dc.date.available 2023-03-16T10:17:10Z
dc.date.issued 2008.
dc.identifier.other EE 2008 A37
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/12716
dc.description.abstract In this thesis, we performed large vocabulary continuous speech recognition (LVCSR) experiments using language models built upon different recognition units, in order to create a suitable and successful language modeling scheme for Turkish. Since Turkish is an agglutinative language, the way the language model is built drastically affects recognition performance. Whereas traditional word-based language models give satisfactory results for English, they do not work well for Turkish due to its inductive morphology. Different language modeling strategies, mainly based on sub-word units such as morphemes and stem-endings, have been proposed to overcome this problem. In this work, sub-words derived in an unsupervised manner are investigated. Segmentations obtained using different approaches are compared according to their speech recognition performance. The best word error rate (WER) obtained is 25.24%, compared with 26.90% using word-based language models.
dc.format.extent 30cm.
dc.publisher Thesis (M.S.)-Bogazici University. Institute for Graduate Studies in Science and Engineering, 2008.
dc.subject.lcsh Automatic speech recognition.
dc.subject.lcsh Turkish language -- Morphology.
dc.title Investigation of automatically derived subword units for Turkish LVCSR
dc.format.pages xi, 47 leaves;

