Supervised, semi-supervised and unsupervised methods in discriminative language modeling for automatic speech recognition

Dikici, Erinç.

dc.contributor	Ph.D. Program in Electrical and Electronic Engineering.
dc.contributor.advisor	Saraçlar, Murat.
dc.contributor.author	Dikici, Erinç.
dc.date.accessioned	2023-03-16T10:25:17Z
dc.date.available	2023-03-16T10:25:17Z
dc.date.issued	2016.
dc.identifier.other	EE 2016 D55 PhD
dc.identifier.uri	http://digitalarchive.boun.edu.tr/handle/123456789/13137
dc.description.abstract	Discriminative language modeling aims to reduce error rates by rescoring the output of an automatic speech recognition (ASR) system. Discriminative language model (DLM) training conventionally follows a supervised approach, using acoustic recordings together with their manual transcriptions (references) as training examples, and recognition performance improves as the amount of such matched data increases. In this thesis we investigate the case where matched data for DLM training is limited or not available at all, and explore methods to improve ASR accuracy by incorporating unmatched acoustic and text data that come from separate sources. For semi-supervised training, we utilize weighted finite-state transducer and machine translation based confusion models to generate artificial hypotheses in addition to the real ASR hypotheses. For unsupervised training, we explore target output selection methods to replace the missing reference. We handle discriminative language modeling both as a structured prediction problem and as a reranking problem, and employ variants of the perceptron, MIRA and SVM algorithms adapted for both problems. We propose several hypothesis sampling approaches to decrease the complexity of the algorithms and to increase the diversity of the artificial hypotheses. We obtain significant improvements over baseline ASR accuracy even when there is no transcribed acoustic data available to train the DLM.
dc.format.extent	30 cm.
dc.publisher	Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2016.
dc.subject.lcsh	Automatic speech recognition.
dc.title	Supervised, semi-supervised and unsupervised methods in discriminative language modeling for automatic speech recognition
dc.format.pages	xix, 103 leaves