Supervised, semi-supervised and unsupervised methods in discriminative language modeling for automatic speech recognition

Dikici, Erinç.

URI: http://digitalarchive.boun.edu.tr/handle/123456789/13137

Date: 2016

Abstract:

Discriminative language modeling aims to reduce error rates by rescoring the output of an automatic speech recognition (ASR) system. Discriminative language model (DLM) training conventionally follows a supervised approach, using acoustic recordings together with their manual transcriptions (references) as training examples, and recognition performance improves as the amount of such matched data increases. In this thesis, we investigate the case where matched data for DLM training is limited or not available at all, and explore methods to improve ASR accuracy by incorporating unmatched acoustic and text data that come from separate sources. For semi-supervised training, we use weighted finite-state transducer and machine translation based confusion models to generate artificial hypotheses in addition to the real ASR hypotheses. For unsupervised training, we explore target output selection methods to replace the missing reference. We treat discriminative language modeling both as a structured prediction problem and as a reranking problem, and employ variants of the perceptron, MIRA and SVM algorithms adapted for both. We propose several hypothesis sampling approaches to decrease the complexity of the algorithms and to increase the diversity of the artificial hypotheses. We obtain significant improvements over baseline ASR accuracy even when no transcribed acoustic data is available to train the DLM.
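As a rough illustration of the supervised case described in the abstract, the sketch below trains a structured perceptron that picks one hypothesis from an N-best list. It is not the thesis implementation: the unigram and bigram count features, the choice of the lowest-word-error hypothesis as the training target, the omission of the baseline ASR score, and all function names are assumptions made for this example.

```python
from collections import Counter


def word_errors(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1]


def features(hyp):
    """Unigram and bigram counts (an assumed, illustrative feature set)."""
    feats = Counter(hyp)
    feats.update(zip(hyp, hyp[1:]))
    return feats


def score(weights, hyp):
    """Linear model score: dot product of weights and feature counts."""
    return sum(weights.get(f, 0.0) * c for f, c in features(hyp).items())


def train_perceptron(data, epochs=5):
    """data: list of (reference_tokens, list_of_hypothesis_token_lists)."""
    weights = {}
    for _ in range(epochs):
        for ref, nbest in data:
            # Target: the hypothesis with the fewest word errors (assumption).
            oracle = min(nbest, key=lambda h: word_errors(ref, h))
            # Prediction: the hypothesis the current model scores highest.
            best = max(nbest, key=lambda h: score(weights, h))
            if best != oracle:
                # Move weights toward the target and away from the prediction.
                for f, c in features(oracle).items():
                    weights[f] = weights.get(f, 0.0) + c
                for f, c in features(best).items():
                    weights[f] = weights.get(f, 0.0) - c
    return weights


def rerank(weights, nbest):
    """Return the highest-scoring hypothesis under the trained weights."""
    return max(nbest, key=lambda h: score(weights, h))
```

In the semi-supervised and unsupervised settings the abstract describes, the N-best lists or the training target would come from other sources (artificial hypotheses from a confusion model, or a selected target output in place of the reference); the update rule in this sketch would stay the same under that assumption.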
