Abstract:
In keyword spotting applications based on large-vocabulary continuous speech recognition, language modeling directly affects system performance. In this study, a keyword-adapted language model is proposed and a Turkish keyword spotting system is implemented. The proposed language model is compared with two other language models: a null-grammar language model, which serves as the baseline, and a general bigram model. Experiments show that the keyword-adapted language model gives the best performance in both recall and spotting time. The highest recall rate is 86 per cent. The proposed model improves recall by 4 percentage points (absolute) over the general bigram language model and by 13 percentage points over the null-grammar language model. However, it gives lower precision than the other systems. Two different methods are used to increase the precision of the system. The first method is language model interpolation, which increases precision but also decreases recall. The second method is word insertion penalty adjustment. It is shown that length-adapted adjustment of the word insertion penalty can increase the overall system performance. Finally, a GUI-based computer program that uses the proposed language model is designed and implemented.
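The language model interpolation mentioned above can be illustrated with a minimal sketch. The abstract does not state which interpolation scheme is used, so the example assumes simple linear interpolation of bigram probabilities; the function name `interpolate_bigram`, the weight `lam`, and the toy probabilities are hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple

Bigram = Tuple[str, str]


def interpolate_bigram(
    p_keyword: Dict[Bigram, float],
    p_general: Dict[Bigram, float],
    lam: float,
) -> Dict[Bigram, float]:
    """Linearly interpolate two bigram models (assumed scheme):
    P(w2 | w1) = lam * P_keyword(w2 | w1) + (1 - lam) * P_general(w2 | w1).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # A bigram missing from one model contributes probability 0 from that model.
    keys = set(p_keyword) | set(p_general)
    return {
        bg: lam * p_keyword.get(bg, 0.0) + (1.0 - lam) * p_general.get(bg, 0.0)
        for bg in keys
    }


# Toy conditional distributions P(w | "the"); the values are illustrative only.
p_kw = {("the", "keyword"): 0.7, ("the", "other"): 0.3}
p_gen = {("the", "keyword"): 0.1, ("the", "other"): 0.9}

mixed = interpolate_bigram(p_kw, p_gen, lam=0.5)
```

Under this assumption, a weight `lam` close to 1 keeps the behavior of the keyword-adapted model, while a smaller `lam` moves the estimates toward the general bigram model, which is the kind of recall versus precision trade-off the abstract reports.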