Abstract:
In this thesis, speaker verification and audio diarization tasks are studied. The aim of speaker verification is to determine whether two utterances are spoken by the same speaker. Investigators from many research groups participate in the annual Speaker Recognition Evaluations (SRE), organized by the National Institute of Standards and Technology (NIST), in order to analyze the performance of various methods. In 2010, three groups from Turkey, including Boğaziçi University, participated in the evaluation. Two baseline systems were developed for this evaluation, and acceptable system performance was obtained for a first-time submission. A problem with SRE 2010 is that development data for the microphone condition is sparse. The use of a sufficient amount of telephone data in conjunction with limited microphone data is investigated to improve system performance in microphone conditions. Audio diarization is the task of describing all sources in an audio recording. Turkish Broadcast News data is used for this task. Baseline and factor-analysis-based systems are developed, and a comparative study of the two systems is reported. It has been shown that the performance of speech recognition systems can be improved by speaker adaptation, where each speaker's data is obtained via automated audio diarization. A similar study is performed using Turkish Broadcast News data. Lastly, a novel algorithm is proposed for the segmentation of simultaneous speech segments. Experiments show that the proposed approach improves overall system performance.