Abstract:
This study describes a deep-learning-based question answering (QA) system for spoken documents. Question answering on spoken documents is mainly performed by transcribing the spoken content with an automatic speech recognition (ASR) system and then applying text-based question answering methods to the ASR transcripts. Questions are presented to the system in written or spoken form, and answers are returned from the spoken documents. The QA task is more challenging on spoken documents than on text documents. First, spoken documents do not have explicit paragraph boundaries, so end-to-end neural network based systems that utilize question-aware passage representations to retrieve answers may perform poorly. Second, ASR transcripts can contain recognition errors, which affect the performance of QA systems. We therefore propose two novel approaches to handle these problems in deep learning based end-to-end spoken QA systems. The first approach handles the absence of passage boundaries in spoken documents by generating pseudo passages and automatically determining the questions related to each pseudo passage. The second approach integrates the ASR system output, in the form of confusion networks with word confidence scores, into the question answering system. The proposed approaches are integrated into an end-to-end system, and we investigate their capability on two newly curated QA datasets. The end-to-end neural network model with the proposed extensions proves effective on the spoken QA task and improves QA performance on spoken documents compared to directly applying the end-to-end model to the ASR transcripts of the spoken documents.