Automated response generation for corporate chatbot systems

Güser, Abdullah Şamil.

URI: http://digitalarchive.boun.edu.tr/handle/123456789/12994

Date: 2020.

Abstract:

In this work, we studied the available intent classification and response selection methods for designing a customer service chatbot in Turkish. We compared the chatbot types available in the literature. Because our main focus is on methods that are easy to adapt and implement, we chose to work on a closed-domain, task-oriented, retrieval-based chatbot, as it best suited our application. We compared two implementation alternatives, namely intent classification and response selection. We also tested the effect of including the dialog history in training. We proposed a classification method for labeled datasets that uses Natural Language Inference (NLI). As Turkish is an agglutinative language, most of the available methods in the literature do not perform as well on it as they do on languages like English. Therefore, we conducted experiments with different neural network models to observe and compare their performance on various datasets. We compared the performance of state-of-the-art methods and analyzed the performance of pretrained language models on classification and natural language inference tasks. We studied the factors that make performance on Turkish datasets lower than on English datasets and defined the basic problems. We then tried to improve the performance of the available methods on Turkish datasets by suggesting solutions to these problems. We used many Natural Language Processing methods in our experiments. For tokenization specifically, we compared recent methods in the literature and applied them to Turkish. We tested the effect of these methods on both Turkish and English datasets. On the SWDA dataset, we obtained 75.22% classification accuracy on the test set, setting a new state of the art. On the XNLI corpus, we obtained 86.35% NLI accuracy on the English test set and 79.85% on the Turkish test set.
