Archives and Documentation Center
Digital Archives

Machine learning methods in natural language processing

dc.contributor Graduate Program in Computational Science and Engineering.
dc.contributor.advisor Ecevit, Fatih.
dc.contributor.author Güvenç, Betül.
dc.date.accessioned 2023-03-16T10:02:28Z
dc.date.available 2023-03-16T10:02:28Z
dc.date.issued 2016.
dc.identifier.other CSE 2016 G88
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/12326
dc.description.abstract Natural language processing offers a large number of algorithms for keyword extraction and text summarization, some of which we discuss in this thesis. We started with a survey of automatic text summarization in order to understand the state-of-the-art methods. We investigated two different graph-based text summarization algorithms for both single- and multi-document settings on different types of texts, using LexRank for multi-document summarization and TextRank for single-document summarization. We also investigated a number of keyword extraction methods. Almost every keyword extraction method uses high-dimensional vectors to define words in a vector space. We approached the automatic extraction of keywords from text as an unsupervised learning task and treated each word in the document as a low-dimensional vector. On this basis we developed a new and efficient keyword extraction method using the Word2Vec and PageRank algorithms. Our results show that the summarization algorithms give their best results on news texts and usable results on legal texts, while they give less than optimal results for short stories. We also compared the one-hot representation with the Word2Vec representation, but observed no significant differences between these methods.
dc.format.extent 30 cm.
dc.publisher Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2016.
dc.subject.lcsh Machine learning.
dc.subject.lcsh Natural language processing (Computer science)
dc.title Machine learning methods in natural language processing
dc.format.pages xii, 119 leaves ;

