dc.contributor Graduate Program in Computer Engineering.
dc.contributor.advisor Üsküdarlı, Suzan.
dc.contributor.author Kalender, Murat.
dc.date.accessioned 2023-03-16T10:00:14Z
dc.date.available 2023-03-16T10:00:14Z
dc.date.issued 2010.
dc.identifier.other CMPE 2010 K35
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/12161
dc.description.abstract The exponential growth of documents is challenging existing search and content management technology. One approach for mitigating this issue is user-generated tags, a simple method by which users associate keywords with documents. However, the improvements from this approach are limited because tags are i) free from context and form, ii) used for purposes other than description, and iii) often ambiguous. Since user tagging is a voluntary action, many documents remain untagged. Finally, the interpretation of the tags associated with documents also remains a challenge. To overcome these challenges, semantic web resources and technologies can be utilized to automatically generate semantic tags. Semantic tags not only reflect document content more accurately, they also enable better search results. Ontology coverage, word sense disambiguation, and weighting significant ontological entities within a context are key challenges in semantic tagging systems. The leading ontology for the English language, WordNet, has been successfully used for semantic tagging. However, this approach falls short in tagging documents that refer to new concepts and instances. The main focus of this work is automatically generating semantic tags for arbitrary documents. For this purpose, the first contribution is an ontological knowledge base platform called UNIpedia. UNIpedia aims to provide a knowledge base with contemporary references. Here, contemporary should be understood as keeping pace with the web. UNIpedia maps various ontological knowledge bases to WordNet concepts. The Wikipedia and OpenCyc knowledge bases, which are known to contain up-to-date instances and reliable metadata about them, were mapped to WordNet. A rule-based heuristic, which uses the ontological and statistical features of concepts and instances, is introduced for the mapping process. UNIpedia terms may have several senses because of natural language ambiguity. These so-called polysemous terms take on different meanings according to the context. A term occurring in a document cannot be mapped directly to a UNIpedia concept or instance if the term is polysemous. In order to identify the correct sense of polysemous terms, an automated semantic tagging system called Semantic TagPrint was devised. Semantic TagPrint, the second contribution of this work, uses a linear-time lexical chaining Word Sense Disambiguation algorithm for semantic annotation. In addition, Semantic TagPrint weighs and recommends semantic tags that describe the content of a document well. The semantic annotation and semantic tag weighting algorithms use both semantic and statistical features of UNIpedia. The potential benefits of Semantic TagPrint are demonstrated by the design and implementation of the Semantic Knowledge Management Tool (SKMT). SKMT, the third contribution of this work, provides a user-accessible platform through which Semantic TagPrint semantically tags documents and performs semantic searches.
dc.format.extent 30cm.
dc.publisher Thesis (M.S.)-Bogazici University. Institute for Graduate Studies in Science and Engineering, 2010.
dc.relation Includes appendices.
dc.subject.lcsh Artificial intelligence.
dc.subject.lcsh Natural language processing (Computer science)
dc.subject.lcsh Information retrieval.
dc.subject.lcsh Semantic Web.
dc.subject.lcsh Knowledge management.
dc.title Automated semantic tagging of text documents
dc.format.pages xv, 107 leaves;

