Archives and Documentation Center
Digital Archives

WEB mining issues: topic finding and focused crawling evaluation

dc.contributor Graduate Program in Management Information Systems.
dc.contributor.advisor Badur, Bertan Yılmaz.
dc.contributor.author Uluhan, Eray.
dc.date.accessioned 2023-03-16T12:52:00Z
dc.date.available 2023-03-16T12:52:00Z
dc.date.issued 2006.
dc.identifier.other MIS 2006 U48
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/18197
dc.description.abstract Web mining is defined as the process of using data mining techniques to automatically discover and extract information from semi-structured or unstructured Web documents and services. This study on Web mining consists of two sections, covering Web structure mining and Web content mining. In the first section, the most widely accepted focused crawling algorithms and simple tree traversal algorithms are compared on three criteria: page relevance, keyword predicate satisfaction, and hit ratio. Using URL tokens as input resulted in higher performance on all three criteria. In the second section, a methodology for automatically finding topics in Web pages is proposed. By processing only the list items on HTML pages returned from a search engine, it is expected to find key concepts related to a user-defined topic. The methodology is tested with different parameters, such as the number of pages, different keywords, and stemming implementations. The candidate concepts, ranked by relevance score, show high precision for the user-defined topic.
dc.format.extent 30cm.
dc.publisher Thesis (M.A.)-Bogazici University. Institute for Graduate Studies in Social Sciences, 2006.
dc.relation Includes appendices.
dc.subject.lcsh Data mining.
dc.subject.lcsh Web databases.
dc.subject.lcsh Management information systems.
dc.title WEB mining issues: topic finding and focused crawling evaluation
dc.format.pages x, 70 leaves;

