Archives and Documentation Center
Digital Archives

Hate speech detection in Turkish news using a transformer-based model enhanced with linguistic features

dc.contributor Graduate Program in Computer Engineering.
dc.contributor.advisor Özgür, Arzucan.
dc.contributor.author Yüksel, Atıf Emre.
dc.date.accessioned 2023-10-15T06:40:59Z
dc.date.available 2023-10-15T06:40:59Z
dc.date.issued 2022
dc.identifier.other CMPE 2022 Y85
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/19690
dc.description.abstract Hate speech directed at ethnicities, nationalities, religious identities, and specific groups has increased not only in social media, but also in print media. This creates a need for automated hate speech detection systems that can quickly review and filter print media content before it is provided to readers if it contains hate speech. However, most of the existing automatic hate speech detection models are limited to detecting hate speech without considering the hate speech target group-specific discourse that is often used in news articles. Moreover, there are few datasets that include Turkish print media articles in the hate speech domain. In this study, a new BERT-based model enriched with a set of target-oriented linguistic features for hate speech detection is proposed. The effects of weighting different BERT hidden vectors are also investigated, instead of using only the first hidden vector of the BERT encoder, which is the classical approach. New BERT-based models that integrate different attention techniques are proposed for combining hidden vectors. A new preprocessed Turkish dataset for hate speech is also published, in which the target group for all hate speech articles is annotated. Experiments on a comprehensive Turkish dataset of news articles labeled for hate speech show that competitive performance in terms of accuracy and F1-score is achieved compared to previous approaches.
dc.publisher Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2022.
dc.subject.lcsh Linguistic geography.
dc.subject.lcsh Hate speech -- Social aspects -- Turkey.
dc.subject.lcsh Turkish newspapers.
dc.title Hate speech detection in Turkish news using a transformer-based model enhanced with linguistic features
dc.format.pages xiii, 54 leaves

