Abstract:
As digital technologies have expanded the range of online communication, textual affect analysis has become the focus of considerable interest in several disciplines. While many studies have been conducted for English, only a few tools are specific to the Turkish language. One reason is the lack of lexical resources for affect analysis and of annotated corpora for accurate evaluation. In this thesis, we develop an approach for continuous, dimensional affect analysis of Turkish communications by combining several tools. We conduct experiments on various Turkish text corpora, including online multi-party chat records, psychotherapy records, Twitter data, movie reviews, and teachers' comments on high school students. Analyzing such texts brings challenges such as non-standard word usage, grammatical irregularities, abbreviations, and spelling mistakes. We propose several pre-processing steps to deal with these challenges. Then, we adapt an affective word dictionary from English to Turkish and, by expanding it with synsets, obtain 15,200 words annotated for valence, arousal, and dominance. We also employ lists of frequently used abbreviations, emoticons, interjections, modifiers (intensifiers and diminishers), and other linguistic indicators to capture the overall affective state at the sentence level. We recruit and train annotators to obtain affective ground truth specifically for multi-party chat records. Our results show that the proposed system is useful, yet there is much room for improvement at different stages.
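To make the sentence-level idea concrete, the following is a minimal sketch of lexicon-based scoring with modifiers. It is not the thesis implementation. The lexicon entries, their valence/arousal/dominance values, the modifier weights, the 0 to 1 scale with a neutral point of 0.5, and the names `TOY_LEXICON`, `TOY_MODIFIERS`, `scale`, and `sentence_affect` are all assumptions invented for illustration. Averaging the scored words is likewise one simple choice of aggregation, not a method confirmed by the text.

```python
from typing import Dict, List, Optional, Tuple

VAD = Tuple[float, float, float]  # (valence, arousal, dominance)

# Hypothetical toy lexicon: the scores are invented for illustration and
# are NOT taken from the dictionary described in the abstract.
TOY_LEXICON: Dict[str, VAD] = {
    "güzel": (0.8, 0.5, 0.6),  # "beautiful"
    "kötü": (0.2, 0.6, 0.4),   # "bad"
}

# Hypothetical modifier weights: > 1 intensifies, < 1 diminishes.
TOY_MODIFIERS: Dict[str, float] = {
    "çok": 1.5,    # "very"
    "biraz": 0.5,  # "a little"
}

NEUTRAL = 0.5  # assumed midpoint of the assumed [0, 1] scale


def scale(vad: VAD, factor: float) -> VAD:
    """Move each dimension away from (or toward) neutral, clamped to [0, 1]."""
    v, a, d = (
        min(1.0, max(0.0, NEUTRAL + (x - NEUTRAL) * factor)) for x in vad
    )
    return (v, a, d)


def sentence_affect(sentence: str) -> Optional[VAD]:
    """Average the VAD scores of lexicon words in a sentence.

    A modifier applies only to the token that immediately follows it.
    Returns None when no token is found in the lexicon.
    """
    scored: List[VAD] = []
    factor = 1.0
    for raw in sentence.lower().split():
        token = raw.strip(".,!?")
        if token in TOY_MODIFIERS:
            factor = TOY_MODIFIERS[token]
            continue
        if token in TOY_LEXICON:
            scored.append(scale(TOY_LEXICON[token], factor))
        factor = 1.0  # reset after any non-modifier token
    if not scored:
        return None
    n = len(scored)
    return (
        sum(s[0] for s in scored) / n,
        sum(s[1] for s in scored) / n,
        sum(s[2] for s in scored) / n,
    )
```

Under these assumptions, `sentence_affect("çok güzel")` pushes the valence of "güzel" further from neutral than `sentence_affect("güzel")` does, while a sentence with no lexicon words returns `None` rather than a neutral score. A real system would also need the pre-processing steps mentioned above (spelling normalization, abbreviation expansion) before any lookup.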