Abstract:
Automatic text summarization is the task of generating a compact and coherent version of a given text document or set of text documents. Although there is a vast number of studies on automatic document summarization for English, there is only a limited number of studies for other languages, especially Turkish. Text simplification aims to reduce the grammatical or lexical complexity of sentences. Automatic text simplification systems can be an important part of any NLP task to improve system performance. In this thesis, we analyzed the effects of applying different levels of stemming, such as fixed-length word truncation and morphological analysis, and the effects of applying text simplification techniques to multi-document summarization (MDS) for Turkish, which is an agglutinative and morphologically rich language. We constructed a manually annotated MDS data set and, to the best of our knowledge, reported the first results on Turkish MDS. Additionally, we developed a rule-based text simplification system for Turkish that utilizes the syntactic features of sentences to identify simplification patterns. Our results show that a simple fixed-length word truncation approach performs slightly better than no stemming, whereas applying complex morphological analysis does not improve Turkish MDS in terms of ROUGE scores. Applying simplification rules that split complex sentences into individual simpler sentences as a preprocessing step slightly improves summarization performance, whereas applying a compression-based simplification approach relying solely on rule matching decreases the obtained ROUGE scores.
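
As an illustration of the fixed-length word truncation mentioned above, the following is a minimal sketch rather than the implementation used in the thesis. The function names and the prefix length of 5 are assumptions chosen for the example; the abstract does not state which lengths were evaluated.

```python
from typing import List


def truncate_stem(word: str, prefix_len: int = 5) -> str:
    """Return the first prefix_len characters of a word as its pseudo-stem.

    prefix_len = 5 is a hypothetical default, not a value taken from the thesis.
    Words shorter than prefix_len are returned unchanged.
    """
    return word[:prefix_len]


def truncate_tokens(tokens: List[str], prefix_len: int = 5) -> List[str]:
    """Apply fixed-length truncation to every token in a sentence."""
    return [truncate_stem(token, prefix_len) for token in tokens]


# Inflected forms of "kitap" (book) collapse to the same pseudo-stem.
stems = truncate_tokens(["kitap", "kitaplar", "kitaplarımızdan"])
```

Because Turkish is agglutinative, suffixes accumulate at the end of a word, so a short prefix often coincides with the root. This is one plausible reason such a simple approach can compete with full morphological analysis.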