Abstract:
This thesis addresses the morphological analysis of Turkish. Turkish is an agglutinative language, and because of this its morphology is quite complex and includes many exceptional cases. Most recent research on Turkish morphology has been limited to a partial treatment of the language. These studies have concentrated mainly on explaining and representing the basic rules. The main objective of this thesis is to bring the full morphological structure of Turkish to light and to build its computer representation. Until this analysis is carried out, syntactic or semantic parsing of the language is practically impossible. In this study, we divide the analysis of the morphology into two interrelated parts: morphophonemic analysis and morphotactic analysis. We investigate and define the morphological structure for both parts, and then combine them in the Augmented Transition Network (ATN) formalism. The result is a formal representation of the Turkish morphological structure. The proposed structure forms a basis for language applications for Turkish. Among these applications, we design and implement a morphological parser and a spelling checker that incorporates a spelling corrector component. We perform a statistical analysis of Turkish based on this morphological representation and the implemented programs. This analysis consists of two parts: lexical and morphological analysis, and corpus analysis. The first uses information about the structural parts of the language. The second deals with the daily usage of the language. For this purpose, we form a corpus and run the spelling checker program on it.
Key words: Computational linguistics, Natural language processing, Morphological analysis, Turkish, Augmented transition networks, Spelling checking, Corpus