dc.description.abstract |
Economic events receive great attention from the information retrieval community. As one popular practice, language models applied to economy-related textual data have proven advantageous for anticipating economic events. However, studies of the Turkish stock market that use textual sources remain limited, as language models focus on widely spoken languages. Fortunately, the Transformer architecture marked a significant step forward for language models, and, with the help of transfer learning, its novel methodology has widened the horizons of Natural Language Processing (NLP) studies for over 100 languages. Accordingly, this study aims to combine both the latest advances and the traditional methods of NLP with machine learning classifiers to predict the stock movements of companies publicly traded on the BIST market, using their official disclosures. To this end, 69,806 material event disclosures of BIST companies are fetched from the Public Disclosure Platform (KAP) and labeled with stock movement directions. In the experiments, announcements are represented with Term Frequency-Inverse Document Frequency (TF-IDF) vectors and Bidirectional Encoder Representations from Transformers (BERT) embeddings, and are classified with six different learners: Multinomial Naïve Bayes, Logistic Regression, Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Categorical Boosting (CatBoost), and the pre-trained classification layer of the Turkish BERT model, BERTurk. While all setups yielded promising results, the best performance is delivered by LightGBM on TF-IDF, with a 39.7% macro-F1 score. |
|