Archives and Documentation Center
Digital Archives

Predicting stock movements with machine learning using textual data


dc.contributor Graduate Program in Management Information Systems.
dc.contributor.advisor Durahim, Ahmet Onur.
dc.contributor.author Özdemir, Meryem.
dc.date.accessioned 2023-03-16T12:51:33Z
dc.date.available 2023-03-16T12:51:33Z
dc.date.issued 2020.
dc.identifier.other MIS 2020 O84
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/18105
dc.description.abstract Economic events receive great attention from the information retrieval community. As one popular practice, language models trained on economy-related textual data have proven advantageous for anticipating economic events. However, studies of the Turkish stock market using textual sources are still limited, as language models focus on widely used languages. Fortunately, the Transformer architecture marked a significant step for language models, and its novel methodology, with the help of transfer learning, widened the horizons of Natural Language Processing (NLP) studies for over 100 languages. Accordingly, this study aims to combine both the latest advances and the traditional methods of NLP with machine learning classifiers to predict the stock movements of companies publicly traded on the BIST market, using their official disclosures. To this end, 69,806 material event disclosures of BIST companies were fetched from the Public Disclosure Platform (KAP) and labeled with stock movement directions. In the experiments, announcements are represented with Term Frequency-Inverse Document Frequency (TFIDF) vectors and Bidirectional Encoder Representations from Transformers (BERT) embeddings and classified with six different learners, namely Multinomial Naïve Bayes, Logistic Regression, Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Categorical Boosting (CatBoost), and the pre-trained classification layer of the Turkish version of BERT, namely BERTurk. While all setups yielded promising results, the best performance was delivered by LightGBM on TFIDF, with a 39.7% F1-macro score.
dc.format.extent 30 cm.
dc.publisher Thesis (M.A.) - Bogazici University. Institute for Graduate Studies in the Social Sciences, 2020.
dc.subject.lcsh Stock exchanges -- Computer simulation.
dc.subject.lcsh Stock price forecasting.
dc.subject.lcsh Machine learning -- Mathematical models.
dc.title Predicting stock movements with machine learning using textual data
dc.format.pages x, 86 leaves ;
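
The abstract reports its headline result as an F1-macro score. As a rough illustration of what that metric measures, the sketch below computes the unweighted mean of per-class F1 in plain Python. The function name `macro_f1` and the three direction labels ("up", "down", "flat") are assumptions made for this example; the record does not state how many movement classes the thesis uses or how it computed the score.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only (not data from the thesis)
y_true = ["up", "up", "down", "down", "flat", "flat"]
y_pred = ["up", "down", "down", "down", "flat", "up"]
print(round(macro_f1(y_true, y_pred), 4))
```

Because every class contributes equally to the average regardless of how often it occurs, a macro-averaged score penalizes a classifier that performs well only on the most frequent movement direction.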

