Feature selection and transfer learning algorithms with applications on credit risk analysis

Bozkurt Gönen, Gül Efşan.

Collection: Boğaziçi Üniversitesi Tezleri, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği, M.S. Theses

URI: http://digitalarchive.boun.edu.tr/handle/123456789/12220

Date: 2012.

Abstract:

Many financial organizations, such as banks and retailers, rely heavily on computational credit risk analysis (CRA) tools because of recent financial crises and stricter regulations. This strategy enables them to manage their financial and operational risks within the pool of financial institutions. Machine learning algorithms, especially binary classifiers, are very popular for this purpose. In real-life applications such as CRA, feature selection algorithms are used to decrease data acquisition cost and to increase the interpretability of the decision process. Applying feature selection methods directly to CRA data sets may not help because of categorical variables such as marital status. Such variables are usually converted into binary features using 1-of-k encoding, and eliminating only a subset of the features in a group neither reduces data collection cost nor improves interpretability, since the original variable must still be collected. In this thesis, we propose to use the probit classifier with a proper prior structure and multiple kernel learning with a proper kernel construction procedure to perform group-wise feature selection. Experiments on two standard CRA data sets show the validity and effectiveness of the proposed binary classification algorithm variants. Robustness against dynamic conditions such as currency changes is another important property for CRA systems. They should perform reasonably well with limited data after such changes, and the best strategy is to exploit existing data using transfer learning. We also extend the probit classifier towards transfer learning by mapping different data sets into a unified subspace and learning a common classifier. Experiments on two standard CRA data sets show the usefulness of transfer learning for such cases.
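
The point about 1-of-k encoding can be illustrated with a minimal sketch. It is not the thesis's method: the records, the column names, and the helpers `one_of_k` and `columns_to_collect` are all hypothetical, chosen only to show why dropping part of a group of binary features saves no data collection, while dropping the whole group does.

```python
# Toy illustration only; the data and helper names are assumptions, not from the thesis.

def one_of_k(records, column):
    """Encode one categorical column as 1-of-k binary features."""
    levels = sorted({r[column] for r in records})
    names = [f"{column}={level}" for level in levels]
    rows = [[1 if r[column] == level else 0 for level in levels] for r in records]
    return names, rows


def columns_to_collect(selected_features):
    """Return the original columns still needed to compute the selected binary features."""
    return sorted({name.split("=")[0] for name in selected_features})


records = [
    {"marital_status": "single", "housing": "rent"},
    {"marital_status": "married", "housing": "own"},
    {"marital_status": "divorced", "housing": "rent"},
]

names_m, rows_m = one_of_k(records, "marital_status")
names_h, rows_h = one_of_k(records, "housing")

# Feature-wise selection: only part of the marital_status group is dropped,
# so marital status must still be acquired for every applicant.
partial = ["marital_status=married", "housing=own", "housing=rent"]

# Group-wise selection: the whole marital_status group is dropped,
# so the variable no longer has to be acquired at all.
groupwise = ["housing=own", "housing=rent"]

print(columns_to_collect(partial))
print(columns_to_collect(groupwise))
```

In this toy setting the feature-wise selection still requires both original columns, whereas the group-wise selection requires only `housing`.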
