
Multivariate decision trees for machine learning


dc.contributor Graduate Program in Computer Engineering.
dc.contributor.advisor Alpaydın, Ethem.
dc.contributor.author Yıldız, Olcay Taner.
dc.date.accessioned 2023-03-16T10:00:07Z
dc.date.available 2023-03-16T10:00:07Z
dc.date.issued 2000.
dc.identifier.other CMPE 2000 Y55
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/12145
dc.description.abstract In this thesis, we detail and compare univariate, linear, and nonlinear decision tree methods using a set of simulations on twenty standard data sets. For univariate decision tree methods, we have used the ID3 algorithm, and for multivariate decision tree methods, we have used the CART algorithm. For linear and nonlinear methods, we have used neural networks at each decision node. We also propose using linear discriminant analysis (LDA) to construct linear multivariate trees. A univariate decision tree considers the value of only one feature at each decision node, leading to axis-aligned splits. In a linear multivariate decision tree, each decision node divides the input space into two with an arbitrary hyperplane, leading to oblique splits. In a nonlinear one, a multilayer perceptron at each node divides the input space arbitrarily, at the expense of increased complexity. We propose hybrid trees where a decision node may be linear or nonlinear depending on the outcome of a statistical test on accuracy. Our results indicate that if the data set is small and has few classes, a univariate technique does not overfit and can be sufficient, and the univariate ID3 performs better than multivariate linear methods. ID3 learns quickly and produces simple, interpretable rules. If the variables are highly correlated, the univariate method is not sufficient and we may resort to multivariate methods. We have shown that ID-LDA performs better than CART in terms of accuracy, node size, and, very significantly, learning time. It also has a shorter learning time than ID-LP with the same accuracy. ID-LDA generates smaller trees than ID3 and CART. This shows that, to generate a linear multivariate tree, ID-LDA is preferable to CART, and may be preferable to ID-LP if learning time is critical.
dc.format.extent 30 cm.+
dc.publisher Thesis (M.S.)- Bogazici University. Institute for Graduate Studies in Science and Engineering, 2000.
dc.relation Includes appendices.
dc.subject.lcsh Machine learning.
dc.subject.lcsh Multivariate analysis.
dc.title Multivariate decision trees for machine learning
dc.format.pages xiv, 136 leaves;
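
The abstract contrasts axis-aligned (univariate) splits with oblique (linear multivariate) splits and proposes LDA for choosing the hyperplane at a decision node. The following is a minimal sketch of that contrast at a single node, not the thesis's ID-LDA implementation. It assumes a two-class problem and textbook Fisher LDA; the function names (`fit_lda_split`, `linear_split`, `best_univariate_accuracy`) and the synthetic, highly correlated data are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_lda_split(X, y):
    """Two-class Fisher LDA: return (w, w0) for the oblique test w.x + w0 > 0."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # The pseudo-inverse tolerates a singular scatter matrix.
    w = np.linalg.pinv(Sw) @ (m1 - m0)
    # Place the hyperplane halfway between the two class means.
    w0 = -w @ (m0 + m1) / 2.0
    return w, w0

def linear_split(X, w, w0):
    """Oblique split: True where a sample falls on the class-1 side."""
    return X @ w + w0 > 0

def best_univariate_accuracy(X, y):
    """Best accuracy of any single axis-aligned test x[feature] > threshold."""
    best = 0.0
    for feature in range(X.shape[1]):
        for threshold in np.unique(X[:, feature]):
            pred = (X[:, feature] > threshold).astype(int)
            # Allow either side of the split to be labelled class 1.
            acc = max(np.mean(pred == y), np.mean(pred != y))
            best = max(best, acc)
    return best

# Synthetic data (an assumption): two classes lying on either side of the
# diagonal x1 = x0, so the two features are highly correlated.
t = np.linspace(-3.0, 3.0, 50)
jitter = 0.2 * (-1.0) ** np.arange(50)
X = np.vstack([
    np.column_stack([t, t + 1.0 + jitter]),  # class 0: above the diagonal
    np.column_stack([t, t - 1.0 + jitter]),  # class 1: below the diagonal
])
y = np.r_[np.zeros(50, dtype=int), np.ones(50, dtype=int)]

w, w0 = fit_lda_split(X, y)
lda_acc = np.mean(linear_split(X, w, w0).astype(int) == y)
uni_acc = best_univariate_accuracy(X, y)
print(f"best single axis-aligned split: {uni_acc:.2f}")
print(f"single LDA oblique split:       {lda_acc:.2f}")
```

On data like this, one oblique split separates the classes, while no single axis-aligned split can, which is the situation the abstract describes for highly correlated variables. A full tree would apply such a test recursively at each node.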

