Archives and Documentation Center
Digital Archives

Biclustering using nonparametric Bayesian methods

dc.contributor Graduate Program in Computer Engineering.
dc.contributor.advisor Cemgil, Ali Taylan.
dc.contributor.author Çelik, Safiye.
dc.date.accessioned 2023-03-16T10:01:11Z
dc.date.available 2023-03-16T10:01:11Z
dc.date.issued 2012.
dc.identifier.other CMPE 2012 C48
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/12225
dc.description.abstract Multiway clustering is a popular analysis method because of its many potential applications. Various techniques have been developed to cluster different entities of a data matrix simultaneously by taking relational entries into account. Many of those techniques assume that the number of clusters to be discovered is known prior to the clustering operation. However, in real-world problems we have limited knowledge about the number of clusters before discovering them. Nonparametric methods, on the other hand, perform biclustering and learn the number of clusters concurrently. In this thesis, we introduce two nonparametric Bayesian biclustering methods that are applicable to two-way data. In the first method we model the rows and columns of the two-way data using Dirichlet Process Mixture Models and cluster them simultaneously, whereas in the second one we cluster the entities separately after applying spectral matrix decomposition to the data. We apply the biclustering algorithms to four different datasets: a simulated dataset created by a generative Gaussian model, a dataset of animals and their attributes, a cross-national trade and diplomacy dataset with five different relational networks, and a biological dataset from a microarray study of lung cancer. Since few real-world datasets are annotated with ground-truth biclusters, we generally use link prediction to evaluate biclustering performance. We randomly remove data entries and predict them on the assumption that entries in the same bicluster are similar to each other. The first biclustering method achieves higher accuracy since it makes use of all relational information in the data, while the spectral method reduces the dimensionality of the data prior to the clustering operation. On the other hand, the computational complexity of the spectral method is far lower because it has fewer data entries to process.
dc.format.extent 30 cm.
dc.publisher Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2012.
dc.relation Includes appendices.
dc.subject.lcsh Bayesian statistical decision theory.
dc.title Biclustering using nonparametric Bayesian methods
dc.format.pages xiv, 78 leaves ;

