Abstract:
Multiple Instance Learning (MIL) is a weakly supervised approach that focuses on labeling sets of instances (i.e. bags) where the labels of individual instances are generally unknown. Many of the earlier MIL studies rely on specific assumptions about the relationship between bag and instance labels and devise supervised learning approaches accordingly. Because of the ambiguity in instance labels, these studies fail to generalize to MIL problems with complex structures. To avoid these problems, researchers focus on embedding instance-level information to learn bag representations. In this context, dissimilarity-based representations are known to generalize well. This thesis proposes a novel framework in which each bag is represented by its dissimilarities to a set of prototypes. The framework consists of learning mechanisms that provide fast and competitive results compared to the existing distance-based approaches on extensive benchmark data sets. The first approach is a simple model that generates prototypes from a given MIL data set. We aim to find prototypes in the feature space that map the collections of instances (i.e. bags) to a distance feature space, and to simultaneously learn a linear classifier for MIL. The second proposal is a tree-based ensemble learning strategy that avoids complex tuning processes and heavy computational costs without sacrificing accuracy. The framework is enriched with the integration of the methods, a parameter selection strategy, and an ensemble design. Furthermore, the proposed methods are extended to the regression domain, namely Multiple Instance Regression (MIR). MIR is a less commonly studied area in which the bag labels are real-valued instead of classes. The experiments show that all proposals perform better than the state-of-the-art approaches in the literature.
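
To illustrate the idea of representing a bag by its dissimilarities to prototypes, the following is a minimal sketch and not the method proposed in the thesis. It assumes that the dissimilarity between a bag and a prototype is the minimum Euclidean distance from any instance in the bag to that prototype, and that the prototypes are fixed by hand; the thesis learns them, and may use a different dissimilarity. The function names `bag_to_prototype_distances` and `embed_bags` are hypothetical.

```python
import numpy as np


def bag_to_prototype_distances(bag, prototypes):
    """Map one bag to a vector with one dissimilarity per prototype.

    bag:        array of shape (n_instances, n_features)
    prototypes: array of shape (n_prototypes, n_features)

    Assumption: bag-to-prototype dissimilarity is the minimum
    Euclidean distance over the instances in the bag.
    """
    # Pairwise differences, shape (n_instances, n_prototypes, n_features)
    diffs = bag[:, None, :] - prototypes[None, :, :]
    # Euclidean distances, shape (n_instances, n_prototypes)
    dists = np.linalg.norm(diffs, axis=2)
    # Keep the closest instance for each prototype
    return dists.min(axis=0)


def embed_bags(bags, prototypes):
    """Stack the per-bag vectors into an (n_bags, n_prototypes) matrix."""
    return np.vstack([bag_to_prototype_distances(b, prototypes) for b in bags])


# Toy data: two bags of different sizes and two hand-picked prototypes.
bags = [
    np.array([[0.0, 0.0], [1.0, 1.0]]),
    np.array([[3.0, 4.0]]),
]
prototypes = np.array([[0.0, 0.0], [3.0, 0.0]])

embedding = embed_bags(bags, prototypes)
print(embedding)
```

Each row of `embedding` is a fixed-length representation of one bag, regardless of how many instances the bag contains, so any standard classifier or regressor that accepts feature vectors (for example a linear model or a tree ensemble) can be trained on it with the bag labels.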