Better methods for configuring case-based reasoning systems

Kocagüneli, Ekrem.

dc.contributor	Graduate Program in Computer Engineering.
dc.contributor.advisor	Bener, Ayşe B.
dc.contributor.author	Kocagüneli, Ekrem.
dc.date.accessioned	2023-03-16T10:00:09Z
dc.date.available	2023-03-16T10:00:09Z
dc.date.issued	2010.
dc.identifier.other	CMPE 2010 K63
dc.identifier.uri	http://digitalarchive.boun.edu.tr/handle/123456789/12148
dc.description.abstract	Software effort estimation has been one of the major challenges in software engineering, and previous research has mainly focused on addressing the large deviation problem in estimations by improving the prediction accuracy of models. These models are evaluated using measures such as MRE or pred(r), all of which assess the models on the basis of overall prediction accuracy. Practitioners and researchers require a software effort estimation model with the following properties: 1) it supports understanding of the data that is used to build the model and 2) it provides accurate estimations. In our study, we adapt the greedy agglomerative clustering (GAC) algorithm to the software effort estimation domain and use it as an analogy-based estimator to build our model: Tree Estimation and Assessment Knowledge (TEAK). With the GAC-based model, TEAK, we are able to provide the number of analogies (k) to be used for each individual test project and obtain lower MRE values than any other k-based method on all datasets. Case-based reasoning (CBR) methods face multiple problems, such as feature subset selection, scaling, the choice of similarity measure and the number of analogies to use (a suitable k value) [1]. Because our intention in this research was to focus on the problem of finding a suitable k value, we do not address the other CBR-related problems and restrict ourselves to the dynamic selection of a suitable k value for each test instance. With TEAK it is possible to better understand the data on which effort estimation is to be done and to use a different number of analogies (k value) for each test instance. TEAK prunes irrelevant analogies in the training set for a test project and thereby finds the number of analogies to be used during estimation. This approach has outperformed all other k-based CBR methods in terms of predictive accuracy, by up to more than 100%.
dc.format.extent	30cm.
dc.publisher	Thesis (M.S.)-Bogazici University. Institute for Graduate Studies in Science and Engineering, 2010.
dc.subject.lcsh	Case-based reasoning.
dc.title	Better methods for configuring case-based reasoning systems
dc.format.pages	xiv, 63 leaves;
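
The abstract refers to MRE, pred(r) and the number of analogies k without defining them. The sketch below illustrates these terms with a plain fixed-k analogy-based estimator; it is not TEAK and does not reproduce the thesis's GAC-based pruning. It assumes the usual definitions, MRE = |actual - predicted| / actual and pred(r) = the share of estimates whose MRE is at most r, with r given as a fraction. The function names, the Euclidean distance and the toy projects are all hypothetical and chosen only for illustration.

```python
import math

def mre(actual, predicted):
    """Magnitude of relative error for a single estimate."""
    return abs(actual - predicted) / actual

def pred(actuals, predictions, r=0.25):
    """Share of estimates whose MRE is at most r (r as a fraction, assumed)."""
    hits = sum(1 for a, p in zip(actuals, predictions) if mre(a, p) <= r)
    return hits / len(actuals)

def estimate_by_analogy(train, query, k):
    """Mean effort of the k training projects closest to the query.

    train is a list of (features, effort) pairs. Euclidean distance is an
    assumption here; the abstract does not fix a similarity measure.
    """
    ranked = sorted(train, key=lambda case: math.dist(case[0], query))
    nearest = ranked[:k]
    return sum(effort for _, effort in nearest) / len(nearest)

# Hypothetical toy projects: (size feature,), effort
train = [((1.0,), 10.0), ((2.0,), 20.0), ((10.0,), 100.0)]
query = (1.5,)

estimate_k2 = estimate_by_analogy(train, query, k=2)  # two closest analogies
estimate_k3 = estimate_by_analogy(train, query, k=3)  # also pulls in the distant project
```

With this toy data, moving from k=2 to k=3 draws a distant project into the average and shifts the estimate considerably, which is the sensitivity to k that motivates choosing k separately for each test instance.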