Crowd-labelling for continuous-valued annotations

Kara, Yunus Emre.

dc.contributor	Ph.D. Program in Computer Engineering.
dc.contributor.advisor	Akarun, Lale.
dc.contributor.author	Kara, Yunus Emre.
dc.date.accessioned	2023-03-16T10:13:53Z
dc.date.available	2023-03-16T10:13:53Z
dc.date.issued	2018.
dc.identifier.other	CMPE 2018 K37 PhD
dc.identifier.uri	http://digitalarchive.boun.edu.tr/handle/123456789/12627
dc.description.abstract	As machine learning gained immense popularity across a wide variety of domains in the last decade, it has become more important than ever to have fast and inexpensive ways to annotate vast amounts of data. With the emergence of crowdsourcing services, the research direction has gravitated toward putting ‘the wisdom of crowds’ to use. We call the process of crowdsourcing-based label collection crowd-labeling. In this thesis, we focus on crowd consensus estimation of continuous-valued labels. Unfortunately, spammers and inattentive annotators pose a threat to the quality and trustworthiness of the consensus. Thus, we develop Bayesian models taking different annotator behaviors into account and introduce two crowd-labeled datasets for evaluating our models. High-quality consensus estimation requires a meticulous choice of the candidate annotator and the sample in need of a new annotation. Due to time and budget limitations, it is beneficial to make this choice while collecting the annotations. To this end, we propose an active crowd-labeling approach for actively estimating consensus from continuous-valued crowd annotations. Our method is based on annotator models with unknown parameters, and Bayesian inference is employed to reach a consensus in the form of ordinal, binary, or continuous values. We introduce ranking functions for choosing the candidate annotator and sample pair for requesting an annotation. In addition, we propose a penalizing method for preventing annotator domination, investigate the explore-exploit trade-off for incorporating new annotators into the system, and study the effects of inducing a stopping criterion based on consensus quality. Experimental results on the benchmark datasets suggest that our method provides a budget- and time-sensitive solution to the crowd-labeling problem.
Finally, we introduce a multivariate model incorporating cross attribute correlations in multivariate annotations and present preliminary observations.
dc.format.extent	30 cm.
dc.publisher	Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2018.
dc.subject.lcsh	Machine learning.
dc.title	Crowd-labelling for continuous-valued annotations
dc.format.pages	xxi, 161 leaves ;