Model based multiple audio sequence alignment

Başaran, Doğaç.

dc.contributor	Ph.D. Program in Electrical and Electronic Engineering.
dc.contributor.advisor	Anarım, Emin.
dc.contributor.advisor	Cemgil, Ali Taylan.
dc.contributor.author	Başaran, Doğaç.
dc.date.accessioned	2023-03-16T10:25:12Z
dc.date.available	2023-03-16T10:25:12Z
dc.date.issued	2015.
dc.identifier.other	EE 2015 B37 PhD
dc.identifier.uri	http://digitalarchive.boun.edu.tr/handle/123456789/13126
dc.description.abstract	With the proliferation of recording devices such as smartphones, it is increasingly common for an occasion to be recorded by multiple individuals. When properly aligned, these recordings can provide several audio and visual perspectives on a scene, which enables applications in restoration, remastering and remixing in various fields. In this study, we treat the problem of aligning multiple unsynchronized audio sequences in a probabilistic framework. We propose a novel, model-based approach in which we define a template generative model. Using this template, we define six generative models that cover essentially all kinds of features (real-valued, positive, binary and categorical). Proper scoring functions that evaluate the quality of an alignment are derived from each model, where we are able to penalize non-overlapping alignments and the alignment of a single sequence against a set of pre-aligned sequences. Having defined a cost or score function, we propose a heuristic sequential search algorithm and a Gibbs sampler approach to find the optimum alignment of sequences on the surfaces defined by the derived score functions. In addition, we propose a multi-resolution alignment algorithm that combines Sequential Monte Carlo (SMC) samplers with the proposed sequential search method. The models and appropriate features are exhaustively evaluated with artificial and real-life data sets. The simulation results suggest that the approach is able to handle difficult, ambiguous scenarios and partial matchings where simple baseline methods such as correlation fail.
dc.format.extent	30 cm.
dc.publisher	Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2015.
dc.subject.lcsh	Information storage and retrieval systems.
dc.title	Model based multiple audio sequence alignment
dc.format.pages	xviii, 116 leaves ;
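
Note on the correlation baseline: the abstract names correlation as the simple baseline against which the model-based approach is compared. The sketch below illustrates what such a baseline might look like for two sequences; it is not the thesis's method. The function names `correlation_score` and `estimate_offset` are hypothetical, and the sketch assumes both sequences are plain lists of real-valued features sampled at the same rate.

```python
def correlation_score(ref, other, lag):
    """Sum of products over the overlap when other[0] is placed at ref[lag].

    A negative lag means `other` starts before `ref` does.
    """
    start = max(0, lag)
    stop = min(len(ref), lag + len(other))
    return sum(ref[i] * other[i - lag] for i in range(start, stop))


def estimate_offset(ref, other):
    """Return the lag with the highest raw correlation score.

    Every lag with at least one overlapping sample is tried, so the
    cost is O(len(ref) * len(other)); this is only meant for short,
    illustrative inputs.
    """
    lags = range(-(len(other) - 1), len(ref))
    return max(lags, key=lambda lag: correlation_score(ref, other, lag))
```

Because the raw score sums products only over the overlapping samples, it does not by itself account for how long the overlap is, and it compares just two sequences at a time. The abstract states that the thesis's derived score functions address non-overlapping and multi-sequence cases.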