dc.description.abstract |
Chip Multiprocessors (CMPs) are becoming the standard and primary building blocks for personal computers as well as large-scale parallel machines, including supercomputers. In this thesis, our main focus is on the performance-aware mapping and optimization of application threads onto multicore architectures. Specifically, we propose three approaches: a data-to-core mapping methodology, a thread-to-core mapping methodology, and a cache-centric data assignment methodology that includes data-to-thread mapping. To demonstrate the data-to-core mapping methodology, we propose two novel parallel formulations of the Barnes-Hut method for the Cell Broadband Engine architecture, taking into account the technical specifications and limitations of the Cell architecture. Our experimental evaluation indicates that the Barnes-Hut method runs much faster on the Cell architecture than on the reference architecture, an Intel Xeon based system. To present the thread-to-core mapping methodology, we propose a framework that uses helper threads running in parallel with the application threads; these helper threads dynamically observe the behavior of the application threads and their data access patterns, compute the data sharing among application threads, cluster the threads for mapping onto the available cores, use cache counters to evaluate the efficiency of a mapping, and make the mapping decision after considering the execution needs. Our final methodology provides a locality-aware mapping algorithm, which aims to assign computations of an application with similar data access patterns to the same core. The algorithm divides the computations of the application into chunks to provide load balancing, and groups sets of highly similar chunks into bins to provide data locality. We use sparse matrix-vector multiplication as the reference application. |
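
A rough, purely illustrative sketch of the chunk-and-bin idea behind the locality-aware mapping summarized above (the CSR row chunking, the Jaccard similarity over column footprints, and the greedy binning policy are assumptions made for this sketch, not the algorithm developed in the thesis):

# Purely illustrative sketch (not the thesis's algorithm): group SpMV row chunks
# into bins by the similarity of the input-vector elements (column indices) they
# access, so chunks with similar data access patterns land in the same bin/core.
# The Jaccard measure, chunk size, and greedy binning policy are assumptions.
from scipy.sparse import random as sparse_random


def chunk_footprints(csr, chunk_rows):
    """Split the matrix into row chunks; record the set of columns each chunk touches."""
    footprints = []
    for start in range(0, csr.shape[0], chunk_rows):
        end = min(start + chunk_rows, csr.shape[0])
        cols = csr.indices[csr.indptr[start]:csr.indptr[end]]  # columns read by rows [start, end)
        footprints.append(set(cols.tolist()))
    return footprints


def jaccard(a, b):
    return len(a & b) / len(a | b) if (a or b) else 0.0


def bin_chunks(footprints, num_bins):
    """Greedily place each chunk into the bin whose accumulated column footprint it overlaps most."""
    bins = [{"chunks": [], "cols": set()} for _ in range(num_bins)]
    order = sorted(range(len(footprints)), key=lambda i: -len(footprints[i]))
    for i in order:
        # Prefer the most similar bin; break ties toward the emptier bin for load balance.
        best = max(range(num_bins),
                   key=lambda b: (jaccard(footprints[i], bins[b]["cols"]), -len(bins[b]["chunks"])))
        bins[best]["chunks"].append(i)
        bins[best]["cols"] |= footprints[i]
    return bins


if __name__ == "__main__":
    A = sparse_random(1024, 1024, density=0.01, format="csr", random_state=0)
    for b, entry in enumerate(bin_chunks(chunk_footprints(A, chunk_rows=64), num_bins=4)):
        print("bin %d: %d chunks, %d distinct columns" % (b, len(entry["chunks"]), len(entry["cols"])))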
|