Archives and Documentation Center
Digital Archives

Application mapping and optimization for CMP based architectures


dc.contributor Ph.D. Program in Computer Engineering.
dc.contributor.advisor Tosun, Oğuz.
dc.contributor.advisor Topçuoğlu, Haluk Rahmi.
dc.contributor.author Demiröz, Betül.
dc.date.accessioned 2023-03-16T10:13:34Z
dc.date.available 2023-03-16T10:13:34Z
dc.date.issued 2011.
dc.identifier.other CMPE 2011 D44 PhD
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/12571
dc.description.abstract Chip multiprocessors (CMPs) are becoming the standard building blocks for personal computers as well as large-scale parallel machines, including supercomputers. In this thesis, our main focus is on performance-aware mapping and optimization of application threads onto multicore architectures. Specifically, we propose three approaches: a data-to-core mapping methodology, a thread-to-core mapping methodology, and a cache-centric data assignment methodology that includes data-to-thread mapping. To demonstrate the data-to-core mapping methodology, we propose two novel parallel formulations of the Barnes-Hut method on the Cell Broadband Engine architecture that take into account the technical specifications and limitations of the Cell architecture. Our experimental evaluation indicates that the Barnes-Hut method performs much faster on the Cell architecture than on the reference architecture, an Intel Xeon based system. To present the thread-to-core mapping methodology, we propose a framework that uses helper threads running in parallel with application threads; these helper threads dynamically observe the behavior of the application threads and their data access patterns. They calculate data sharing among application threads, cluster the threads to be mapped to available cores, use cache counters to calculate the efficiency of a mapping, and make the mapping decision after considering the execution needs. Our final methodology provides a locality-aware mapping algorithm, which aims to assign computations of an application with similar data access patterns to the same core. The algorithm divides the computations of the application into chunks to provide load balancing, and sets of chunks with high similarity are grouped into bins to provide data locality. We consider sparse matrix-vector multiplication as the reference application.
dc.format.extent 30 cm.
dc.publisher Thesis (Ph.D.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2011.
dc.relation Includes appendices.
dc.subject.lcsh Operating systems (Computers).
dc.title Application mapping and optimization for CMP based architectures
dc.format.pages xvi, 107 leaves
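
Illustrative note on the abstract: the chunk-and-bin idea described for sparse matrix-vector multiplication can be sketched roughly as follows. This is not the thesis's algorithm, whose details are not given in this record. It assumes a CSR matrix layout, contiguous row chunks sized by nonzero count, Jaccard overlap of column indices as the similarity measure, and a greedy assignment of chunks to bins (one bin per core). All function names and parameters are hypothetical.

```python
from math import ceil
from typing import List, Set, Tuple

Chunk = Tuple[int, int]  # half-open row range [lo, hi)


def make_chunks(row_ptr: List[int], target_nnz: int) -> List[Chunk]:
    """Split rows into contiguous chunks of roughly target_nnz nonzeros
    (assumed load-balancing rule, not taken from the thesis)."""
    chunks, start = [], 0
    n_rows = len(row_ptr) - 1
    for r in range(n_rows):
        if row_ptr[r + 1] - row_ptr[start] >= target_nnz:
            chunks.append((start, r + 1))
            start = r + 1
    if start < n_rows:
        chunks.append((start, n_rows))
    return chunks


def chunk_columns(row_ptr: List[int], col_idx: List[int], chunk: Chunk) -> Set[int]:
    """Column indices a chunk touches, i.e. the entries of x it reads."""
    lo, hi = chunk
    return set(col_idx[row_ptr[lo]:row_ptr[hi]])


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Assumed similarity measure: overlap of accessed columns."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_into_bins(col_sets: List[Set[int]], n_bins: int) -> List[List[int]]:
    """Greedily place each chunk in the most similar bin that still has room;
    ties go to the emptier bin so dissimilar chunks spread out."""
    capacity = ceil(len(col_sets) / n_bins)
    bins: List[List[int]] = [[] for _ in range(n_bins)]
    bin_cols: List[Set[int]] = [set() for _ in range(n_bins)]
    for c, cols in enumerate(col_sets):
        open_bins = [b for b in range(n_bins) if len(bins[b]) < capacity]
        best = max(open_bins, key=lambda b: (jaccard(cols, bin_cols[b]), -len(bins[b])))
        bins[best].append(c)
        bin_cols[best] |= cols
    return bins


def spmv_by_bins(row_ptr, col_idx, values, x, chunks, bins) -> List[float]:
    """Compute y = A x bin by bin; each bin stands in for the work of one core
    (executed sequentially here for simplicity)."""
    y = [0.0] * (len(row_ptr) - 1)
    for bin_chunks in bins:
        for c in bin_chunks:
            lo, hi = chunks[c]
            for r in range(lo, hi):
                y[r] = sum(values[k] * x[col_idx[k]]
                           for k in range(row_ptr[r], row_ptr[r + 1]))
    return y
```

In this sketch, chunks that read the same entries of the input vector land in the same bin, so a core working through its bin would tend to reuse cached data, while the per-bin capacity keeps the chunk count per core even.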

