Abstract:
High processor utilization, which has a significant impact on total system performance, has become a popular research target in computer science. Since processors are extremely fast and much more expensive than other hardware components, increasing their utilization is critical for the efficiency of computer systems. One of the main reasons processors sit idle is memory stalls during data or instruction references to the memory hierarchy. Although the fastest memory components and caching technologies are used to decrease access latency, the gap between memory system and CPU speed has been growing rapidly. In addition to the development of fast memory components, which decrease miss latency and increase bandwidth, many techniques have been proposed to increase the hit ratio. Prefetchers are one such technique: they provide memory-level parallelism by fetching blocks of data into the cache in advance of CPU requests, in the hope of increasing the cache hit ratio. Since caches are the fastest components in the memory hierarchy, increasing their hit ratio hides memory latency and therefore increases processor utilization. Scheduling threads according to their cache locality also increases data sharing and cache utilization in multithreaded systems. In this work, we focused on these two areas of memory optimization. We proposed a new hardware prefetcher model and a context-switching heuristic for threads in multithreaded systems. We also implemented a multilevel cache simulator to test these ideas.