Design and Performance of CPU Cache Memories

Professor A. J. Smith

MICRO, Mitsubishi Electric, (NSF) MIP-9116578, Philips Laboratories/Signetics, and Sun Microsystems

Memory hierarchy performance, and cache memory performance in particular, is the limiting factor in most, if not all, modern computers. The machine cycle time is usually limited by the cache access time. Cache misses and slow main memory access significantly increase the mean instruction execution time. Shared-bus multiprocessor designs are limited in overall performance by bus bandwidth, and bus traffic decreases as cache effectiveness improves. Finally, multiprocessing in an environment with caches requires a mechanism to maintain memory consistency.

This research comprises a number of projects relating to cache memories. We plan to do the following:

1) develop and evaluate hardware-based cache consistency algorithms and extend their utility beyond common bus designs,
2) develop and evaluate efficient algorithms for trace-driven simulation (beyond those already known),
3) quantify the effect of cache parameter selection,
4) investigate the design and use of caches in a vector processor,
5) study the effect of bus design and its proper use to maximize performance,
6) further develop and evaluate software-based cache consistency algorithms,
7) investigate the feasibility of virtually addressed cache designs in realistic systems,
8) evaluate cache performance via hardware measurement,
9) further characterize program behavior and cache workloads,
10) develop and evaluate better cache prefetch algorithms,
11) examine possible optimizations for caches or ROMs with unchanging code (e.g., microcode caches),
12) evaluate the utility and design of instruction buffers in systems that may also have instruction caches,
13) evaluate the utility of multilevel caches as a function of technology,
14) look at the most appropriate designs for cache controller chips,
15) study TLB design,
16) study the performance of, and improvements to, set-associative cache design, and
17) look at cache design and performance in the context of new simplified instruction set architectures.
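
To make two of the terms above concrete, the following is a minimal sketch of trace-driven simulation applied to a set-associative cache, together with the standard relation between miss ratio and mean access time. It is an illustration only and not one of the algorithms developed in this research. The function names (simulate, mean_access_time), the LRU replacement policy, and all parameter values are assumptions chosen for the example.

```python
from collections import OrderedDict


def simulate(trace, num_sets, ways, line_size):
    """Return the miss ratio of a set-associative cache over an address trace.

    Assumptions (hypothetical, for illustration): LRU replacement,
    byte addresses, and every reference treated as a read.
    """
    if not trace:
        raise ValueError("trace must contain at least one address")
    # One ordered dict per set; key order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        line = addr // line_size   # memory line (block) number
        index = line % num_sets    # set selected by the low-order line bits
        tag = line // num_sets     # remaining bits identify the line in the set
        entries = sets[index]
        if tag in entries:
            entries.move_to_end(tag)         # hit: mark as most recently used
        else:
            misses += 1
            if len(entries) >= ways:
                entries.popitem(last=False)  # evict the least recently used line
            entries[tag] = True
    return misses / len(trace)


def mean_access_time(hit_time, miss_ratio, miss_penalty):
    """Mean access time = hit time + miss ratio * miss penalty (same time units)."""
    return hit_time + miss_ratio * miss_penalty


if __name__ == "__main__":
    trace = [0, 32, 0, 32]  # two addresses that collide in a direct-mapped cache
    direct = simulate(trace, num_sets=2, ways=1, line_size=16)
    two_way = simulate(trace, num_sets=1, ways=2, line_size=16)
    print(direct, two_way)
    print(mean_access_time(1, two_way, 10))
```

In this toy trace both caches hold two 16-byte lines. The direct-mapped organization misses on every reference because the two addresses map to the same set, while the two-way organization misses only on the first reference to each line. Real studies of cache parameter selection would use long traces from actual workloads rather than a four-reference example.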