Sector Cache Design and Performance
Jeffrey Rothman
(Professor A. J. Smith)
Mitsubishi Electric, MICRO, (NSF) MIP-91-16578, Philips Laboratories/Signetics, and Sun Microsystems

Shared memory multiprocessors offer a relatively low-cost means of achieving high (aggregate) performance. The most difficult hardware problem in such machines is to provide sufficient bandwidth to the shared memory while ensuring cache consistency (coherency). Sector caches may help limit the bandwidth each processor requires, and they also reduce tag storage. A sector cache is a cache in which the block is composed of several sub-blocks, which are not required to be simultaneously present in the cache. By carefully using sector caches, it may be possible to ease the memory bandwidth requirements of the system and simultaneously reduce the network traffic necessary for cache consistency. Sector caches have already been shown to reduce memory traffic in single-processor systems. They also have another advantage: each address tag can be associated with a much larger line (block), without requiring that the entire line be transferred on a miss.

We are in the middle phase of a project to study a variety of issues in multiprocessor shared memory cache design, along with issues related to sector caches. We have implemented a mechanism to generate address traces for multiprocessor applications using a shared memory model of programming.
We are currently using these traces to study a number of issues: 1) the performance to be expected from sector caches in a uniprocessor environment; 2) improved designs for sector caches; 3) efficient algorithms for maintaining consistency in shared memory multiprocessors; 4) the effect of programming style and reprogramming on the effectiveness of consistency algorithms; 5) the level of data sharing and consistency traffic that can be expected in "typical" applications; 6) designs for maintaining consistency when there is no shared bus; and 7) how the use of a sector cache can provide a solution to the false sharing problem and also reduce bandwidth requirements.
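
The sector cache organization described above (one address tag per sector, with sub-blocks that need not all be present) can be illustrated with a small trace-driven model. This is only a minimal sketch under stated assumptions, not the simulator used in the project: the class name SectorCache, its parameters, direct-mapped placement, and the policy of fetching only the referenced sub-block on a miss are all hypothetical choices made for illustration.

```python
class SectorCache:
    """Hypothetical direct-mapped sector cache model.

    Each frame holds one address tag for a whole sector and one valid
    bit per sub-block. Assumption: a miss fetches only the referenced
    sub-block, never the whole sector.
    """

    def __init__(self, num_sectors, subblocks_per_sector, subblock_bytes):
        self.num_sectors = num_sectors
        self.subblocks = subblocks_per_sector
        self.subblock_bytes = subblock_bytes
        self.sector_bytes = subblocks_per_sector * subblock_bytes
        self.tags = [None] * num_sectors  # None: frame is empty
        self.valid = [[False] * subblocks_per_sector for _ in range(num_sectors)]
        self.misses = 0
        self.bytes_fetched = 0

    def access(self, addr):
        """Reference one byte address; return True on a hit."""
        sector_no = addr // self.sector_bytes
        frame = sector_no % self.num_sectors
        tag = sector_no // self.num_sectors
        sub = (addr % self.sector_bytes) // self.subblock_bytes

        if self.tags[frame] != tag:
            # Sector miss: claim the frame and invalidate every sub-block.
            self.tags[frame] = tag
            self.valid[frame] = [False] * self.subblocks

        if self.valid[frame][sub]:
            return True

        # Sub-block miss: transfer only the sub-block that was referenced.
        self.valid[frame][sub] = True
        self.misses += 1
        self.bytes_fetched += self.subblock_bytes
        return False


def run_trace(cache, trace):
    """Feed a list of byte addresses to the cache; return the hit/miss list."""
    return [cache.access(addr) for addr in trace]


if __name__ == "__main__":
    trace = [0, 4, 16, 0, 256, 0]  # made-up addresses, not a real trace
    sector = SectorCache(num_sectors=4, subblocks_per_sector=4, subblock_bytes=16)
    plain = SectorCache(num_sectors=4, subblocks_per_sector=1, subblock_bytes=64)
    run_trace(sector, trace)
    run_trace(plain, trace)
    print("sector cache:", sector.misses, "misses,", sector.bytes_fetched, "bytes")
    print("64-byte blocks:", plain.misses, "misses,", plain.bytes_fetched, "bytes")
```

Setting subblocks_per_sector to 1 turns the same model into a conventional cache whose block equals the sector, which gives a baseline for comparison. On the toy trace above both configurations use four address tags, but the sector cache takes more misses while moving fewer bytes; this is the miss-ratio versus traffic trade-off that issue 1) in the list above sets out to measure on real traces.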