Disk Caching

Barbara Tockey Zivkov
(Professor A. J. Smith)
Digital Equipment, IBM, MICRO, and (NSF) CCR-91-17028

We are investigating the use of semiconductor memories as high-speed caches for disk data in computer memory hierarchies. As CPU speeds increase at a greater rate than disk I/O speeds, the so-called "access gap" widens, and maximizing CPU utilization depends more on the efficient management of disk data.

We have collected traces of disk I/O activity from computer systems with widely varying workloads, from technical computing in a research environment to production use of large databases in a commercial environment. To our knowledge, no one has amassed data from such a broad range of computing systems for analysis in this manner. We are analyzing these data and characterizing the I/O activity in order to compare the behavior of these apparently very different systems with respect to properties such as locality of reference and sequentiality.

We are using trace-driven simulation to study the effectiveness of common disk caching policies and cache configurations, reporting results by such measures as miss ratio and traffic ratio. We have simulated well-known disk caching approaches such as least recently used and working set. We have also simulated previously proposed policies that, for example, dynamically determine the amount to pre-fetch, and we have extended these policies, for example, by determining the pre-fetch amount on a per-file, per-transaction, and per-file/transaction-pair basis. We have simulated the effects of limiting a transaction's allocation of cache buffers and have investigated the effects of write policy on the traffic ratios.
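
As an illustration of the kind of trace-driven simulation described above, the following is a minimal sketch, not the simulator used in this project. It replays a toy trace through a least-recently-used cache and reports a miss ratio and a traffic ratio under a write-back or write-through policy. The function name `simulate_lru`, the `(op, block)` trace format, and the definitions used here are assumptions made for the example: every reference touches exactly one whole block, a write miss allocates a buffer without fetching from disk, and the traffic ratio is taken as disk transfers with the cache divided by the transfers an uncached system would make (one per reference).

```python
from collections import OrderedDict


def simulate_lru(trace, capacity, write_back=True):
    """Replay (op, block) references through an LRU cache of `capacity` blocks.

    op is "R" or "W". Returns (miss_ratio, traffic_ratio) under the
    illustrative definitions stated in the accompanying text.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    if not trace:
        raise ValueError("trace must not be empty")

    cache = OrderedDict()  # block -> dirty flag, least recently used first
    misses = 0
    disk_transfers = 0

    for op, block in trace:
        if block in cache:
            cache.move_to_end(block)  # hit: mark as most recently used
        else:
            misses += 1
            if op == "R":
                disk_transfers += 1  # read miss: fetch the block from disk
            if len(cache) >= capacity:
                _, dirty = cache.popitem(last=False)  # evict the LRU block
                if dirty:
                    disk_transfers += 1  # write the dirty victim back
            cache[block] = False

        if op == "W":
            if write_back:
                cache[block] = True  # defer the disk write
            else:
                disk_transfers += 1  # write-through: update disk now

    # Assumption: dirty blocks left at the end of the trace are flushed.
    disk_transfers += sum(1 for dirty in cache.values() if dirty)

    references = len(trace)
    return misses / references, disk_transfers / references


if __name__ == "__main__":
    # Hypothetical seven-reference trace, for illustration only.
    toy_trace = [("R", 1), ("R", 2), ("R", 1), ("W", 3),
                 ("R", 2), ("W", 3), ("R", 1)]
    for policy in (True, False):
        miss, traffic = simulate_lru(toy_trace, capacity=2, write_back=policy)
        name = "write-back" if policy else "write-through"
        print(f"{name}: miss ratio {miss:.3f}, traffic ratio {traffic:.3f}")
```

In this sketch the write policy changes only the traffic ratio, not the miss ratio, because both policies allocate a buffer on a write miss. A fuller simulator would also model request sizes, pre-fetching, and per-transaction buffer limits.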