> Unbounded Transactional Memory > > Charles E. Leiserson > MIT Computer Science and Artificial Intelligence Laboratory > Cambridge, MA 02140 > > Transactional memory, which has been proposed as a desirable > alternative to locking, allows a program to read and modify multiple, > disparate memory locations as a single atomic operation, much as > occurs within a database transaction, but for the primary memory of a > computer system. To date, however, proposed transactional-memory > systems have provided atomic operations on only a limited number of > memory locations. We argue that hardware transactional-memory systems > should support unbounded transactions: transactions of arbitrary size > and duration. > > We have designed a hardware implementation of unbounded transactional > memory, called UTM, which exploits the common case for performance > without sacrificing correctness on transactions whose footprint can be > nearly as large as virtual memory. We have also designed and > performed a cycle-accurate simulation of a simplified and > more-practical architecture, called LTM, which is based on UTM but is > easier to implement, because it does not change the memory subsystem > outside of the processor. LTM supports nearly unbounded transactions > whose footprints are limited by physical memory size and whose > durations are limited by the length of a timeslice. > > We have assessed UTM and LTM through microbenchmarking and by > automatically converting the SPECjvm98 Java benchmarks and the Linux > 2.4.19 kernel to use transactions instead of locks. We used both > cycle-accurate simulation and instrumentation to understand benchmark > behavior. Our studies show that the common case is small transactions > that commit, even when contention is high, but that some applications > contain very large transactions. For example, although 99.9% of > transactions in the Linux study touch 54 cache lines or fewer, some > transactions touch over 8000 cache lines. Our studies also indicate > that hardware support is required, because some applications spend > over half their time in critical regions. Finally, they suggest that > hardware support for transactions can make Java programs run faster > than when run using locks and can increase the concurrency of the > Linux kernel by as much as a factor of 4 with no additional > programming work. > > This research was done jointly with C. Scott Ananian, Krste Asanovic, > Bradley C. Kuszmaul, and Sean Lie.