Sunday 19 October 2008

Reading 7. HTM. Herlihy and Moss 1993. Transactional memory

Herlihy and Moss, ISCA 1993: Transaction memory

This paper coined the term transactional memory, and identified the use of cache mechanisms for performing optimistic synchronization to multiple memory locations in a shared-memory multiprocessor. This was in contrast to the Oklahoma Update proposal that used reservation registers instead of a transactional cache.

Programming Interface
The proposal introduces six new instructions: load-transactional, load-transactional-exclusive, store-transactional, commit, abort, and validate (all those commands are pretty self-describing). The programmer uses these instructions for lock-free data structures and was responsible for saving register state and for ensuring forward progress. Transactions is expected to be short lived and complete in one scheduling quantum.

The following code sequence demonstrates the use of the new instructions to insert an element into a doubly linked list:

// Usage of new instructions to construct data structure
typedef struct list_elem {
  struct list_elem *next;
  struct list_elem *prev;
  int value;
} entry;

entry *Head, *Tail;

void Enqueue(entry* new) {
  entry *old_tail;
  unsigned backoff = BACKOFF_MIN;
  unsigned wait;
  new->next = new->prev = NULL;
  while (TRUE) {
    old_tail = (entry*) LOAD_TRANSACTIONAL_EXCLUSIVE(&Tail);
    if (VALIDATE()) { // ensure transaction still valid
      STORE_TRANSACTIONAL(&new->prev, old_tail);
      if (old_tail == NULL) {
        STORE_TRANSACTIONAL(&Head, new);
      } else {
        STORE_TRANSACTIONAL(&old_tail->next, new);
      }

      STORE_TRANSACTIONAL(&Tail, new);
      if (COMMIT()) // try to commit
        return;
    }
    
    wait = random() % (01 << backoff);
    while (wait--);
    if (backoff < BACKOFF_MAX)
      backoff++;
  }
}

Implementation

A regular CPU is adjusted with fully-associative transactional cache just to store per-transaction data. Also processor equipped with two flags for saving transaction state. The next are not actually new: cache coherence protocol enhanced with handling new cache and flags, sniffing on other processors doings. The paper itself digs into complete description of that protocol including state machine.

As transactional cache is limited in size and can grew too large the authors proposed to use software emulation of larger cache. This allows using hardware for the common case and software for the exceptional case.

Classification

Strong or weak isolationStrong
Transaction granularityCache line
Direct or deferred updateDeferred (in cache)
Concurrency controlOptimistic
Conflict detectionEarly
Inconsistent readsYes
Conflict resolutionReceiver NACKs/Requestor software backoff
Nested transactionN/A

Materials
  • J.R. Larus, R.Rajwar Transactional Memory, 4.3.5
  • M. Herlihy and J. E. B. Moss, “Transactional memory: Architectural support for lock-free data structures,” In Proc. 20th Annu. Int. Symp. on Computer Architecture, pp. 289–300 May 1993