<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss'><id>tag:blogger.com,1999:blog-4500206154953507248</id><updated>2009-03-01T08:29:12.625+03:00</updated><title type='text'>Transactional Memory 101</title><subtitle type='html'>Experimental Transactional Memory seminar at Math-Mech SPbSU</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Mitya</name><uri>http://www.blogger.com/profile/07764279855846194407</uri><email>noreply@blogger.com</email></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>15</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-4026670442262379512</id><published>2008-10-19T19:13:00.009+04:00</published><updated>2008-10-21T03:23:17.885+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='readings'/><title type='text'>Reading 7. HTM. Herlihy and Moss 1993. Transactional memory</title><content type='html'>&lt;span style="font-weight: bold;font-size:130%;" &gt;Herlihy and Moss, ISCA 1993: Transaction memory&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This paper coined the term &lt;span style="font-style: italic;"&gt;transactional memory&lt;/span&gt;, and identified the use of cache mechanisms for performing optimistic synchronization to multiple memory locations in a shared-memory multiprocessor. This was in contrast to the Oklahoma Update proposal that used reservation registers instead of a transactional cache.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Programming Interface&lt;/span&gt;&lt;br /&gt;The proposal introduces six new instructions: &lt;span style="font-style: italic;"&gt;load-transactional&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;load-transactional-exclusive&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;store-transactional&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;commit&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;abort&lt;/span&gt;, and &lt;span style="font-style: italic;"&gt;validate &lt;/span&gt;(all those commands are pretty self-describing). The programmer uses these instructions for lock-free data structures and was responsible for saving register state and for ensuring forward progress. Transactions is expected to be short lived and complete in one scheduling quantum.&lt;br /&gt;&lt;br /&gt;The following code sequence demonstrates the use of the new instructions to insert an element into a doubly linked list:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;// Usage of new instructions to construct data structure&lt;br /&gt;typedef struct list_elem {&lt;br /&gt;&amp;nbsp;&amp;nbsp;struct list_elem *next;&lt;br /&gt;&amp;nbsp;&amp;nbsp;struct list_elem *prev;&lt;br /&gt;&amp;nbsp;&amp;nbsp;int value;&lt;br /&gt;} entry;&lt;br /&gt;&lt;br /&gt;entry *Head, *Tail;&lt;br /&gt;&lt;br /&gt;void Enqueue(entry* new) {&lt;br /&gt;&amp;nbsp;&amp;nbsp;entry *old_tail;&lt;br /&gt;&amp;nbsp;&amp;nbsp;unsigned backoff = BACKOFF_MIN;&lt;br /&gt;&amp;nbsp;&amp;nbsp;unsigned wait;&lt;br /&gt;&amp;nbsp;&amp;nbsp;new-&gt;next = new-&gt;prev = NULL;&lt;br /&gt;&amp;nbsp;&amp;nbsp;while (TRUE) {&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;old_tail = (entry*) LOAD_TRANSACTIONAL_EXCLUSIVE(&amp;amp;Tail);&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if (VALIDATE()) { // ensure transaction still valid&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;STORE_TRANSACTIONAL(&amp;amp;new-&gt;prev, old_tail);&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if (old_tail == NULL) {&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;STORE_TRANSACTIONAL(&amp;amp;Head, new);&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;} else {&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;STORE_TRANSACTIONAL(&amp;amp;old_tail-&gt;next, new);&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;}&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;STORE_TRANSACTIONAL(&amp;amp;Tail, new);&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if (COMMIT()) // try to commit&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;}&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;wait = random() % (01 &lt;&lt; backoff);&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;while (wait--);&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if (backoff &lt; BACKOFF_MAX)&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;backoff++;&lt;br /&gt;&amp;nbsp;&amp;nbsp;}&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;&lt;/code&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Implementation&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;A regular CPU is adjusted with fully-associative transactional cache just to store per-transaction data. Also processor equipped with two flags for saving transaction state. The next are not actually new: cache coherence protocol enhanced with handling new cache and flags, sniffing on other processors doings. The paper itself digs into complete description of that protocol including state machine.&lt;br /&gt;&lt;br /&gt;As transactional cache is limited in size and can grew too large the authors proposed to use software emulation of larger cache. This allows using hardware for the common case and software for the exceptional case.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Classification&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Strong or weak isolation&lt;/td&gt;&lt;td&gt;Strong&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transaction granularity&lt;/td&gt;&lt;td&gt;Cache line&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Direct or deferred update&lt;/td&gt;&lt;td&gt;Deferred (in cache)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Concurrency control&lt;/td&gt;&lt;td&gt;Optimistic&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict detection&lt;/td&gt;&lt;td&gt;Early&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Inconsistent reads&lt;/td&gt;&lt;td&gt;Yes&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict resolution&lt;/td&gt;&lt;td&gt;Receiver NACKs/Requestor software backoff&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nested transaction&lt;/td&gt;&lt;td&gt;N/A&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Materials&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.morganclaypool.com/doi/abs/10.2200/S00070ED1V01Y200611CAC002"&gt;J.R. Larus, R.Rajwar &lt;/a&gt;&lt;span style="font-style: italic;"&gt;&lt;a href="http://www.morganclaypool.com/doi/abs/10.2200/S00070ED1V01Y200611CAC002"&gt;Transactional Memory&lt;/a&gt;&lt;/span&gt;&lt;span&gt;, 4.3.5&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;M. Herlihy and J. E. B. Moss, “Transactional memory: Architectural support for lock-free data structures,” In Proc. 20th Annu. Int. Symp. on Computer Architecture, pp. 289–300 May 1993&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-4026670442262379512?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/4026670442262379512/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=4026670442262379512' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/4026670442262379512'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/4026670442262379512'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/10/reading-7-htm-herlihy-and-moss-1993.html' title='Reading 7. HTM. Herlihy and Moss 1993. Transactional memory'/><author><name>Leonid Shalupov</name><uri>http://www.blogger.com/profile/13920834345027590432</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='04513419942137225811'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-170510689153892286</id><published>2008-05-30T02:34:00.076+04:00</published><updated>2008-06-01T10:38:09.725+04:00</updated><title type='text'>Reading 8. Concurrent Haskell and HSTM</title><content type='html'>&lt;span style="font-size:130%;"&gt;&lt;span style="font-weight: bold;"&gt;1. Gentle introduction to IO Monad in Haskell&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Our following talk about model of Software Transactional Memory in Haskell will be meaningless without discussing concepts of concurrent Haskell.&lt;br /&gt;Main ideas of concurrent Haskell were described in paper &lt;span style="font-weight: bold; font-style: italic;"&gt;Concurrent Haskell&lt;/span&gt; by &lt;span style="font-style: italic;"&gt;Simon Peyton Jones, Andrew Gordon&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;Sigbjorn Finne&lt;/span&gt;. In actual fact concurrent Haskell is simple extension of pure lazy-evaluated functional Haskell language. It adds two main new ingredients to Haskell:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;processes, and a mechanism for process initiation&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;atomically-mutable state, to support inter-process communication and cooperation&lt;/li&gt;&lt;/ul&gt;Following the tradition it may seem strange to talk about some concurrency in pure functional languages like Haskell is because of concurrency concepts suppose existence of mutable entities and states. The most common way to emulate mutable state in Haskell is to wrap our computation result into Monad entity. From this point of view our program become a sequence of new monads creation. This approach helps also to emulate strict sequence of computations which is non-obvious in lazy-evaluated Haskell.&lt;br /&gt;Let's recall what Monad is. Speaking strictly, &lt;span style="font-weight: bold; font-style: italic;"&gt;Monad&lt;/span&gt; is tuple &lt;span style="font-style: italic;"&gt;(M, return, &gt;&gt;=) &lt;/span&gt;where M is type designator, return is operator to wrap our computation into monad entity and &gt;&gt;= or bind operator represents monadic evaluations themselves. Their types are correspondingly&lt;br /&gt;&lt;code&gt;&lt;br /&gt;type Monad = M a&lt;br /&gt;(&gt;&gt;=) :: M a → (a → M b) → M b&lt;br /&gt;return :: a → M a&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;In some it is convenient to use reduced bind operator:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;(&gt;&gt;) :: M a → M b → M b&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;In other words it ignores value wrapped into its first argument.&lt;br /&gt;Most common examples of monads in Haskell are [], Maybe and IO. The last is most interesting for our purposes. In a non-strict language it is completely impractical to perform input/output using “side-effecting functions", because the order in which sub-expressions are evaluated | and indeed whether they are evaluated at all | is determined by the context in which the result of the expression is used, and hence is hard to predict. This difficulty can be addressed by treating an I/O-performing computation as a state transformer; that is, a function that transforms the current state of the world to a new state. In addition, we need the ability for an I/O-performing computation to return a result. This reasoning leads to the following type definition:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;type IO a = World -&gt; (a, World)&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;That is, a value of type IO t takes a world state as input and delivers a modified world state together with a value of type t. Of course, the implementation performs the I/O right away - thereby modifying the state of the world “in place".                                              &lt;br /&gt;We call a value of type IO t an action. Here are two useful complete actions:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;hGetChar :: Handle -&gt; IO Char&lt;br /&gt;hPutChar :: Handle -&gt; Char -&gt; IO ()&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;The action &lt;span style="font-style: italic;"&gt;hGetChar&lt;/span&gt; reads a character from the specified handle (which identifies some file or other byte stream) and returns it as the result of the action. &lt;span style="font-style: italic;"&gt;hPutChar&lt;/span&gt; takes a handle and a character and return an action that writes the character to the specified stream.&lt;br /&gt;Actions can be combined in sequences using infix combinators &gt;&gt; and &gt;&gt;= described above. For example here is an action that reads a character from the standard input, and then prints it twice to the standard output:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;hGetChar stdin        &gt;&gt;= \c -&gt;&lt;br /&gt;hPutChar stdout c    &gt;&gt;&lt;br /&gt;hPutChar stdout c&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;The notation &lt;span style="font-style: italic;"&gt;\c-&gt;E&lt;/span&gt;, for some expression E, denotes a lambda abstraction. In Haskell, the scope of a lambda abstraction extends as far to the right as possible; in this example the body of the \c-abstraction includes everything after the \c. The sequencing combinators, &gt;&gt; and &gt;&gt;=, feed the result state of their left hand argument to the input of their right hand argument, thereby forcing the two actions (via the data dependency) to be performed in the correct order. The combinator &gt;&gt; throws away the result of its first argument, while &gt;&gt;= takes the result of its first argument and passes it on to its second argument. The similarity of monadic I/O-performing programs to imperative programs is no surprise: when performing I/O we specifically want to impose a total order on I/O operations.&lt;br /&gt;It is often also useful to have an action that performs no I/O, and immediately returns a specified value using return operator. For example, an echo action that reads a character, prints it and returns the character read, might look like this:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;pre&gt;&lt;br /&gt;echo :: IO Char&lt;br /&gt;echo = hGetChar stdin    &gt;&gt;= \c →&lt;br /&gt;   hPutChar stdout        &gt;&gt;&lt;br /&gt;   return c&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;So, the resulting program which can be compiled to do something might look like&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;pre&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = echo   &gt;&gt;= \c -&gt;&lt;br /&gt;if c == '\n'&lt;br /&gt;then return ()&lt;br /&gt;else main&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;In principle, then, a program is just a state transformer that is applied to the real world to give a new world. In practice, however, it is crucial that the side-effects the program specifies are performed incrementally, and not all at once when the program finishes.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-weight: bold;"&gt;2. Processes in Haskell&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Concurrent Haskell provides a new primitive &lt;span style="font-style: italic;"&gt;forkIO&lt;/span&gt;, which starts a concurrent process:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;forkIO :: IO () -&gt; IO ()&lt;br /&gt;   &lt;/code&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;forkIO&lt;/span&gt; is an action which takes an action, a, as its argument and spawns a concurrent process to perform that action. The I/O and other side effects performed by a are interleaved in an unspecified fashion with those that follow the &lt;span style="font-style: italic;"&gt;forkIO&lt;/span&gt;. Here's an example:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;pre&gt;&lt;br /&gt;let&lt;br /&gt;-- loop ch prints an infinite sequence of ch's&lt;br /&gt;loop ch = hPutChar stdout ch &gt;&gt; loop ch&lt;br /&gt;in&lt;br /&gt;forkIO (loop 'a')         &gt;&gt;&lt;br /&gt;loop 'z'&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The &lt;span style="font-style: italic;"&gt;forkIO&lt;/span&gt; spawns a process which perform the action loop 'a'. Meanwhile, the “parent“  process continues on to perform loop 'z'. The result is than an infinite sequence of interleaved 'a's and 'z's appears on the screen. The exact interleaving is unspecified.&lt;br /&gt;As a more realistic example of &lt;span style="font-style: italic;"&gt;forkIO&lt;/span&gt; in action a mail tool night incorporate the following loop:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;pre&gt;&lt;br /&gt;mailLoop&lt;br /&gt;= getButtonPress b &gt;&gt;= \ v -&gt;&lt;br /&gt;case v of&lt;br /&gt;Compose -&gt; forkIO doCompose &gt;&gt;&lt;br /&gt;                         mailLoop&lt;br /&gt;...other things&lt;br /&gt;doCompose :: IO ()        -- Pop up and manage&lt;br /&gt;doCompose = ...           -- composition window&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Here, &lt;span style="font-style: italic;"&gt;getButtonPress&lt;/span&gt; is very like &lt;span style="font-style: italic;"&gt;hGetChar&lt;/span&gt;; it awaits the next button press on button b, and then delivers a value indicating which button was pressed. This value is then treated by the case expression. If its value is &lt;span style="font-style: italic;"&gt;Compose&lt;/span&gt;, then the action &lt;span style="font-style: italic;"&gt;doCompose&lt;/span&gt; is forked to handle an independent composition window, while the main process continues with the next &lt;span style="font-style: italic;"&gt;getButtonPress&lt;/span&gt;.&lt;br /&gt;There are some interesting questions related to concurrency in Haskell.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;1.&lt;/span&gt; Let's imagine that we have to evaluate value named 'c'. In common Haskell this value is represented internally by pointer to some closure which will be called and evaluate this value when someone will need in it. That's the famous Haskell's laziness.&lt;br /&gt;Now each of concurrent processes may need in this value. Then the first who provoked 'c''s evaluation replaces pointer to 'c''s closure to some temporary object named thunk. Thunk indicates that value 'c' is currently evaluating. Other processes wait until evaluation ends.&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;&lt;br /&gt;2. &lt;/span&gt;Since the parent and child processes may both mutate (parts of) the same shared state (namely, the world),  &lt;span style="font-style: italic;"&gt;forkIO&lt;/span&gt; immediately introduces non-determinism. For  example, if one process decides to read a file, and the other deletes it, the effect of running the program will  be unpredictable. While this non-determinism is not desirable, it is not avoidable; indeed, every concurrent language is non-deterministic. The only way to enforce determinism would be by somehow constraining the two processes to work on separate parts of the state  (different files, in our example). The trouble is that essentially all the interesting applications of concurrency involve the deliberate and controlled mutation of shared state, such as screen real estate, the file system, or the internal data structures of the program. The right solution, therefore, is to provide mechanisms which allow  (though alas they cannot enforce) the safe mutation of  shared state  section.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;3. &lt;/span&gt;&lt;span style="font-style: italic;"&gt;forkIO&lt;/span&gt; is asymmetrical: when a process executes a &lt;span style="font-style: italic;"&gt;forkIO&lt;/span&gt;, it spawns a child process that executes concurrently with the continued execution of the parent. It would have been possible to design a symmetrical fork, an approach taken by Jones &amp;amp; Hudak:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;symFork :: IO a -&gt; IO b -&gt; IO (a,b)&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;The idea here is &lt;span style="font-style: italic;"&gt;symFork p1 p2&lt;/span&gt; is an action that forks  two processes, p1 and p2. When both complete, the symFork pairs their results together and returns this pair as its result. We rejected this approach because it forces us to synchronize on the termination of the forked process. If the desired behavior is that the forked process lives as long as it desires, then we have to provide the whole of the rest of the parent as the other argument to symFork, which is extremely inconvenient.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-weight: bold;"&gt;3. Synchronization and communication&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;To make our processes interact with each other and organize synchronization between them we introduce spectial type&lt;br /&gt;&lt;code&gt;&lt;br /&gt;type MVar a&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;This is simple memory cell which can contain any value of type a or be empty. We define following primitive operations on &lt;span style="font-style: italic;"&gt;MVars&lt;/span&gt;:&lt;br /&gt;                   &lt;code&gt;&lt;br /&gt;newMVar :: IO (MVar a)&lt;/code&gt;              creates a new MVar.&lt;br /&gt;&lt;code&gt;&lt;br /&gt;takeMVar :: MVar a -&gt; IO a&lt;br /&gt;&lt;/code&gt; blocks until the location is non-empty, then reads and returns the value, leaving the location empty.&lt;br /&gt;&lt;code&gt;&lt;br /&gt;putMVar :: MVar a -&gt; a -&gt; IO ()&lt;/code&gt;     writes a value into the specified location. If there are one or more  processes blocked in &lt;span style="font-style: italic;"&gt;takeMVar&lt;/span&gt; on that location, one is thereby allowed to proceed. It is an error to perform &lt;span style="font-style: italic;"&gt;putMVar&lt;/span&gt; on a location which already contains a value.&lt;br /&gt;&lt;br /&gt;Notice that &lt;span style="font-style: italic;"&gt;MVar &lt;/span&gt;can be considered in different ways:&lt;br /&gt;as a channel for messages exchange between processes&lt;br /&gt;&lt;span style="font-style: italic;"&gt;MVar (&lt;/span&gt;) is simple semaphor, &lt;span style="font-style: italic;"&gt;putMVar&lt;/span&gt; denotes rising and &lt;span style="font-style: italic;"&gt;takeMVar&lt;/span&gt; is sinking&lt;br /&gt;&lt;br /&gt;With &lt;span style="font-style: italic;"&gt;MVar&lt;/span&gt; we now can solve simple problem of “Producer and Customer” in case when Producer produces faster than customer can take. For this we'll make buffered slot &lt;span style="font-style: italic;"&gt;CVar&lt;/span&gt; for Customer to take from.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;pre&gt;&lt;br /&gt;type CVar a = (MVar a,                 -- Producer -&gt; consumer&lt;br /&gt;         MVar ())                                              -- Consumer -&gt; producer&lt;br /&gt;&lt;br /&gt;newCVar :: IO (CVar a)&lt;br /&gt;newCVar&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;  &lt;span style="color: rgb(0, 0, 153);"&gt;= newMVar                            &gt;&gt;= \ data_var -&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;newMVar                            &gt;&gt;= \ ack_var -&gt;&lt;br /&gt;putMVar ack_var (    )        &gt;&gt;&lt;br /&gt;return (data_var, ack_var)&lt;br /&gt;&lt;br /&gt;putCVar :: CVar a -&gt; a -&gt; IO ()&lt;br /&gt;putCVar (data_var,ack_var) val&lt;br /&gt;= takeMVar ack_var &gt;&gt;&lt;br /&gt; putMVar data_var val&lt;br /&gt;&lt;br /&gt;getCVar :: CVar a -&gt; IO a&lt;br /&gt;getCVar (data_var,ack_var)&lt;br /&gt;= takeMVar data_var      &gt;&gt;= \ val -&gt;&lt;br /&gt; putMVar ack_var ()     &gt;&gt;&lt;br /&gt; return val&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-weight: bold;"&gt;4. Haskell Software transactional Memory (HSTM)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Implementation of transactional memory in Haskell resembles IO abstraction. It introduces terms of special monadic type which represent atomic blocks. It also adds special mechanism to compose two transaction as alternatives. Main characteristics of HSTM look like this:&lt;br /&gt;&lt;br /&gt;&lt;table style="color: rgb(0, 0, 0);" border="1"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Strong or Weak Isolation&lt;/td&gt;         &lt;td&gt;Strong&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transaction Granularity&lt;/td&gt;          &lt;td&gt;Word&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Direct or Deferred Update&lt;/td&gt;        &lt;td&gt;Deferred (cloned replacement)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Concurrency Control&lt;/td&gt;              &lt;td&gt;Optimistic&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Synchronization&lt;/td&gt;                  &lt;td&gt;Blocking&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict Detection&lt;/td&gt;               &lt;td&gt;Late&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Inconsistent Reads&lt;/td&gt;               &lt;td&gt;Inconsistency toleration&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict Resolution&lt;/td&gt;              &lt;td&gt;None&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nested Transaction&lt;/td&gt;               &lt;td&gt;None (not allowed by type system)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Exceptions&lt;/td&gt;                       &lt;td&gt;Abort&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;So, we add new type of terms into language – &lt;span style="font-style: italic;"&gt;STM&lt;/span&gt; monad. It has semantics similar to IO monad but it  “marks” terms which will be treated as atomic blocks. To wrap them into habitual atomic block we use function atomic.&lt;br /&gt;&lt;code&gt;&lt;br /&gt;atomic :: STM a → STM a&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;By analogy with &lt;span style="font-style: italic;"&gt;MVar&lt;/span&gt; type in plain concurrent Haskell we introduce also &lt;span style="font-style: italic;"&gt;TVar&lt;/span&gt; type (transactional) to represent values stable against transactional operations.&lt;br /&gt;&lt;code&gt;&lt;br /&gt;type TVar a&lt;br /&gt;readTVar :: TVar a → STM a&lt;br /&gt;writeTVar :: TVar a → a → STM()&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;For example let's consider such variable containing some integer values and operations on it.&lt;br /&gt;&lt;code&gt;&lt;br /&gt;type Resource = TVar Int&lt;br /&gt;putR :: Resource -&gt; Int -&gt; STM ()&lt;br /&gt;putR r i = do { v &lt;- readTVar r; writeTVar r (v+i) } &lt;/code&gt;&lt;br /&gt;The atomic function transactionally committed (or aborted) these updates:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;main = do { ...; atomic (putR r 3); ... }&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;HSTM also introduced an explicit &lt;span style="font-style: italic;"&gt;retry&lt;/span&gt; statement as a coordination mechanism between transactions. The &lt;span style="font-style: italic;"&gt;retry&lt;/span&gt; statement aborts the current transaction and prevents it from  reexecuting until at least one of the &lt;span style="font-style: italic;"&gt;TVars&lt;/span&gt; accessed by the transaction changes value. For  example,&lt;br /&gt;&lt;code&gt;&lt;pre&gt;&lt;br /&gt;getR :: Resource → Int → STM ()&lt;br /&gt;getR r i = do { v &lt;- readTVar r&lt;br /&gt;                  ; if (v &lt; i) then retry&lt;br /&gt;                     else writeTVar r (v-i) }&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;atomically extracts i units from a &lt;span style="font-style: italic;"&gt;Resource&lt;/span&gt;. It uses a retry statement to abort an enclosing transaction if the &lt;span style="font-style: italic;"&gt;Resource&lt;/span&gt; does not contain enough units. If this function executes retry, r is the only &lt;span style="font-style: italic;"&gt;TVar&lt;/span&gt; read, so the transaction re-executes when r changes value.&lt;br /&gt;HSTM also introduced the binary &lt;span style="font-style: italic;"&gt;orElse&lt;/span&gt; operator for composing two transactions. This operator first starts its left-hand transaction. If this transaction commits, the &lt;span style="font-style: italic;"&gt;orElse&lt;/span&gt; operator finishes. However, if this transaction retries, the operator tries the right-hand transaction instead. If this one commits, the &lt;span style="font-style: italic;"&gt;orElse&lt;/span&gt; operator finishes. If it retries, the entire &lt;span style="font-style: italic;"&gt;orElse&lt;/span&gt; statement waits for changes in the set of &lt;span style="font-style: italic;"&gt;TVars&lt;/span&gt; read by both transactions before retrying. For example, this operator turns &lt;span style="font-style: italic;"&gt;getR&lt;/span&gt; into an operation that returns a true/false success/failure result:&lt;br /&gt;&lt;code&gt;&lt;pre&gt;&lt;br /&gt;nonBlockGetR :: Resource -&gt; Int -&gt;STM Bool&lt;br /&gt;nonBlockGetR r i = do { getR r i ; return True }‘orElse‘ return false&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Notice, that retry operator “retries” largest enclosing term which has STM type.&lt;br /&gt;Some words about implementation. Al transaction treads and writes to &lt;span style="font-style: italic;"&gt;TVars&lt;/span&gt; deal with special transactional log which hides these variables references form other transactions. When the transaction commits, it first validates its log entries, to ensure that no other transaction modified the &lt;span style="font-style: italic;"&gt;TVars&lt;/span&gt; values. If valid, the transaction installs the new values in these variables. If validation fails, the log is discarded and the transaction re-executed.&lt;br /&gt;If a transaction invokes retry, the transaction is validated (to avoid retries caused by inconsistent execution) and the log discarded after recording all &lt;span style="font-style: italic;"&gt;TVars&lt;/span&gt; read by the transaction. The system binds the transaction’s thread to each of these variables. When a transaction updates one of these variables, it also restarts the thread, which re-executes the transaction.&lt;br /&gt;The &lt;span style="font-style: italic;"&gt;orElse&lt;/span&gt; statement requires a closed nested transaction to surround each of the two alternatives, so that either one can abort without terminating the surrounding transaction. If either transaction completes successfully, its log is merged with the surrounding transaction’s log, which can commit. If either or both transactions invoke retry, the outer transaction waits on the union of the &lt;span style="font-style: italic;"&gt;TVars&lt;/span&gt; read by the transactions that retried.&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;5. Garbage Collection in Concurrent Haskell&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;At the end it would be good to tell some words about garbage collection in Concurrent Haskell. It's interesting problem to collect processes which become “useless”. There is obvious strategy to do it if we ensure, that process we want to collect will not have “side effects” further. We can formulate two rules to do it:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Running process cannot be collected&lt;/li&gt;&lt;li&gt;We can collect process which holds some MVar if this variable is now inaccessible for any other non-garbage process.&lt;/li&gt;&lt;/ol&gt;At last classic “mark-and-sweep” tracing procedure can be implemented on processes:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;When tracing accessible heap objects, treat all runnable processes as roots.&lt;/li&gt;&lt;li&gt;When some MVar is identified as reachable, identify all processes blocked by it as reachable ones..&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-weight: bold;"&gt;List of used papers:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Simon Peyton Jones, Andrew Gordon, Sigbjorn Finne &lt;span style="font-style: italic;"&gt;“Concurrent Haskell”&lt;/span&gt;&lt;/li&gt;&lt;li&gt;Paul Hudak, Simon Peyton Jones, Philip Wadler, Brian Boutel, Jon Fairbairn, Joseph Fasel, María M. Guzmán, Kevin Hammond, John Hughes, Thomas Johnsson, Dick Kieburtz, Rishiyur Nikhil, Will Partain, John Peterson &lt;span style="font-style: italic;"&gt;“Report on the programming language Haskell: a non-strict, purely functional language version 1.2”&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p style="margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-170510689153892286?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/170510689153892286/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=170510689153892286' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/170510689153892286'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/170510689153892286'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/05/reading-8-concurrent-haskell-and-hstm.html' title='Reading 8. Concurrent Haskell and HSTM'/><author><name>Ilya</name><uri>http://www.blogger.com/profile/09516686132153763780</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='15541609664562111634'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-5316449193361920839</id><published>2008-05-28T19:50:00.005+04:00</published><updated>2008-10-27T19:15:54.655+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='readings'/><title type='text'>Reading 7. HTM. Multiple atomic read-write operations</title><content type='html'>&lt;span style="font-size:130%;"&gt;&lt;span style="font-weight: bold;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Stone et al., IEEE Concurrency 1993: Multiple atomic read-write operations ("&lt;span style="font-style: italic;"&gt;Oklahoma Update&lt;/span&gt;")&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Let's extend compare&amp;amp;swap command set to support multiple memory locations.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;read-and-reserve&lt;/span&gt; - reads a memory location into a speciﬁed general-purpose register, places a reservation on the location’s address in the reservation register, and clears the reservation register’s data ﬁeld.&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;store-contingent&lt;/span&gt; - locally updates the reservation register’s data ﬁeld without obtaining write permissions.&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;write-if-reserved&lt;/span&gt; - speciﬁes a set of reservation registers and updates the memory locations reserved by those registers. It is used to initiate the commit process. It attempts to obtain exclusive ownership for each of the addresses in the reservation registers. If the reservations remain valid during this process, the instruction updates memory with the modiﬁed data from the reservation registers. The instruction returns an indication whether the update succeeded or not.&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Example&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Having such commands we can write i.e. synchronized queue:&lt;br /&gt;&lt;pre&gt;void Enqueue(newpointer) {&lt;br /&gt; Memory[newpointer].next = NULL;&lt;br /&gt; status = 0;&lt;br /&gt;&lt;br /&gt; while (!status) {&lt;br /&gt;   last_pointer = Read_and_Reserve(Memory[tail].next, reservation1);&lt;br /&gt;   if (last_pointer == NULL) {&lt;br /&gt;     // this is an empty queue&lt;br /&gt;     first_pointer = Read_and_Reserve(Memory[head].next, reservation2);&lt;br /&gt;     Store_Contingent(newpointer, reservation1);&lt;br /&gt;     Store_Contingent(newpointer, reservation2);&lt;br /&gt;     status = Write_If_Reserved(reservation1, reservation2);&lt;br /&gt;   } else {&lt;br /&gt;     // non-empty queue&lt;br /&gt;     temp_pointer = Read_and_Reserve(Memory[last_pointer].next, reservation2);&lt;br /&gt;     Store_Contingent(newpointer, reservation1);&lt;br /&gt;     Store_Contingent(newpointer, reservation2);&lt;br /&gt;     status = Write_If_Reserved(reservation1, reservation2);&lt;br /&gt;   }&lt;br /&gt; } // repeat until successful&lt;br /&gt;&lt;br /&gt; return;&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Implementation&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Compared to compare&amp;amp;swap decomposition, hardware implementation for Oklahoma Update is much more complex.&lt;br /&gt;Instead of one &lt;span style="font-style: italic;"&gt;xa-address&lt;/span&gt; register this implementation requires number of reservation registers per CPU to store memory locations and various flags. Those are places&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;read-and-reserve &lt;/span&gt;&lt;span&gt;and&lt;/span&gt; &lt;span style="font-weight: bold;"&gt;store-contingent&lt;/span&gt;&lt;span&gt; writes to&lt;/span&gt;. &lt;span&gt;And &lt;/span&gt;&lt;span style="font-weight: bold;"&gt;write-if-reserved&lt;/span&gt; can be called commit operation since it commits all local writes to shared memory.&lt;br /&gt;&lt;br /&gt;Basically, commit operation consists of two phases:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Requesting write permissions&lt;/span&gt;. CPU ensures having exclusive write lock to every location from reservation registers. To avoid deadlocking to commit operations from other processors, CPU acquires locks in address-ascending manner.&lt;br /&gt;&lt;br /&gt;If it can't obtain every one lock entire conflict resolution takes place. CPU may try to restart the process after some time or abort the operation.&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Commiting data values&lt;/span&gt;. As soon as processor has locks it starts uninterruptable operation to write data.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;Interactions between the new instructions and regular store operations introduce forward-progress concerns. Regular stores do not participate in the new instructions’ conflict resolution mechanism. If a regular store from one processor conflicted with an address specified in a reservation register of another processor, this processor would abort its update. See more on this in the original paper.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Classification&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Strong or weak isolation&lt;/td&gt;&lt;td&gt;Strong&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transaction granularity&lt;/td&gt;&lt;td&gt;Cache line&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Direct or deferred update&lt;/td&gt;&lt;td&gt;Deferred (in reservation registers)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Concurrency control&lt;/td&gt;&lt;td&gt;Optimistic. Commit initiates acquiring ownership&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict detection&lt;/td&gt;&lt;td&gt;Late write-write conflict (if not a regular store)&lt;br /&gt;Late write-read conflict (if not a regular store)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Inconsistent reads&lt;/td&gt;&lt;td&gt;None&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict resolution&lt;/td&gt;&lt;td&gt;Address-based rwo-phase commit&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nested transaction&lt;/td&gt;&lt;td&gt;N/A&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Sources&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.morganclaypool.com/doi/abs/10.2200/S00070ED1V01Y200611CAC002"&gt;J.R. Larus, R.Rajwar &lt;/a&gt;&lt;span style="font-style: italic;"&gt;&lt;a href="http://www.morganclaypool.com/doi/abs/10.2200/S00070ED1V01Y200611CAC002"&gt;Transactional Memory&lt;/a&gt;&lt;/span&gt;&lt;span&gt;, 4.3.4&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;J. M. Stone, H. S. Stone, P. Heidelberger and J. Turek, “Multiple reservations and the Oklahoma update,” IEEE Concurrency, Vol. 1(4), pp. 58–71 Nov. 1993.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-5316449193361920839?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/5316449193361920839/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=5316449193361920839' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/5316449193361920839'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/5316449193361920839'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/05/reading-7-htm-precursors.html' title='Reading 7. HTM. Multiple atomic read-write operations'/><author><name>Leonid Shalupov</name><uri>http://www.blogger.com/profile/13920834345027590432</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='04513419942137225811'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-4966293863259592347</id><published>2008-05-28T19:20:00.005+04:00</published><updated>2008-10-27T19:15:00.668+03:00</updated><title type='text'>Reading 7. HTM. Compare&amp;Swap decomposition</title><content type='html'>&lt;span style="font-size:130%;"&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Jensen,Hagensen, and Broughton, UCRL 1987: support for optimistic synchronization using single memory location.&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;Complex &lt;span style="font-style: italic;"&gt;compare&amp;amp;swap&lt;/span&gt; instruction can be splitted into simplier parts:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;font-size:100%;" &gt;sync load&lt;/span&gt;. One memory location are loaded into special register (called &lt;span style="font-style: italic;"&gt;xa-address&lt;/span&gt;), that indicates the processor requires exclusive access to this address, and loads the accessed data into a general register.&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;sync store&lt;/span&gt;. Stores data to previously saved address in &lt;span style="font-style: italic;"&gt;xa-address&lt;/span&gt; if no other processor stored anything there (i.e. no conflict occurred) or conflict resolution decided to allow write by this processor.&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;sync clear&lt;/span&gt;. Clears &lt;span style="font-style: italic;"&gt;xa-address&lt;/span&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;A simple locking can be implemented in those instructions as follows:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;blockquote style="color: rgb(102, 102, 204);"&gt;&lt;br /&gt;&lt;pre&gt;// if (lock == 0) { lock = ProcessID; } % atomically&lt;br /&gt;// else goto LockHeld...                % lock was held&lt;br /&gt;Retry:&lt;br /&gt;sync_load R10, lock           ; declare exclusive intent&lt;br /&gt;jump_q .neq (R10,0), LockHeld ; test for zero&lt;br /&gt;sync_clear                    ; lock non-zero, hence abort&lt;br /&gt;load R10, ProcessID           ; prepare to update lock&lt;br /&gt;sync_store R10, lock          ; update lock if not aborted&lt;br /&gt;goto Retry                    ; try the update again&lt;br /&gt;MyLock:                         ; got the lock&lt;br /&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;span style="font-weight: bold;font-size:100%;" &gt;&lt;span style="font-style: italic;"&gt;Implementation&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;sync_store&lt;/span&gt; instruction broadcasts write operation on all processors thus detecting conflicts with other processors having same &lt;span style="font-style: italic;"&gt;xa-address&lt;/span&gt;. If such conflict detected, one processor clears own &lt;span style="font-style: italic;"&gt;xa-address&lt;/span&gt; or aborts store operation (depends on conflict resolution scheme).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Real-world usage&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;MIPS, Alpha, PowerPC implemented variations to this scheme&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Classification&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Strong or weak isolation&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Strong&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transaction granularity&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Cache line&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Direct or deferred update&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Conditional direct store (single word)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Concurrency control&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Optimistic (single word)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict detection&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;N/A&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Inconsistent reads&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;None&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict resolution&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Processor id or first to request ownership&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nested transaction&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;N/A&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Sources&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.morganclaypool.com/doi/abs/10.2200/S00070ED1V01Y200611CAC002"&gt;J.R. Larus, R.Rajwar &lt;/a&gt;&lt;span style="font-style: italic;"&gt;&lt;a href="http://www.morganclaypool.com/doi/abs/10.2200/S00070ED1V01Y200611CAC002"&gt;Transactional Memory&lt;/a&gt;&lt;/span&gt;&lt;span&gt;, 4.3.3&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;E. H. Jensen, G. W. Hagensen and J. M. Broughton, “A new approach to exclusive data access in shared memory multiprocessors,” Lawrence Livermore National Laboratory, Technical Report UCRL-97663, Nov. 1987.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-4966293863259592347?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/4966293863259592347/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=4966293863259592347' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/4966293863259592347'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/4966293863259592347'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/05/reading-7-htm-compare-decomposition.html' title='Reading 7. HTM. Compare&amp;Swap decomposition'/><author><name>Leonid Shalupov</name><uri>http://www.blogger.com/profile/13920834345027590432</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='04513419942137225811'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-8619029611756172058</id><published>2008-05-28T19:06:00.007+04:00</published><updated>2008-10-27T19:12:44.253+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='readings'/><title type='text'>Reading 7. HTM. A side story</title><content type='html'>&lt;span style="font-size:130%;"&gt;&lt;span style="font-weight: bold;"&gt;Knight 1986, paralleling a single-threaded programs&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;A compiler divides a program into a series of code blocks called transactions. For doing the division, the compiler assumes that these transactions do not have memory dependencies. These blocks then execute optimistically on the processors. The hardware enforces correct execution and uses caches to detect when a memory dependence violation between threads occurs.&lt;br /&gt;&lt;br /&gt;This is the ﬁrst paper, that proposed to use caches and cache coherence to maintain ordering among speculatively parallelized regions of a sequential code in the presence of unknown memory dependencies. While the paper did not directly address explicitly parallel programming, it set the groundwork for using caches and coherence protocols for future transactional memory proposals.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic; font-weight: bold;"&gt;Implementation&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Firstly, the compiler divides a program into sequence of "mostly independent" series of blocks (&lt;span style="font-style: italic;"&gt;transactions&lt;/span&gt;). Than those blocks runs on shared-memory multiprocessors with own register state. All write memory operations cached in &lt;span style="font-style: italic;"&gt;confirm cache&lt;/span&gt;. Every processor also stores all read operations in special &lt;span style="font-style: italic;"&gt;dependency cache&lt;/span&gt;. Also write operations from other processors detected and used to update &lt;span style="font-style: italic;"&gt;dependency cache&lt;/span&gt; (so called, &lt;span style="font-style: italic;"&gt;bus sneaking&lt;/span&gt;).&lt;br /&gt;&lt;br /&gt;All transactions are committed one-by-one as prescribed by original single-threaded program. Stored dependencies used to detect write conflicts. If one occurs, failed transaction simply runs again.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;&lt;span style="font-style:italic;"&gt;Classification&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;Strong or weak isolation&lt;td&gt; &lt;td&gt;Strong&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transaction granularity&lt;td&gt; &lt;td&gt;Cache line&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Direct or deferred update&lt;td&gt; &lt;td&gt;Deferred (in cache)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Concurrency control&lt;td&gt; &lt;td&gt;Optimistic. Commit serialized globally&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict detection&lt;td&gt; &lt;td&gt;Late write-write conflict&lt;br&gt;Late write-read conflict&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Inconsistent reads&lt;td&gt; &lt;td&gt;No&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict resolution&lt;td&gt; &lt;td&gt;Program order (sequential program)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nested transaction&lt;td&gt; &lt;td&gt;N/A&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Sources&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.morganclaypool.com/doi/abs/10.2200/S00070ED1V01Y200611CAC002"&gt;J.R. Larus, R.Rajwar &lt;/a&gt;&lt;span style="font-style: italic;"&gt;&lt;a href="http://www.morganclaypool.com/doi/abs/10.2200/S00070ED1V01Y200611CAC002"&gt;Transactional Memory&lt;/a&gt;&lt;/span&gt;&lt;span&gt;, 4.3.2&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;T. F. Knight, “An architecture for mostly functional languages,” In Proc. ACM Lisp and Functional Programming Conference, pp. 105–112 Aug. 1986&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-8619029611756172058?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/8619029611756172058/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=8619029611756172058' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/8619029611756172058'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/8619029611756172058'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/05/reading-7-htm-side-story.html' title='Reading 7. HTM. A side story'/><author><name>Leonid Shalupov</name><uri>http://www.blogger.com/profile/13920834345027590432</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='04513419942137225811'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-2544774473799522198</id><published>2008-04-25T00:29:00.007+04:00</published><updated>2008-05-19T23:22:06.560+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='readings'/><title type='text'>Reading 6. Lock Elision</title><content type='html'>&lt;span style="font-weight: bold;"&gt;&lt;span style="font-size:130%;"&gt;Why do we need it?&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-size:100%;"&gt;&lt;span style="font-family:arial;"&gt;In multithreaded programs, synchronization mechanism - usually locks - are often used to guarantee threads have exclusive access to shared data for a critical section of code. A thread acquires the lock, executes its critical section, and release the lock. The key insight is that locks do not always have to be acquired for a correct execution. The following code demonstrates this fact:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;span style="font-size:100%;"&gt;&lt;span style="font-family:arial;"&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 0, 0);font-family:georgia;" &gt;Thread 1&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);font-family:georgia;" &gt;LOCK(hash_tbl.lock)&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);font-family:georgia;" &gt;var = hash_tbl.lookup(X)&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);font-family:georgia;" &gt;if (!var)&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);font-family:georgia;" &gt;    hash_tbl.add(X);&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);font-family:georgia;" &gt;UNLOCK(hash_tbl.lock)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 0, 0);font-family:georgia;" &gt;Thread 2&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="color: rgb(51, 51, 255);font-family:georgia;" &gt;LOCK(hash_tbl.lock)&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);font-family:georgia;" &gt;&lt;br /&gt;var = hash_tbl.lookup(Y)&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);font-family:georgia;" &gt;&lt;br /&gt;if (!var)&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);font-family:georgia;" &gt;&lt;br /&gt;hash_tbl.add(Y);&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);font-family:georgia;" &gt;&lt;br /&gt;UNLOCK(hash_tbl.lock)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;div style="text-align: justify;"&gt;&lt;span style="font-size:100%;"&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-family:georgia;"&gt;As we can see here this can be executed without any lock, because threads are updating different parts of the shared object.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:georgia;"&gt;While talking about the trade-offs of the multithreaded programming  one can point:&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Conservative locking. To ensure correctness, programmers rely on conservative locking, often at the expense of performance.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Granularity. Careful program design must choose appropriate levels of locking to optimize the tradeoff between performance and ease of reasoning about program correctness.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Thread-unsafe libraries. If a thread calls a library not equipped to deal with threads, a global locking mechanism is used to prevent conflicts. And again the performance problem.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-size:100%;"&gt;The solution is obvious now:&lt;br /&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Programmers use frequent and conservative synchronization to write correct multithreaded programs.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Synchronization instructions are predicted as being unnecessary and elided.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;There is no system-level modification, only hardware support needed&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;span style="font-weight: bold;"&gt;&lt;span style="font-size:130%;"&gt;Speculative Lock Elision&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;How the lock is constructed? Let's look at typical code for the lock-acquire and release using the Alpha ISA:&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;Ll:  l. ldl tO, O(tl)                                   # tO = lock&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;      2. bne tO, LI:                                   #if not free, goto L1&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;      3. ldl_l tO, O(tl)                         #load locked, tO = lock&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;      4. bne tO, LI:                                   #if not free, goto L1&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;      5. lda tO, 1(O)                                 #tO = 1&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;      6. stl_c tO, O(tl)                         #conditional store, lock = 1&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;      7. beq tO, Li:                                     #if stlc failed, goto Li,&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;      8-15. &lt;critical section=""&gt;&lt;critical&gt;&lt;br /&gt;&lt;br /&gt;&lt;/critical&gt;&lt;/critical&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;      16. stl 0,0(tl)                                       #lock = O, release lock&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;ldl_l/stl_c is the pair of so called LOAD-LOCKED/STORE-CONDITIONAL (ll/sc) instructions. &lt;span style="font-style: italic;"&gt;ll&lt;/span&gt; reads the data and &lt;span style="font-style: italic;"&gt;sc&lt;/span&gt; writes, if from the moment of reading the location nobody writes to it.&lt;br /&gt;&lt;br /&gt;As it was said before the key observation is that a lock does not always have to be acquired for a correct execution if hardware can provide the appearance of atomicity for all memory operations within the critical section. If a data conflict occurs, i.e., two threads compete for the same data other than for reading, atomicity cannot be guaranteed and the lock need to be acquired. Data conflicts among threads are detected using existing cache coherence protocol.&lt;br /&gt;&lt;br /&gt;To understand how point the critical section lets just look at the listing of the instructions above. The lock elision can be done by simply observing load and store sequences and the values read and to be written. We should use a filter to determine candidate load/store pairs. For example, in this implementation, only instructions tdl_t and stl c (they normally occur as a pair) are consider.&lt;br /&gt;&lt;br /&gt;So the algorithm is the following:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;ol style="color: rgb(51, 51, 255);"&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;If candidate load (ldl_l) to an address is followed by store (stl_c of the lock acquire) to same address, predict another store (lock release) will shortly follow, restoring the memory location value to the one prior to this store (stl_c of the lock acquire).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Predict memory operations in critical sections will occur atomically, and elide lock acquire.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Execute critical section speculatively and buffer results.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;If hardware cannot provide atomicity, trigger misspeculation, recover and explicitly acquire lock.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;If second store (lock release) of step 1 seen, atomicity was not violated (else a misspeculation would have been triggered earlier). Elide lock-release store, commit state, and exit speculative critical section.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-size:85%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:100%;" &gt;Some words about implementing SLE&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;&lt;span style="font-family:arial;"&gt;To recover from an SLE misspeculation, register and memory state must be buffered until SLE is validated.&lt;br /&gt;Two techniques for handling register state:&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;&lt;span style="font-style: italic;"&gt;Reorder buffer (ROB)&lt;/span&gt;: each instruction is stored in buffer in the execution order, also there is a place in buffer for the instruction to store the result. Pay attention that the size of the ROB places a limit on the size of the critial section (for more details of the ROB look - J.E.Smith &amp;amp; A.R.Pleszkun. Implementation of Precise Interrupts in Pipeline processors)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;&lt;span style="font-style: italic;"&gt;Register checkpoint&lt;/span&gt;: the idea is that a copy is done at the moment we elide the lock. This may be of dependence maps or of the architected register state itself. Instructions can be safely update the register file, speculatively retired, and be removed from the ROB because a correct architected register checkpoint exists for the recovery.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-size:100%;"&gt;To store the memory state:&lt;br /&gt;Although most modern processors support speculative load execution, they do not retire stores speculatively (i.e., write to the memory system speculatively). For supporting SLE, we augment existing processor write-buffers (between the processor and L1 cache) to buffer speculative memory updates. Speculative data is not committed from the write-buffer into the lower memory hierarchy until the lock elision is validated. On a&lt;br /&gt;misspeculation, speculative entries in the write-buffer are invalidated.&lt;br /&gt;&lt;br /&gt;To detects the conflicts the cache coherence protocol is used. Invalidation-based coherence protocol guarantee an exclusive copy of the memory block in the local cache when a store is performed. But the mechanism to record memory addresses read and write within the critical section is needed:&lt;br /&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;If the ROB approach is used for SLE, no additional mechanisms are required for tracking external writes to memory locations speculatively read--the LSQ is already snooped.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;If the register checkpoint approach is used, the LSQ alone cannot be used to detect load violations for SLE because loads may speculatively retire and leave the ROB. In this case, the cache can be augmented with an access bit for each cache block. Every memory access executed during SLE marks the access bit of the corresponding block. On an external request, this bit is checked in parallel with the cache tags. Any external invalidation to a block with its access bit set, or an external request to a block in exclusive state with its access bit set, triggers a misspeculation. The bit can be stored with the tags and there is no overhead for the bit comparison because, for maintaining coherency, a tag lookup is already performed to snoop the cache.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-size:85%;"&gt;&lt;span style="font-weight: bold;font-size:100%;" &gt;Resource constraints or when we should use locks&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;Limited resources may force a misspeculation if either there is not enough buffer space to store speculative updates, or it is not possible to monitor accessed data to provide atomicity guarantees. Four such conditions for misspeculation are:&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Finite cache size. If register checkpoint is used, the cache may not be able to track all memory accesses.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Finite write-buffer size. The number of unique cache lines modified exceeds the write-buffer size.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Finite ROB size. If the checkpoint approach is used, the ROB size is not a problem.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Uncached accesses or events (e.g., some system calls) where the processor cannot track requests.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-size:85%;"&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Transactional Lock Removal&lt;/span&gt;&lt;span style="font-size:130%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;The main idea is to use SLE for lock-free execution but to use timestamp-based fair conflict resolution instead of  locks (though some stituations - like recources constraints - still need locks).&lt;br /&gt;The algorithm is based on Lamport clocks  (L.Lamport. Time, clocks, and the ordering of events in a distributed system).  Timestamps are used for resolving conflicts to decide a conflict winner - earlier timestamp implies higher priority.&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:100%;" &gt;The algorithm&lt;br /&gt;&lt;/span&gt;&lt;/span&gt; &lt;ol style="color: rgb(51, 51, 255);"&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Calculate local timestamp&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Identify transaction start:&lt;/span&gt;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Initiate TLR mode (use SLE to elide locks)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Execute transaction speculatively.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;During transactional speculative execution&lt;/span&gt;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Locally buffer speculative updates&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Append timestamp to all outgoing requests&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;If incoming request conflicts with retainable block and has later timestamp, retain ownership and force requestor to wait&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;If incoming request conflicts with retainable block and has earlier timestamp, service request and restart from step 2b if necessary. Give up any retained ownerships.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;If insufficient resources, acquire lock.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;No buffer space&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Operation cannot be undone (e.g., I/O)&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Identify transaction end:&lt;/span&gt;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;If all blocks available in local cache in appropriate coherence state, atomically commit memory updates &lt;/span&gt;&lt;span style="font-size:100%;"&gt;from local buffer into cache (write to cache using SLE).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Commit transaction register (processor) state.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Service waiters if any&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Update local timestamp&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/ol&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:100%;" &gt;Propagating priority information&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;Let us consider now the following situation:&lt;br /&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;There are 3 processors - P0, P1, P2&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;They have priorities: P0&gt;P1&gt;P2&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;P0 has an exclusive ownership for a block A&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;P1 has an exclusive ownership for a block B&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;At time t1 P1 send a request for A, that goes to Po (because now he is the owner)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;P0 receives the response, resolve conflict and so (P0&gt;P1) wins it and defers the response&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Now P1 exclusively owns block A (because P1's request has been ordered by the protocol) but data (and write permission) is with P0 still&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;P1 is waiting for P0 for cache block A&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;P2 sends a request for P1 for the block B&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Situation is just the same as above - P2 owns B exclusively, but write permission is still with P1&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;P2 is awaiting for P1&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;P0 sends a request for B and it's forwarded to P2 (that owns B, though does not have data itself)&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;P2 loses the conflict and should give block B to P0&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;But P2 is awaiting for P1 to release the block B and P1 is awaiting fot P0 to release the block A&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;deadlock!!!&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-size:100%;"&gt;Now let's think over the situation. The deadlock happened because &lt;span style="font-style: italic;"&gt;the information about priorities was not propogated  along the cache coherence protocol&lt;/span&gt;! To solve the problem we need the following:&lt;br /&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Marker message  - directed message sent in responce to a request for a block under conflict for which data is not provided immediately.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Probes - are used to propagate a conflict request upstream in a cache coherence protocol chain. Thus, when P2 receives P0's request for B, P2 forwards the probe (with P0's timestamp) to P1 since P2 received a marker message from P1. PI receives P0's forwarded probe (via P2) and loses the conflict because P0 has higher priority than P1. P1 releases ownership of block B and the cyclic wait is broken.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:85%;" &gt;&lt;span style="font-weight: bold;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-2544774473799522198?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/2544774473799522198/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=2544774473799522198' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/2544774473799522198'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/2544774473799522198'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/04/reading-6-lock-elision.html' title='Reading 6. Lock Elision'/><author><name>Nastasia</name><uri>http://www.blogger.com/profile/13358122536478963041</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='10572757268373991225'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-7039704485994362440</id><published>2008-03-27T23:04:00.004+03:00</published><updated>2008-04-11T02:33:46.687+04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='readings'/><title type='text'>Reading 5. Improvements on DSTM and WSTM</title><content type='html'>&lt;span style="font-weight: bold; color: rgb(0, 0, 0);font-size:100%;" &gt;Contention policies&lt;br /&gt;&lt;/span&gt;&lt;ol style="color: rgb(0, 0, 0);"&gt;&lt;li&gt;&lt;span style="font-style: italic;font-size:100%;" &gt;Aggressive. &lt;/span&gt;&lt;span style="font-size:100%;"&gt;The acquiring transaction terminates any conflicting transaction&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;font-size:100%;" &gt;Polite.&lt;/span&gt;&lt;span style="font-size:100%;"&gt; The acquiring transaction uses exponential backoff to delay for a fixed number of exponentially growing intervals before aborting the other transaction. After each interval, the transaction checks if the other transaction has finished with the object.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;font-size:100%;" &gt;Timestamp.&lt;/span&gt;&lt;span style="font-size:100%;"&gt; This manager aborts any transaction that started execution after the acquiring transaction.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;font-size:100%;" &gt;Published Timestamp.&lt;/span&gt;&lt;span style="font-size:100%;"&gt; This manager follows the timestamp policy, but also aborts older transactions that appear inactive.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;font-size:100%;" &gt;Greed&lt;/span&gt;&lt;span style="font-size:100%;"&gt;y. This manager aborts the transaction that has executed the least amount of time&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;font-size:100%;" &gt;Kindergarten.&lt;/span&gt;&lt;span style="font-size:100%;"&gt; This manager maintains a “hit list” of transactions to which a given transaction previously deferred. If the transaction holding the object is on the list, the acquiring transaction immediately terminates it. If it is not on the list, the acquiring transaction adds it to the list and then backs off for a fixed interval before aborting itself. This policy ensures that two transactions sharing an object take turns aborting (hence the name).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;font-size:100%;" &gt;Karma. &lt;/span&gt;&lt;span style="font-size:100%;"&gt;This manager uses a count of the number of objects that a transaction has opened (cumulative, across all of its aborts and reexecutions) as a priority. An acquiring transaction immediately aborts another transaction with lower priority. If the acquiring transaction’s priority is lower, it backs off and tries to reacquire the object N times, where N is the difference in priorities, before aborting the other transaction.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;font-size:100%;" &gt;Eruption. &lt;/span&gt;&lt;span style="font-size:100%;"&gt;This manager is similar to Karma, except that it adds the blocked transaction’s priority to the active transaction’s priority; to help reduce the possibility that a third transaction will subsequently abort the active transaction.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;font-size:100%;" &gt;Polka. &lt;/span&gt;&lt;span style="font-size:100%;"&gt;This is a combination of the Polite and Karma policies. The key change to Karma is to use exponential backoff for the N intervals.&lt;/span&gt;&lt;span style="font-weight: bold;font-size:100%;" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 0);font-size:100%;" &gt;&lt;span style="font-weight: bold;"&gt;SXM&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);font-size:100%;" &gt;&lt;span&gt;SXM is a deferred, object-based STM system implemented as a library for C# code. It is&lt;br /&gt;similar in operation to DSTM system, although implemented in .NET.&lt;/span&gt;&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);font-size:100%;" &gt;&lt;span&gt;&lt;br /&gt;&lt;table border="1"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Strong or Weak Isolation&lt;/td&gt;&lt;td&gt;Weak&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transaction Granularity&lt;/td&gt;&lt;td&gt;Object&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Direct or Deferred Update&lt;/td&gt;&lt;td&gt;Deferred (cloned replacement)&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Concurrency Control&lt;/td&gt;&lt;td&gt;Optimistic&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Synchronization&lt;/td&gt;&lt;td&gt;Nonblocking (obstruction free)&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict Detection&lt;/td&gt;&lt;td&gt;Early&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Inconsistent Reads&lt;/td&gt;&lt;td&gt;Inconsistency toleration&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict Resolution&lt;/td&gt;&lt;td&gt;Explicit contention manager&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nested Transaction&lt;/td&gt;&lt;td&gt;Closed&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Exceptions&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;Key features: &lt;/span&gt;&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);font-size:100%;" &gt;&lt;span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;ul style="color: rgb(0, 0, 0);"&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;&lt;span&gt;polymorphic contention management, which is&lt;/span&gt; a framework for managing conflicts between transactions&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;run-time code generation to produce some of the boilerplate code&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="color: rgb(0, 0, 0);font-size:100%;" &gt;&lt;span&gt;Each transaction selects a contention manager from a collection of managers that implement a&lt;br /&gt;diverse set of policies&lt;/span&gt;&lt;/span&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 0);font-size:100%;" &gt;&lt;span style="font-weight: bold;"&gt;. &lt;/span&gt;&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);font-size:100%;" &gt;&lt;span&gt;All policies are classified based on the cost of the state that they maintain&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;table style="color: rgb(0, 0, 0);" border="1"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Rank&lt;/td&gt;&lt;td&gt;Policy Class&lt;/td&gt;&lt;td&gt;Policy&lt;/td&gt;&lt;td&gt;State&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Aggressive, Polite&lt;/td&gt;&lt;td&gt;–&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;2&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Ad hoc&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Greedy, Killblocked&lt;/td&gt;&lt;td&gt;Transaction start time&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;Local&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Timestamp&lt;/td&gt;&lt;td&gt;Transaction start time, variable&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;4&lt;/td&gt; &lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Kindergarten&lt;/td&gt;&lt;td&gt;List of transactions&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;Historical&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Karma, Polka, Eruption&lt;/td&gt;&lt;td&gt;List of objects&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;A higher–numbered policy is more expensive to compute than a lower–ranked one; but it&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;is not necessarily more predictive of future behavior. Ranking identifies policies that are comparable to each other.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;SXM assumes that two transactions with policies from different policy classes were not intended to conflict, so neither transaction’s policy is preferable and uses the Greedy policy. &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;If the two transactions’ policies belong in the same class, SXM applies the conflict policy from the transaction that wants to acquire the object.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 0);"&gt;ASTM&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;Adaptive STM(ASTM) is a deferred, object-based STM system by Marathe, Scherer, and Scott that explored several performance improvements to the basic design of DSTM system&lt;/span&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 0);"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;table style="color: rgb(0, 0, 0);" border="1"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Strong or Weak Isolation&lt;/td&gt;&lt;td&gt;Weak&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transaction Granularity&lt;/td&gt;&lt;td&gt;Object&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Direct or Deferred Update&lt;/td&gt;&lt;td&gt;Deferred (cloned replacement)&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Concurrency Control&lt;/td&gt;&lt;td&gt;Optimistic&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Synchronization&lt;/td&gt;&lt;td&gt;Nonblocking (obstruction free)&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict Detection&lt;/td&gt;&lt;td&gt;Early or late (selectable)&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Inconsistent Reads&lt;/td&gt;&lt;td&gt;Validation&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict Resolution&lt;/td&gt;&lt;td&gt;Explicit contention manager&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nested Transaction&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Exceptions&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;Key features:&lt;/span&gt;&lt;br /&gt;&lt;ul style="color: rgb(0, 0, 0);"&gt;&lt;li&gt;ASTM eliminated a memory indirection in accessing fields in an object open for reading&lt;/li&gt;&lt;li&gt;adaptive system to change its conflict resolution policy from early to late detection&lt;/li&gt;&lt;/ul&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;An object opened for update has the same representation as in DSTM, with two levels of indirection between the TMObject and the object’s fields&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;An object opened for reading by a transaction has a single level of indirection, so the TMObject points directly to the object’s fields&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;Late and early conflict detection&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;With late conflict detection, a transaction does not acquire ownership of an object that it is modifying. Instead, it modifies a cloned copy of the object and defers conflict detection until the transaction commits, which reduces the interval in which an object is locked and avoids some unnecessary transaction aborts. When the transaction commits, it may find that the object modified or may find another transaction in the process of committing changes to the object. The first case causes the transaction to abort, while the second one invokes a contention manager to resolve the conflict&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;Late and early conflict detection have comparable overhead for most benchmarks, but early detection is easier to implement. Late acquire performs better for a long running transaction that reads a large number of objects and updates only a few, because this policy allows concurrent readers and writers.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;ASTM implements an adaptive policy that recognizes a transaction that modifies few objects but reads many objects . ASTM then switches the thread executing the transaction from early to late detection for subsequent transactions. If a transaction falls below the thresholds, it reverts to early detection&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 0);"&gt;Ananian and Rinard&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;table style="color: rgb(0, 0, 0);" border="1"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Strong or Weak Isolation&lt;/td&gt;&lt;td&gt;Strong&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transaction Granularity&lt;/td&gt;&lt;td&gt;Object&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Direct or Deferred Update&lt;/td&gt;&lt;td&gt;Deferred (in-place)&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Concurrency Control&lt;/td&gt;&lt;td&gt;Optimistic&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Synchronization&lt;/td&gt;&lt;td&gt;Nonblocking&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict Detection&lt;/td&gt;&lt;td&gt;Early&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Inconsistent Reads&lt;/td&gt;&lt;td&gt;Invalidation&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict Resolution&lt;/td&gt;&lt;td&gt;Abort conflicting transaction&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nested Transaction&lt;/td&gt;&lt;td&gt;Flatten&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Exceptions&lt;/td&gt;&lt;td&gt;Terminate or abort&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;Key features:&lt;/span&gt;&lt;br /&gt;&lt;ul style="color: rgb(0, 0, 0);"&gt;&lt;li&gt;strong isolation&lt;/li&gt;&lt;li&gt;written in PROMELA, so it can be directly verified with the SPIN model checker&lt;/li&gt;&lt;/ul&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;Each Java object is extended by two fields. The first, named &lt;/span&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;versions&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;, is a linked list containing a version record for each transaction that modified a field in the object. A version record identifies the transaction and records the updated value of each modified field. The second field, named&lt;/span&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt; &lt;/span&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;readers&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;, is a list of transactions that read a field in the object.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;A novel aspect of this system is the use of a &lt;/span&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;sentinel&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt; (signalling a conflicting access) value in a memory location to redirect read and write operations to the transactional data structures.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;NT-read&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;:&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;if it encounters sentinel, it aborts the transaction modifying the object containing the value, restores the field’s value from the most recently committed transaction’s record, clears versions list and re-reads location&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;NT-write&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt; aborts all transactions reading and writing the object and directly updates the field in the object. Readers and version lists are cleared. If the program is actually writing the sentinel value, the write instruction is treated as a short transaction to make difference with real sentinel&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;T-read&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt; first ensures that its transaction descriptor is on the object’s reader list. It aborts all uncommitted transactions that modified the object. After this, the transaction can read the field in the object and directly use any values other than the sentinel. The sentinel value, on the other hand, requires a search of the version records to find one with the same version and the updated value of the field. After this we can clear versions list.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;A &lt;/span&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;T-write&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt; aborts all other uncommitted transactions that read or wrote the object. It also creates, if none previously existed, a version object for the transaction. The next step is to copy the unmodified value in the field to all version records, including those of the committed transactions, so that the field can be restored if the transaction is rolled back. If the versions list does not contain a committed transaction, one must be created to hold this value. Finally, the new value can be written to the running transaction’s version record and the object field set to the sentinel value&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;This STM system aggressively resolves conflicting references to an object. A read aborts transactions in the process of modifying the object and a write aborts all transactions that accessed the object. In essence, the system implements a multireader, single-writer lock on an object.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 0);"&gt;RSTM&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;table style="color: rgb(0, 0, 0);" border="1"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Strong or Weak Isolation&lt;/td&gt;&lt;td&gt;Weak&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transaction Granularity&lt;/td&gt;&lt;td&gt;Object&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Direct or Deferred Update&lt;/td&gt;&lt;td&gt;Deferred (cloned replacement)&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Concurrency Control&lt;/td&gt;&lt;td&gt;Optimistic&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Synchronization&lt;/td&gt;&lt;td&gt;Nonblocking (obstruction-free)&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict Detection&lt;/td&gt;&lt;td&gt;Early or late (selectable)&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Inconsistent Reads&lt;/td&gt;&lt;td&gt;Bounded invalidation&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict Resolution&lt;/td&gt;&lt;td&gt;Conflict manager&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nested Transaction&lt;/td&gt;&lt;td&gt;Flatten&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Exceptions&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;Key features:&lt;/span&gt;&lt;br /&gt;&lt;ul style="color: rgb(0, 0, 0);"&gt;&lt;li&gt;RSTM only uses a single level of indirection to an object, instead of the two levels used&lt;br /&gt;by previous systems such as DSTM&lt;/li&gt;&lt;li&gt;RSTM avoids dynamically allocating many of its data structures and contains its own memory collector, so it can work with nongarbage collected languages such as C++&lt;/li&gt;&lt;li&gt;RSTM uses invalidation to avoid inconsistent reads. It employs a new heuristic for&lt;br /&gt;tracking an object’s readers&lt;/li&gt;&lt;/ul&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;In RSTM, every transactional object is accessed through an &lt;/span&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;ObjectHeader&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;, which points directly to the current version of the object. RSTM uses the low-order bit of the &lt;/span&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;NewData&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt; field in this object as a flag&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;The &lt;/span&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;TransactionDescriptor&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt; referenced through an object’s header determines the transaction’s state. If the transaction commits, then NewDataObject is the current version of the object. If the transaction aborts, then OldDataObject is the current version. If the transaction is active, no other transaction can read or write the object without aborting the transaction.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;A transaction opens an object before accessing it:&lt;/span&gt;&lt;br /&gt;&lt;ol style="color: rgb(0, 0, 0);"&gt;&lt;li&gt;If opening the object for update, the transaction must first acquire the object with the following actions:&lt;br /&gt;(a) Read the object’s NewData pointer and make sure no other transaction owns it. If it is owned, invoke the contention manager.&lt;br /&gt;(b) Allocate the NewDataObject and copy values from the object’s current version.&lt;br /&gt;(c) Initialize the Owner and OldData pointers in the new object.&lt;br /&gt;(d) Use a CAS to atomically swap the pointer read in step (a) with a pointer to the newly allocated copy.&lt;br /&gt;(e) Add the object to transaction’s private write list.&lt;br /&gt;(f ) Iterate through the object’s visible reader list, aborting all transactions it contains.&lt;/li&gt;&lt;li&gt;If opening the object for reading and space is available in the object’s visible readerlist, add the transaction to this list. If the list is full, add the object to the transaction’s private read list.&lt;/li&gt;&lt;li&gt;Check the status word in the transaction’s descriptor, to make sure another transaction has not aborted it.&lt;/li&gt;&lt;li&gt;Incrementally validate all objects on the transaction’s private read list.&lt;/li&gt;&lt;/ol&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;Introducing visible reader list should have resulted in performance boost, because it reduces the cost of validation. But benchmarks show that visible readers are actually more costly, because of the extra cache traffic caused by updating the visible read table in each object&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 0);"&gt;TL&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;&lt;br /&gt;Dice and Shavit described an STM system called transactional locking (TL), which combined deferred update with blocking synchronization&lt;/span&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 0);"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;table style="color: rgb(0, 0, 0);" border="1"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Strong or Weak Isolation&lt;/td&gt;&lt;td&gt;Weak&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transaction Granularity&lt;/td&gt;&lt;td&gt;Object, word or region&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Direct or Deferred Update&lt;/td&gt;&lt;td&gt;Deferred (in-place)&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Concurrency Control&lt;/td&gt;&lt;td&gt;Optimistic&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Synchronization&lt;/td&gt;&lt;td&gt;Blocking&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict Detection&lt;/td&gt;&lt;td&gt;Early or late (selectable)&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Inconsistent Reads&lt;/td&gt;&lt;td&gt;Inconsistency toleration&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict Resolution&lt;/td&gt;&lt;td&gt;Delay and abort&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nested Transaction&lt;/td&gt;&lt;td&gt;Flatten&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Exceptions&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;Key features:&lt;/span&gt;&lt;br /&gt;&lt;ul style="color: rgb(0, 0, 0);"&gt;&lt;li&gt;uses blocking to acquire an object&lt;/li&gt;&lt;/ul&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;Each object is associated with a lock. A lock is either locked, and pointing to the transaction holding exclusive access, or unlocked, and recording the object’s version, which is incremented when a transaction updates the object.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;A transaction maintains a read set and a write set. An entry in the read set contains the address of the object and the version number of the lock associated with the object. An entry in the write set contains the address of the object, the address of its lock, and the updated value of the object.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;When the transaction executes a &lt;/span&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;write&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;, it first looks for the object’s entry in its write set. If&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;it is not present, the transaction creates an entry for the object. The write modifies the entry in the write set, not the actual object. The transaction does not acquire the lock.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;A memory &lt;/span&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;load&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt; first checks the write set, to determine if the transaction previously updated the object. If so, it uses the updated value from the write set. If the object is not in the set, the  transaction adds the referenced object to the read set and attempts to read the actual object. If another transaction locked the object, the reading transaction can either delay and retry or abort itself.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;When a transaction &lt;/span&gt;&lt;span style="font-style: italic; color: rgb(0, 0, 0);"&gt;commits&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;, it first acquires locks for all objects in its write set. A transaction will only wait a bounded amount of time for a lock to be released to avoid deadlock. After acquiring locks for the objects it modified, the transaction validates its read set. If successful, the transaction can complete by copying updated values into the objects, releasing the locks, and freeing its data structures.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;Performance measurements on simple benchmarks showed that TL performed better than other STM systems, such as Harris and Fraser’s nonblocking WSTM system&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-7039704485994362440?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/7039704485994362440/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=7039704485994362440' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/7039704485994362440'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/7039704485994362440'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/03/reading-5-improvements-on-dstm-and-wstm.html' title='Reading 5. Improvements on DSTM and WSTM'/><author><name>Dennis Ushakov</name><uri>http://www.blogger.com/profile/05810645469531953594</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='17188654227638298166'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-5442465148329841934</id><published>2008-03-10T13:02:00.010+03:00</published><updated>2008-03-17T14:07:51.218+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='administrativa'/><title type='text'>Programme</title><content type='html'>Here is a list of readings:&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Deferred Update systems:&lt;/i&gt;&lt;br /&gt;1. DSTM, WSTM and contention management (3.4.1-3.4.4) [Oleg]&lt;br /&gt;2. Improvements on DSTM and WSTM (3.4.4-3.4.9) [Denis]&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Direct Update systems:&lt;/i&gt;&lt;br /&gt;3. McRT-STM and compiler optimizations (3.5.2-3.5.3) [Roma]&lt;br /&gt;4. Bartok STM and compiler optimizations (3.5.4) [Lena]&lt;br /&gt;&lt;span style="font-size:85%;"&gt;Skipping: Autolocker&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Language-oriented STMs:&lt;/i&gt;&lt;br /&gt;5. HSTM for Concurent Haskell and AutoCaml (3.6.3-3.6.4) [Ilya]&lt;br /&gt;&lt;span style="font-size:85%;"&gt;Skipping: discussion of exceptions thrown from atomics, real-time Java stuff&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;HTM&lt;/b&gt;&lt;br /&gt;&lt;i&gt;Precursors&lt;/i&gt;&lt;br /&gt;6. Optimistic synchronisation in hardware (4.3.2-4.3.5) [Leonid]&lt;br /&gt;7. Lock elision (4.3.6-4.3.7) [Nastasia]&lt;br /&gt;&lt;span style="font-size:85%;"&gt;Skipping: IBM 801&lt;/span&gt;&lt;br /&gt;&lt;i&gt;HTM designs&lt;/i&gt;&lt;br /&gt;8. Bounded/Large HTMs (4.4) [Kostya]&lt;br /&gt;9. Unbounded HTMs (4.5)&lt;br /&gt;10. Hybrid HTM-STMs (4.6)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-5442465148329841934?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/5442465148329841934/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=5442465148329841934' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/5442465148329841934'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/5442465148329841934'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/03/programme.html' title='Programme'/><author><name>Mitya</name><uri>http://www.blogger.com/profile/07764279855846194407</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='02101617867838857180'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-8750704729236945628</id><published>2008-03-10T12:49:00.017+03:00</published><updated>2008-03-18T16:14:41.247+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='readings'/><title type='text'>Reading 4. Word granularity STM</title><content type='html'>&lt;div style="text-align: right;"&gt;March 14, 2008&lt;/div&gt;&lt;br /&gt;&lt;b&gt;Harris and Fraser’s 2003 OOPSLA paper&lt;/b&gt; was the first to describe a practical STM system integrated into a programming language. They implemented WSTM word-granularity STM) in the ResearchVM from Sun Labs.&lt;br /&gt;&lt;b&gt;WSTM benefits&lt;/b&gt;&lt;ul&gt;&lt;li&gt;WSTM did not require a programmer to declare the memory locations accessed within a transaction.&lt;/li&gt;&lt;li&gt;WSTM was integrated into a modern, object-oriented language ( Java) by extending the language with the atomic operation. Strangely enough, WSTM did not exploit the object-oriented nature of Java and could support procedural languages as well.&lt;/li&gt;&lt;/ul&gt;&lt;table border="1"&gt;&lt;tr&gt;&lt;td&gt;Strong or Weak Isolation&lt;/td&gt;&lt;td&gt;Weak&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transaction Granularity&lt;/td&gt;&lt;td&gt;Word&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Direct or Deferred Update&lt;/td&gt;&lt;td&gt;Deferred (update in place)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Concurrency Control&lt;/td&gt;&lt;td&gt;Optimistic&lt;/td&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;td&gt;Synchronization&lt;/td&gt;&lt;td&gt;Obstruction free&lt;/td&gt;&lt;tr&gt;&lt;td&gt;Conflict Detection&lt;/td&gt;&lt;td&gt;Late&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Inconsistent Reads&lt;/td&gt;&lt;td&gt;Inconsistency toleration&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Conflict Resolution&lt;/td&gt;&lt;td&gt;Helping or aborting&lt;/td&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;td&gt;Nested Transaction&lt;/td&gt;&lt;td&gt;Flattened&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Exceptions&lt;/td&gt;&lt;td&gt;Terminate&lt;/td&gt;&lt;/tr&gt;&lt;br /&gt;&lt;/table&gt;&lt;br /&gt;WSTM extended Java with a new statement: &lt;code&gt;&lt;pre&gt;&lt;br /&gt;atomic (condition) { &lt;br /&gt;   statements; &lt;br /&gt;} &lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;A modified JIT ( Just-In-Time, i.e., run-time) compiler translated this statement into &lt;code&gt;&lt;pre&gt;&lt;br /&gt;bool done = false; &lt;br /&gt;   while (!done) { &lt;br /&gt;            &lt;b&gt;STMStart()&lt;/b&gt;; &lt;br /&gt;            try { &lt;br /&gt;                if (condition) { &lt;br /&gt;                    statements; &lt;br /&gt;                    done = &lt;b&gt;STMCommit()&lt;/b&gt;; &lt;br /&gt;                } else { &lt;br /&gt;                    &lt;b&gt;STMWait()&lt;/b&gt;; &lt;br /&gt;                } &lt;br /&gt;            } catch (Exception t) { &lt;br /&gt;              done = &lt;b&gt;STMCommit()&lt;/b&gt;; &lt;br /&gt;              if (done) { &lt;br /&gt;                  throw t; &lt;br /&gt;              } &lt;br /&gt;            } &lt;br /&gt;   } &lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;b&gt;Note&lt;/b&gt;: an exception within the atomic region’s predicate or body causes the transaction to commit. Subsequent systems more typically treated an exception as an error that aborts a transaction.&lt;br /&gt;The code produced by the compiler relies on five primitive operations provided by the &lt;code&gt;&lt;pre&gt;&lt;br /&gt;void   STMStart() &lt;br /&gt;void   STMAbort() &lt;br /&gt;bool   STMCommit() &lt;br /&gt;bool   STMValidate() &lt;br /&gt;void   STMWait()&lt;/pre&gt;&lt;/code&gt;In addition, all references to object fields from statements within an atomic region are replaced by calls to an appropriate library operation: &lt;code&gt;&lt;pre&gt;&lt;br /&gt;STMWord STMRead(Addr a) &lt;br /&gt;void STMWrite(Addr a, STMWord w) &lt;/pre&gt;&lt;/code&gt;&lt;b&gt;Restrictions&lt;/b&gt;: Because the JVM’s JIT compiler performs this translation, only references from Java bytecodes, not those in native methods, are translated to access these auxiliary structures. A few native methods were hand translated and included in a WSTM library, but a call on most native methods (including those that perform IO operations) from a transaction would cause a run-time error.&lt;code&gt;&lt;pre&gt;&lt;br /&gt;enum TransactionStatus { ACTIVE, COMMITTED, ABORTED, ASLEEP }; &lt;br /&gt;class TransactionEntry { &lt;br /&gt;     public Addr loc; &lt;br /&gt;     public STMWord oldValue; &lt;br /&gt;     public STMWord oldVersion; &lt;br /&gt;     public STMWord newValue; &lt;br /&gt;     public STMWord newVersion; &lt;br /&gt;} &lt;br /&gt;class TransactionDescriptor { &lt;br /&gt;     public TransactionStatus status = ACTIVE; &lt;br /&gt;     int nestingDepth = 0; &lt;br /&gt;     public Set&lt;TransactionEntry&gt; entries; &lt;br /&gt;} &lt;/pre&gt;&lt;/code&gt;The status field records the transaction status. A transaction starts in state ACTIVE and makes a transition into one of the three other states. The nestingDepth field records the number of (flattened) transactions sharing this descriptor. &lt;br /&gt;The entries field holds a TransactionEntry record for each location the transaction &lt;br /&gt;reads or writes. The compiler redirects memory reads and writes to the appropriate descriptor. A TransactionEntry records a location’s original value (before the transaction’s first access) and its current value. For a location only read, the two values are identical. A transaction increments the version number when it modifies the location. The system uses the version number to detect conflicts. &lt;br /&gt;&lt;br /&gt;OwnershipRec records the version number of the memory location, produced by the most recent committed transaction that updated the location. Second, when a transaction is in the process of committing, an OwnershipRec records the transaction that has acquired exclusive ownership of the location. Each ownership record holds either a version number or a pointer to the transaction that owns the location: &lt;code&gt;&lt;pre&gt;&lt;br /&gt;class OwnershipRec { &lt;br /&gt;     union { &lt;br /&gt;         public STMWord version; &lt;br /&gt;         public TransactionDescriptor* trans; &lt;br /&gt;     } val; &lt;br /&gt;     public bool HoldsVersion() { return (val.version &amp; 0x1) != 0; } &lt;br /&gt;} &lt;/pre&gt;&lt;/code&gt;The function FindOwnershipRec(a) maps memory address a to its associated OwnershipRec. The function CASOwnershipRec(a, old, new) performs a compare-and-swap operation on the OwnershipRec for memory address a, replacing it with value new, if the existing entry is equal to old. &lt;code&gt;&lt;pre&gt;&lt;br /&gt;void STMStart() { &lt;br /&gt;   if (ActiveTrans == null || ActiveTrans.status != TransactionStatus.ACTIVE) { &lt;br /&gt;        ActiveTrans = new TransactionDescriptor(); &lt;br /&gt;        ActiveTrans.status = TransactionStatus.ACTIVE; &lt;br /&gt;   } &lt;br /&gt;   AtomicAdd(ActiveTrans.nestingDepth, 1); &lt;br /&gt;} &lt;br /&gt;&lt;br /&gt;void STMAbort() { &lt;br /&gt;   ActiveTrans.status = TransactionStatus.ABORTED; &lt;br /&gt;   ActiveTrans.entries = null; &lt;br /&gt;   AtomicAdd(ActiveTrans.nestingDepth, -1); &lt;br /&gt;} &lt;br /&gt;&lt;br /&gt;struct ValVersion { &lt;br /&gt;   public STMWord val; &lt;br /&gt;   public STMWord version; &lt;br /&gt;} &lt;br /&gt;&lt;br /&gt;STMWord STMRead(Addr a) { &lt;br /&gt;   TransactionEntry* te = ActiveTrans.entries.Find(a); &lt;br /&gt;   if (null == te) { &lt;br /&gt;   // No entry in transaction descriptor. Create new entry (get value from memory) &lt;br /&gt;   // and add it to descriptor. &lt;br /&gt;      ValVersion vv = MemRead(a); &lt;br /&gt;      te = new TransactionEntry(a, vv.val, vv.version, vv.val, vv.version); &lt;br /&gt;      ActiveTrans.entries.Add(a, te); &lt;br /&gt;      return vv.val; &lt;br /&gt;   } else { &lt;br /&gt;   // Entry already exists in descriptor, so return its (possibly updated) value. &lt;br /&gt;      return te.newValue; &lt;br /&gt;   } &lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;void STMWrite(Addr a, STMWord w) { &lt;br /&gt;    STMRead(a); // Create entry if necessary. &lt;br /&gt;    TransactionEntry te = ActiveTrans.entries.Find(a); &lt;br /&gt;    te.newValue = w; &lt;br /&gt;    te.newVersion += 2; // Version numbers are odd numbers. &lt;br /&gt;} &lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;The function MemRead returns the value of a memory location, along with its version number:&lt;ul&gt;&lt;li&gt;If no other transaction accessed the location and started committing, then the current value resides in the memory location and its ownership record contains the version number.&lt;/li&gt;&lt;li&gt;If another transaction accessed the location and committed, the value is in the transaction’s newValue field and the version in the newVersion field.&lt;/li&gt;&lt;li&gt;If another transaction accessed the location and has started, but not finished committing, the value is stored in the transaction’s oldValue field and the version in its oldVersion&lt;/li&gt;&lt;/ul&gt;STMCommit first acquires the ownership records for all locations accessed by the transaction. If successful, STMCommit changes the transaction’s state to COMMITTED, copies the modified values to memory, and releases the ownership records. &lt;br /&gt;These three steps appear logically atomic to concurrent transactions because the committing transaction’s status changes atomically (and irrevocably) from ACTIVE to COMMITTED using an atomic read-modify-write operation. Once this change occurs, MemRead will return the updated value, even before the transaction copies value back to memory. &lt;code&gt;&lt;pre&gt;&lt;br /&gt;void STMCommit() { &lt;br /&gt;     // Only outermost nested transaction can commit. &lt;br /&gt;     if (AtomicAdd(ActiveTrans.nestingDepth, -1) != 0) { return; } &lt;br /&gt;     // A nested transaction already aborted this transaction. &lt;br /&gt;     if (ActiveTrans.status == TransactionStatus.ABORTED) { return; } &lt;br /&gt;       // Acquire ownership of all locations accessed by transaction. &lt;br /&gt;      int i; &lt;br /&gt;      for (i = 0; i &lt; ActiveTrans.entries.Size(); i++) { &lt;br /&gt;           TransactionEntry* te = ActiveTrans.entries[i]; &lt;br /&gt;           switch (acquire(te)) { &lt;br /&gt;               case TRUE: { continue; } &lt;br /&gt;               case FALSE: { &lt;br /&gt;                       ActiveTrans.status = TransactionStatus.ABORTED; &lt;br /&gt;                       goto releaseAndReturn; &lt;br /&gt;               } &lt;br /&gt;               case BUSY: { /* conflict resolution */ } &lt;br /&gt;           } &lt;br /&gt;      } &lt;br /&gt;       // Transaction commits. &lt;br /&gt;      ActiveTrans.status = TransactionStatus.COMMITTED; &lt;br /&gt;        //Copy modified values to memory. &lt;br /&gt;      for (i = 0; i &lt; ActiveTrans.entries.Size(); i++) { &lt;br /&gt;           TransactionEntry te = ActiveTrans.entries[i]; &lt;br /&gt;           *((STMWord*)te.loc) = te.newValue; &lt;br /&gt;      } &lt;br /&gt;   releaseAndReturn: // Release the ownership records. &lt;br /&gt;           for (int j = 0; j &lt; i; j++) { release(te); } &lt;br /&gt;   } &lt;br /&gt;   bool acquire(TransactionEntry* te) { &lt;br /&gt;           OwnershipRec orec = CASOwnershipRec(te.loc, te.oldVersion, &lt;br /&gt;                                        ActiveTrans); &lt;br /&gt;           if (orec.HoldsVersion()) &lt;br /&gt;               { return orec.val.version == te.oldVersion; } &lt;br /&gt;           else { &lt;br /&gt;               if (orec.val.trans == ActiveTrans) { return true; } &lt;br /&gt;               else { return BUSY; } &lt;br /&gt;      } &lt;br /&gt;   } &lt;br /&gt;&lt;br /&gt;void release(TransactionEntry* te) { &lt;br /&gt;   if (ActiveTrans.status == TransactionStatus.COMMITTED) { &lt;br /&gt;       CASOwnershipRec(te.loc, ActiveTrans, te.newVersion); &lt;br /&gt;   } else { &lt;br /&gt;       CASOwnershipRec(te.loc, ActiveTrans, te.oldVersion); &lt;br /&gt;   } &lt;br /&gt;} &lt;/pre&gt;&lt;/code&gt;STMValidate is a read-only operation that checks the ownership records for each location accessed by the current transaction, to ensure that they are still consistent with the version the transaction initially read: &lt;code&gt;&lt;pre&gt;&lt;br /&gt;bool STMValidate() { &lt;br /&gt;    for (int i = 0; i &lt; ActiveTrans.entries.Size(); i++) { &lt;br /&gt;        TransactionEntry* te = ActiveTrans.entries[i]; &lt;br /&gt;        OwnershipRec orec = FindOwnershipRec(te.loc); &lt;br /&gt;        if (orec.val.version != te.oldVersion) { return false; } &lt;br /&gt;    } &lt;br /&gt;    return true; &lt;br /&gt;} &lt;/pre&gt;&lt;/code&gt;STMWait can be used to implement a conditional critical region by suspending the transaction until its predicate should be reevaluated. It aborts the current transaction and waits until another transaction modifies a location accessed by the first transaction. It acquires ownership of the TransactionEntry accessed by the transaction, changes the transactions status to ASLEEP, and suspends the thread running the transaction. When another transaction updates one of these locations, it will conflict with the suspended transaction!!! &lt;br /&gt;The conflict manager should allow the active transaction to complete execution and then resume the suspended transaction, which releases its ownership records and then retries the transaction: &lt;code&gt;&lt;pre&gt;&lt;br /&gt;void STMWait() { &lt;br /&gt;    int I; &lt;br /&gt;    for (i = 0; i &lt; ActiveTrans.entries.Size(); i++) { &lt;br /&gt;          TransactionEntry* te = ActiveTrans.entries[i]; &lt;br /&gt;          switch (acquire(te)) { &lt;br /&gt;              case TRUE: { continue; } &lt;br /&gt;              case FALSE: { &lt;br /&gt;                   ActiveTrans.status = TransactionStatus.ABORTED; &lt;br /&gt;                   goto releaseAndReturn; &lt;br /&gt;              } &lt;br /&gt;              case BUSY: { /* conflict resolution */ } &lt;br /&gt;         } &lt;br /&gt;   } &lt;br /&gt;   // Transaction waits, unless in conflict with another transaction and &lt;br /&gt;   // needs to immediately re-execute. &lt;br /&gt;   ActiveTrans.status = TransactionStatus.ASLEEP; &lt;br /&gt;   SuspendThread(); &lt;br /&gt;   // Release the ownership records. &lt;br /&gt;   releaseAndReturn: &lt;br /&gt;       for (int j = 0; j &lt; i; j++) { release(te); } &lt;br /&gt;} &lt;/pre&gt;&lt;/code&gt;If two transactions share a location that neither one modifies, one transaction will be aborted, since the system does not distinguish read-only locations from modified locations. &lt;br /&gt;This performance issue is easily corrected. STMWrite can set a flag isModified in a transaction entry to record a modification of the location. STMCommit should acquire ownership of modified locations and validate unmodified locations! This introduces a new transaction status READ_PHASE. The transaction remains in this state until it commits. &lt;code&gt;&lt;pre&gt;&lt;br /&gt;void STMCommit() { &lt;br /&gt;   for (int i = 0; i &lt; ActiveTrans.entries.Size(); i++) { &lt;br /&gt;      TransactionEntry* te = ActiveTrans.entries[i]; &lt;br /&gt;      if (te.isModified) { &lt;br /&gt;         switch (acquire(te)) { &lt;br /&gt;            case TRUE: { continue; } &lt;br /&gt;            case FALSE: { &lt;br /&gt;               ActiveTrans.status = TansactionStatus.ABORTED; &lt;br /&gt;               goto releaseAndReturn; &lt;br /&gt;            } &lt;br /&gt;            case BUSY: { /* conflict resolution */ } &lt;br /&gt;         } &lt;br /&gt;      } &lt;br /&gt;   } &lt;br /&gt;   ActiveTrans.status = TransactionStatus.READ_PHASE; &lt;br /&gt;   for (int i = 0; i &lt; ActiveTrans.entries.Size(); i++) { &lt;br /&gt;      TransactionEntry* te = ActiveTrans.entries[i]; &lt;br /&gt;      if (!te.isModified) { &lt;br /&gt;         ValVersion vv = MemRead(te.loc); &lt;br /&gt;         if (te.oldVersion != vv.version) { &lt;br /&gt;         // Another transaction updated this location. &lt;br /&gt;            ActiveTrans.status = TransactionStatus.ABORTED; &lt;br /&gt;            goto releaseAndReturn; &lt;br /&gt;         } &lt;br /&gt;      } &lt;br /&gt;   } &lt;br /&gt;   // Transaction commits. Write modified values to memory. &lt;br /&gt;   ActiveTrans.status = TransactionStatus.COMMITTED; &lt;br /&gt;   for (int i = 0; i &lt; ActiveTrans.entries.Size(); i++) { &lt;br /&gt;       TransactionEntry te = ActiveTrans.entries[i]; &lt;br /&gt;       *((STMWord*)te.loc) = te.newValue; &lt;br /&gt;   } &lt;br /&gt;   // Release the ownership records. &lt;br /&gt;releaseAndReturn: &lt;br /&gt;       for (int j = 0; j &lt; i; j++) { release(te); } &lt;br /&gt;}&lt;/pre&gt;&lt;/code&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-8750704729236945628?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/8750704729236945628/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=8750704729236945628' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/8750704729236945628'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/8750704729236945628'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/03/wstm-harris-and-fraser-oopsla-2003.html' title='Reading 4. Word granularity STM'/><author><name>oleg</name><uri>http://www.blogger.com/profile/07810021966174359469</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12756278884698385423'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-1232761190537289002</id><published>2008-03-07T14:58:00.011+03:00</published><updated>2008-03-14T20:33:40.594+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='readings'/><title type='text'>Reading 3. Deferred STM</title><content type='html'>&lt;div style="text-align: right;"&gt;March 7, 2008&lt;/div&gt;&lt;br /&gt;&lt;h4&gt;Deferred Update STM Herlihy, Luchangco, Moir, and Scherer, PODC 2003&lt;/h4&gt;&lt;br /&gt;&lt;b&gt;DSTM Characteristics&lt;/b&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Obstruction freedom&lt;/li&gt;&lt;li&gt;Explicit contention manager, which  encapsulates the policy of resolving conflicts&lt;/li&gt;&lt;li&gt;Can release object, by reducing transaction readset&lt;/li&gt;&lt;/ul&gt;&lt;table align="center" border="1" width="70%"&gt;&lt;caption&gt;DSTM&lt;/caption&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th&gt;Strong or Weak Isolation&lt;/th&gt;&lt;th&gt;Weak&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Transaction granularity&lt;/th&gt;&lt;th&gt;Object&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Update&lt;/th&gt;&lt;th&gt;Deferred (cloned replacement)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Concurrency control&lt;/th&gt;&lt;th&gt;Optimistic&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Synchronization&lt;/th&gt;&lt;th&gt;Obstruction free&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Conflict detection&lt;/th&gt;&lt;th&gt;Early&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Incostistent reads&lt;/th&gt;&lt;th&gt;Validation&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Conflicts resolution&lt;/th&gt;&lt;th&gt;Explicit content manager&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Nested transaction&lt;/th&gt;&lt;th&gt;Flattened&lt;/th&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;A programmer must explicitly invoke library functions to create a transaction and to access shared objects. Transactions run on threads of a new class. The programmer must introduce and properly manipulate a container for each object involved in a transaction.&lt;br /&gt;Example of using DSTM:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;pre&gt;public bool insert(int v) {&lt;br /&gt;List newList = new List(v);&lt;br /&gt;&lt;b&gt;TM0bject&lt;/b&gt; newNode = new &lt;b&gt;TM0bject&lt;/b&gt;(newList);&lt;br /&gt;&lt;b&gt;TMThread&lt;/b&gt; thread = (&lt;b&gt;TMThread&lt;/b&gt;)Thread.currentThread();&lt;br /&gt;while (true) {&lt;br /&gt;thread.&lt;b&gt;beginTransaction&lt;/b&gt;();&lt;br /&gt;bool result = true;&lt;br /&gt;try {&lt;br /&gt;   List prevList = (List)this.first. &lt;b&gt;open(WRITE)&lt;/b&gt;;&lt;br /&gt;   List currList = (List)prevList.next. &lt;b&gt;open(WRITE)&lt;/b&gt;;&lt;br /&gt;   while (eurrList.value &lt;&gt; v) {&lt;br /&gt;       prevList = currList;&lt;br /&gt;       currList = (List)currList.next. &lt;b&gt;open(WRITE)&lt;/b&gt;;&lt;br /&gt;   }&lt;br /&gt;   if (currList.value == v) { result = false; }&lt;br /&gt;   else {&lt;br /&gt;       result = true;&lt;br /&gt;       newList.next = prevList.next;&lt;br /&gt;       prevList.next = newNode;&lt;br /&gt;   }&lt;br /&gt;} catch (Denied d) {}&lt;br /&gt;if (thread. &lt;b&gt;commitTransaction()&lt;/b&gt;) {&lt;br /&gt;   return result;&lt;br /&gt;}&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;The TMThread class extends the Java Thread class:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;pre&gt;class TMThread : Thread {&lt;br /&gt;  void beginTransaction();&lt;br /&gt;  bool commitTransaction();&lt;br /&gt;  void abortTransaction();&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Transaction references an object through a TMObject.The open operation prepares a TMObject to be manipulated by a transaction and exposes the underlying object to the code in the transaction. The actions that open performs depend on whether an object is open for reading or writing.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;pre&gt;class TMObject {&lt;br /&gt;  private class Locator {&lt;br /&gt;    public Transaction trans;&lt;br /&gt;    public Object oldVersion;&lt;br /&gt;    public Object newVersion;&lt;br /&gt;  }&lt;br /&gt;  TMObject(Object obj);&lt;br /&gt;  enum Mode { READ, WRITE };&lt;br /&gt;  Object open(Mode mode);&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;The current version of an object is found through the object’s Locator.&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;If the Locator does not contain a transaction, the current version is the original object (oldVersion).&lt;/li&gt;&lt;br /&gt;&lt;li&gt;If the Locator points to some transaction, we check it`s status:&lt;br /&gt;   &lt;ol&gt;&lt;br /&gt;       &lt;li&gt;&lt;b&gt;COMMITTED&lt;/b&gt;. The current version is the one modified by the&lt;br /&gt;           transaction (newVersion).&lt;br /&gt;       &lt;/li&gt;&lt;br /&gt;       &lt;li&gt;&lt;b&gt;ABORTED&lt;/b&gt;. The current version is the original object&lt;br /&gt;           (oldVersion).&lt;br /&gt;       &lt;/li&gt;&lt;br /&gt;       &lt;li&gt;&lt;b&gt;ACTIVE&lt;/b&gt;. Conflict! The contention manager must resolve the&lt;br /&gt;           conflict by aborting or delaying one of the transactions.&lt;br /&gt;Note: it`s only the place, where we ask   the contention manager! All other conflicts mean inconsistency and   we have no choise: we have to abort current transaction!&lt;br /&gt;       &lt;/li&gt;&lt;br /&gt;   &lt;/ol&gt;&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;       &lt;br /&gt;DSTM adds two levels of indirection to an object:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;READ&lt;/b&gt; - adds to current transaction read set pair (TMObject and currentVersion), then validate current transaction&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;pre&gt;// Record the TMObject and its current value (version) in transaction’s read table.&lt;br /&gt;curTrans.recordRead(this, currentVersion(locInfo));&lt;br /&gt;if (!curTrans.validate()) { throw new Denied(); }&lt;br /&gt;return version;&lt;/pre&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;WRITE&lt;/b&gt; - create new Locator, get current version of TMObject, clone it and validate&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;pre&gt;// Create a new Locator pointing to a local copy of the object and install it.&lt;br /&gt;Locator newLocInfo = new Locator();&lt;br /&gt;newLocInfo.trans = curTras;&lt;br /&gt;// Actually it is just a spin lock to ensure, that no one has modified current&lt;br /&gt;object`s locator&lt;br /&gt;do {&lt;br /&gt;  Locator oldLocInfo = locInfo;&lt;br /&gt;  // Note: We can get conflict in currentVersion&lt;br /&gt;  newLocInfo.oldVersion = currentVersion(oldLocInfo);&lt;br /&gt;  newLocInfo.newVersion = newLocInfo.oldVersion.clone();&lt;br /&gt;} while (CAS(locInfo, oldLocInfo, newLocInfo) != oldLocInfo);&lt;br /&gt;if (!trans.validate()) { throw new Denied(); }&lt;br /&gt;return newLocInfo.newVersion;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Validating the transaction’s consistency relies on the read set. DSTM compares each object entry in a transaction’s read set against the current version of the object (obtained by following the TMObject reference). If the objects differ, the transaction should abort since it is in inconsistent state.&lt;br /&gt;&lt;br /&gt;A transaction commits by validating its read set, and if that operation succeeds, by&lt;br /&gt;changing its status from ACTIVE to COMMITTED.&lt;br /&gt;&lt;br /&gt;Note: We don`t need modify objects in memory! This modification(ACTIVE -&gt; COMMITED) makes all of the transaction’s modified objects into the current version of the respective objects!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-1232761190537289002?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/1232761190537289002/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=1232761190537289002' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/1232761190537289002'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/1232761190537289002'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/03/deferred-stm.html' title='Reading 3. Deferred STM'/><author><name>oleg</name><uri>http://www.blogger.com/profile/07810021966174359469</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12756278884698385423'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-7425577893425085811</id><published>2008-03-02T13:59:00.005+03:00</published><updated>2008-03-02T14:15:51.668+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='side papers'/><category scheme='http://www.blogger.com/atom/ns#' term='readings'/><title type='text'>Clarification: Invalidation Policies</title><content type='html'>I skimmed through the original paper by &lt;a href="http://www.cs.rochester.edu/u/scott/papers/2006_TRANSACT_formal_STM.pdf"&gt;Michael Scott "Sequential Specification of Transactional Memory Semantics"&lt;/a&gt;, which introduced classification of invalidation policies (lazy, eager W-R, mixed and eager), that was a stumbling block at our last reading.&lt;br /&gt;&lt;br /&gt;Apparently our understanding of what was meant turns out to be correct: Scott introduces a notion of &lt;i&gt;transactional memory history&lt;/i&gt; as essentially a sequence of events which include reading and writing memory and commiting and aborting transactions (each event is annotated with a transaction), that he says that predicate &lt;i&gt;C(H,s,t)&lt;/i&gt; (where H is a history and s and t are transactions) is a &lt;i&gt;conflict function&lt;/i&gt; if C(H,s,t) satisfies certain rules (mainly asserting that non-overlapping transactions do not conflict), and then he classifies conflict functions into lazy, eager W-R, mixed or eager depending on what kind of histories particular conflict function classifies as a conflict.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-7425577893425085811?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/7425577893425085811/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=7425577893425085811' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/7425577893425085811'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/7425577893425085811'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/03/clarification-on-invalidation-policies.html' title='Clarification: Invalidation Policies'/><author><name>Mitya</name><uri>http://www.blogger.com/profile/07764279855846194407</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='02101617867838857180'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-752999612617694129</id><published>2008-03-02T10:53:00.005+03:00</published><updated>2008-03-02T11:31:38.934+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='side papers'/><title type='text'>Garbage Collection vs. Transactional Memory</title><content type='html'>Dan Grossman's paper &lt;a href="http://portal.acm.org/citation.cfm?id=1297105.1297080"&gt;&lt;span style="font-style: italic;"&gt;"The Trasactional Memory / Garbage Collection Analogy"&lt;/span&gt;&lt;/a&gt; argues that&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;span style="font-style: italic;"&gt;Transactional Memory is to shared-memory concurrency&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;as&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Garbage Collection is to memory management&lt;/span&gt;&lt;br /&gt;&lt;div style="text-align: justify; font-style: italic;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;Here is a summarizing list of similiarities:&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th style="text-align: center;"&gt;GC Term&lt;/th&gt;&lt;th style="text-align: center;"&gt;TM Term&lt;/th&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;td&gt;memory management&lt;/td&gt;&lt;td&gt;concurrency&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;dangling pointers&lt;/td&gt;&lt;td&gt;races&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;space exhaustion&lt;/td&gt;&lt;td&gt;deadlock&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;regions&lt;/td&gt;&lt;td&gt;locks&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;garbage collection&lt;/td&gt;&lt;td&gt;transactional memory&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;reachability&lt;/td&gt;&lt;td&gt;memory conflicts&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;nursery data&lt;/td&gt;&lt;td&gt;thread-local data&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;weak pointers&lt;/td&gt;&lt;td&gt;open nesting&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;I/O of pointers&lt;/td&gt;&lt;td&gt;I/O in trasactions&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;tracing&lt;/td&gt;&lt;td&gt;deferred update&lt;/td&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;td&gt;automatic reference counting&lt;/td&gt;&lt;td&gt;direct update&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;conservative collection&lt;/td&gt;&lt;td&gt;false memory conflicts&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;real-time collection&lt;/td&gt;&lt;td&gt;obstruction freedom&lt;/td&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;td&gt;liveness analysis&lt;/td&gt;&lt;td&gt;escape analysis&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;Be careful though: the analogy is a bad guide for &lt;i&gt;studying&lt;/i&gt; TM. I believe we should first understand all the nuances and problems of implementing TM in its own right, and only then can we think of connections to Garbage Collection.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;span style="font-style: italic;"&gt;Mitya has the full text for the article (I believe ACM copyright allows to make copies for classroom use)&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-752999612617694129?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/752999612617694129/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=752999612617694129' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/752999612617694129'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/752999612617694129'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/03/garbage-collection-vs-transactional.html' title='Garbage Collection vs. Transactional Memory'/><author><name>Mitya</name><uri>http://www.blogger.com/profile/07764279855846194407</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='02101617867838857180'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-3705604879359974104</id><published>2008-02-25T02:08:00.012+03:00</published><updated>2008-03-02T13:59:18.494+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='readings'/><title type='text'>Reading 2. Taxonomy and Implementation</title><content type='html'>&lt;div style="text-align: right;"&gt;February 29th, 2008&lt;/div&gt;&lt;br /&gt;&lt;h4&gt;Granularity&lt;/h4&gt;&lt;br /&gt;Object granularity/Word granularity&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Direct/Deferred Update&lt;/h4&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Deferred update&lt;/span&gt;: transactions modify private copies of objects, and copy to public space on commit&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Direct update&lt;/span&gt;: transactions modify objects in place, revert modifications on rollback.&lt;br /&gt;In STM, direct update appears to be faster.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Concurrency control&lt;/h4&gt;&lt;br /&gt;Conflict:&lt;ul&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Occurs&lt;/span&gt; when transactions perform conflicting operation on memory location&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Is detected&lt;/span&gt; when TM system is aware of that conflict&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Is resolved&lt;/span&gt; when TM system takes action to ensure correctness (delays or aborts transaction)&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;These three events happen in that order, but at different times.&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Pessimistic concurrency control&lt;/span&gt;: all three events happen at the same time.&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Optimistic concurrency control&lt;/span&gt;: TM system postpones detection and resolution.&lt;br /&gt;&lt;br /&gt;Progress Guarantees:&lt;ul&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Wait freedom&lt;/span&gt;: all threads contending over a set of objects make forward progress in finite steps&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Lock freedom&lt;/span&gt;: at least one thread contending over a set of objects make forward progress&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Obstruction freedom&lt;/span&gt;: thread makes progress in the absense of contention over shared objects&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Conflict detection&lt;/h4&gt;&lt;br /&gt;A conflict can be:&lt;br /&gt;&lt;ol type="a"&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Detected on open&lt;/span&gt;: when transaction declares its intent of accessing an object&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Detected on validation&lt;/span&gt;: at some point during transaction execution&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Detected on commit&lt;/span&gt;: extreme case of validation, just before transaction commits (essentially a must, unless all conflicts are detected on open)&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;Validation should happen either by value or by version number (latter avoids ABA problem)&lt;br /&gt;&lt;br /&gt;Early conflict detection may terminate the transaction that may have commited.&lt;br /&gt;(TB and TC conflict with TA over two different objects...)&lt;br /&gt;&lt;br /&gt;Late conflict detection discards more computation.&lt;br /&gt;&lt;br /&gt;How to detect conflicts: either &lt;span style="font-style: italic;"&gt;read/write sets&lt;/span&gt; (objects accessed by transaction; can be private or public) or &lt;span style="font-style: italic;"&gt;reader/writer sets&lt;/span&gt; (transactions accessing objects).&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Invalidation Policies&lt;/h4&gt;&lt;br /&gt;&lt;a href="http://www.cs.rochester.edu/u/scott/papers/2006_TRANSACT_formal_STM.pdf"&gt;See Michael Scott's paper for a formal discussion&lt;/a&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Lazy&lt;/span&gt;: TA and TB conflict if TA writes (an object), TB reads (the same object), and TA commits before TB&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Eager W-R&lt;/span&gt;: Lazy or TA writes, TB reads, but neither commit&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Mixed&lt;/span&gt;: Lazy or TA reads, TB reads and writes, neither commit&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Eager&lt;/span&gt;: Eager W-R or TB reads, TA writes, neither commits&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Lazy &amp;lt; Eager W-R &amp;lt; Eager&lt;br /&gt;Lazy &amp;lt; Mixed &amp;lt; Eager&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Doing something about conflicts&lt;/h4&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Validation&lt;/span&gt;: check the read set&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Invalidation&lt;/span&gt;: check the reader set&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Inconsistency toleration&lt;/span&gt;: allow transaction to continue in inconsistent state and recover from consequences (validate on throwing exception, timeout non-terminating loops &amp;amp;c)&lt;br /&gt;&lt;br /&gt;Particulary bad example:&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Thread 1:&lt;/span&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;ListNode res;&lt;br /&gt;atomic {&lt;br /&gt;  res = lHead;&lt;br /&gt;  if (lHead != null)&lt;br /&gt;    lHead = lHead.next;&lt;br /&gt;}&lt;br /&gt;use res several times&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Thread 2:&lt;/span&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;atomic {&lt;br /&gt;  ListNode n = lHead;&lt;br /&gt;  while (n != null) {&lt;br /&gt;    n.val++;&lt;br /&gt;    n = n.next;&lt;br /&gt;  }&lt;br /&gt;}&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;If read set is private (only visible to a thread that keeps it) and inconsistency is tolerated, it is possible that thread 1 will see modified value and then unmodified value of &lt;code&gt;res.val&lt;/code&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-3705604879359974104?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/3705604879359974104/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=3705604879359974104' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/3705604879359974104'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/3705604879359974104'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/02/reading-2-taxonomy-and-implementation.html' title='Reading 2. Taxonomy and Implementation'/><author><name>Mitya</name><uri>http://www.blogger.com/profile/07764279855846194407</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='02101617867838857180'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-7405512407269037799</id><published>2008-02-25T01:19:00.007+03:00</published><updated>2008-02-25T02:13:13.319+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='readings'/><title type='text'>Reading 1. Basic Concepts and Design Space</title><content type='html'>&lt;div style="text-align: right;"&gt;February 22nd, 2008&lt;/div&gt;&lt;br /&gt;&lt;h4&gt;Main Syntactic Construct&lt;/h4&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;atomic {&lt;br /&gt;  x.Bar();&lt;br /&gt;  y = x.Baz(); // (1)&lt;br /&gt;  y.Foo();&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;atomic {&lt;br /&gt;  y = null; // (2)&lt;br /&gt;}&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;TM system guarantees that code inside atomic blocks runs as if there are no other concurrent threads.&lt;br /&gt;In the example, there is no data race between (1) and (2).&lt;br /&gt;TM system should detect transaction memory conflicts and abort (and restart) one of the transactions.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Operational Semantics&lt;/h4&gt;&lt;br /&gt;Replace atomic with &lt;code&gt;synchronized(MasterLock)&lt;/code&gt;. The correct TM-system implementation should be equivalent.&lt;br /&gt;Real-life system should do better!&lt;br /&gt;&lt;br /&gt;TM is not a concurrent programming panacea:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;bool flagA = false;&lt;br /&gt;bool flagB = false;&lt;br /&gt;&lt;br /&gt;// Th1:&lt;br /&gt;atomic {&lt;br /&gt;  while(!flagA);&lt;br /&gt;  flagB = true;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;// Th2&lt;br /&gt;atomic {&lt;br /&gt;  flagA = true;&lt;br /&gt;  while(!flagB)&lt;br /&gt;}&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;Th1 and Th2 deadlock! If we put smaller atomics, they will work.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Transaction Properties In TM System&lt;/h4&gt;&lt;br /&gt;TM-guaranteed properties:&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Isolation &lt;/span&gt;- while transaction executes, no other transaction sees its changes, and vice versa.&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Atomicity&lt;/span&gt; - transaction either does all changes it does to memory, or appears not to execute at all.&lt;br /&gt;Not guaranteed:&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Consistency&lt;/span&gt; - cannot be specified independently of a particular program&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Exceptions&lt;/h4&gt;&lt;br /&gt;Exceptions thrown from inside atomic should commit transaction. Otherwise, complicated handling of exception object should be implemented.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Additional Features&lt;/h4&gt;&lt;br /&gt;&lt;code&gt;retry&lt;/code&gt; - implementing conditional variables&lt;br /&gt;&lt;pre&gt;&lt;code&gt;atomic {&lt;br /&gt;  if (buffer.IsEmpty()) retry;&lt;br /&gt;  var value = buffer.GetFirst();&lt;br /&gt;  ...&lt;br /&gt;}&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;Will only work if sufficient progress guarantees are provided by TM system (&lt;span style="font-style: italic;"&gt;wait freedom&lt;/span&gt; or &lt;span style="font-style: italic;"&gt;lock freedom&lt;/span&gt; - see next lecture)&lt;br /&gt;&lt;br /&gt;&lt;code&gt;orElse&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Weak Or Strong Isolation&lt;/h4&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Weak Isolation&lt;/span&gt;: transactions are only isolated from other transactions.&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Strong Isolation&lt;/span&gt;: transactions are isolated from non-transactional code too (as if all memory accesses outide programmer-written atomics are surrounded in small atomics too)&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Nested Transactions&lt;/h4&gt;&lt;br /&gt;Generally, when nested transaction commits, its changes are only seen by the parent transaction.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;flattened nested transactions&lt;/i&gt;: aborting nested aborts its parent&lt;br /&gt;&lt;i&gt;closed nested transactions&lt;/i&gt;: abroting nested only aborts itself.&lt;br /&gt;&lt;br /&gt;However, open nested transactions are useful.&lt;br /&gt;&lt;i&gt;open nested transactions&lt;/i&gt;: commit of nested open transaction is immediately seen by all.&lt;br /&gt;Reduces conflicts (e.g. &lt;code&gt;gensym()&lt;/code&gt;)&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Exceptions&lt;/h4&gt;&lt;br /&gt;Should commit transaction, otherewise non-trivial handling of exception object should be implemented.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-7405512407269037799?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/7405512407269037799/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=7405512407269037799' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/7405512407269037799'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/7405512407269037799'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/02/reading-1-basic-concepts-and-design.html' title='Reading 1. Basic Concepts and Design Space'/><author><name>Mitya</name><uri>http://www.blogger.com/profile/07764279855846194407</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='02101617867838857180'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4500206154953507248.post-8408369561440625421</id><published>2008-02-25T01:09:00.002+03:00</published><updated>2008-02-25T02:40:13.704+03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='administrativa'/><title type='text'>Intro</title><content type='html'>This is a blog for experimental Transactional Memory seminar at Math-Mech SPbSU.&lt;br /&gt;We meet every Friday at 9:30 at 3502.&lt;br /&gt;The idea is to read &lt;a href="http://www.morganclaypool.com/doi/abs/10.2200/S00070ED1V01Y200611CAC002"&gt;J.R. Larus, R.Rajwar &lt;/a&gt;&lt;span style="font-style: italic;"&gt;&lt;a href="http://www.morganclaypool.com/doi/abs/10.2200/S00070ED1V01Y200611CAC002"&gt;Transactional Memory&lt;/a&gt; &lt;/span&gt; and underlying papers along the way.&lt;br /&gt;(Mitya has an electronic version, Ilya Sergey has a paper copy)&lt;br /&gt;&lt;br /&gt;We'll post schedule and announcements here, as well as some sort of lecture notes.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4500206154953507248-8408369561440625421?l=tm-101.blogspot.com'/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://tm-101.blogspot.com/feeds/8408369561440625421/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=4500206154953507248&amp;postID=8408369561440625421' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/8408369561440625421'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4500206154953507248/posts/default/8408369561440625421'/><link rel='alternate' type='text/html' href='http://tm-101.blogspot.com/2008/02/intro.html' title='Intro'/><author><name>Mitya</name><uri>http://www.blogger.com/profile/07764279855846194407</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='02101617867838857180'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry></feed>