A simple lazy subscription pathology

Following up on The Pitfalls of lazy subscription, I thought I'd provide a simple case that illustrates where transactional lock elision (TLE) with lazy subscription can fail. I've managed to reproduce the failure in-house on an i7-4770 (haswell).

Say we have a ring-buffer that’s used for logging. It just records the last 16 messages posted. (This is kind of common for “black box flight recorders” that appear in kernels and the JVM). The memory layout is : int pos; intptr_t RingBuffer[16]; volatile int Lock. We'll also assume that we compile with code at low optimization levels, so accesses to "pos" result in loads and stores with no caching in registers. “Lock” is a simple test-and-test-and-set lock possibly augmented with TLE and late subscription.

The critical section is :



Post(m) :
Acquire(&Lock)
auto index = pos ++ // load and store pos
RingBuffer[index] = m
if pos == 16 : pos = 0 // reload pos
Release(&Lock)

A number of threads just loop, calling Post(0). Some run take the lock, others use TLE with late subscription. It's possible for one of threads using TLE with late subscription to observe an intermediate and transient "pos" value of 16 by virtue of some thread that holds "Lock" and has incremented "pos" from 15 to 16, but not yet reset the variable to 0. The thread using TLE that observed "pos" at 16 will then overwrite "Lock" with 0, so the late subscription check succeeds and the transaction inadvertently commits. This is a case of "wrongful commit" via lock corruption.

Characterizing which critical sections remain safe under TLE with late subscription remains an interesting research topic. But in the general case it's unsafe.

It's worth pointing out that for weakly ordered processors, the load or loads that constitute the classic early subscription check must have proper fencing. None is needed on x86, of course.

Comments:

Hey Dave,

From your example, it would seem that the issues with lazy subscription are rooted in a TLE execution viewing the inconsistent state of a lock execution that was intended to be atomic. Is it true that these issues could be mitigated by store buffering the lock execution? Publishing only the consistent state just before releasing the lock? Forgive me if this already covered in your papers or is plain wrong. :)

Posted by Tim Merrifield on August 06, 2014 at 06:25 PM EDT #

Hi Tim,

If you were to buffer, I think you'd need all the writes to appear atomically. IIRC the IBM power8 has "ROT" -- rollback only transactions -- the can buffer writes, but I don't think they guarantee all the writes are made visible as a unit. I think Adam Morrison et al had a paper in WTTM in PODC about using ROT.

Regards, -Dave

Posted by guest on August 07, 2014 at 12:14 PM EDT #

Post a Comment:
  • HTML Syntax: NOT allowed
About

Dave is a senior research scientist in the Scalable Synchronization Research Group within Oracle Labs : Google Scholar.

Search

Categories
Archives
« July 2015
SunMonTueWedThuFriSat
   
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
 
       
Today