Following up on The Pitfalls of lazy subscription
, I thought I'd provide a simple case that illustrates where transactional lock elision (TLE) with lazy subscription can fail. I've managed to reproduce the failure in-house on an i7-4770 (haswell).
Say we have a ring-buffer that’s used for logging. It just records the last 16 messages posted. (This is kind of common for “black box flight recorders” that appear in kernels and the JVM). The memory layout is : int pos; intptr_t RingBuffer; volatile int Lock. We'll also assume that we compile with code at low optimization levels, so accesses to "pos" result in loads and stores with no caching in registers. “Lock” is a simple test-and-test-and-set lock possibly augmented with TLE and late subscription.
The critical section is :
auto index = pos ++ // load and store pos
RingBuffer[index] = m
if pos == 16 : pos = 0 // reload pos
A number of threads just loop, calling Post(0). Some run take the lock, others use TLE with late subscription. It's possible for one of threads using TLE with late subscription to observe an intermediate and transient "pos" value of 16 by virtue of some thread that holds "Lock" and has incremented "pos" from 15 to 16, but not yet reset the variable to 0. The thread using TLE that observed "pos" at 16 will then overwrite "Lock" with 0, so the late subscription check succeeds and the transaction inadvertently commits. This is a case of "wrongful commit" via lock corruption.
Characterizing which critical sections remain safe under TLE with late subscription remains an interesting research topic. But in the general case it's unsafe.
It's worth pointing out that for weakly ordered processors, the load or loads that constitute the classic early subscription check must have proper fencing. None is needed on x86, of course.