Malthusian Locks

Malthusian Locks appears in EuroSys 2017. An extended version is in arxiv.Abstract: Applications running in modern multithreaded environments are sometimes overthreaded. The excess threads do not improve performance, and in fact may act to degrade performance via scalability collapse, which can manifest even when there are fewer ready threads than available cores. Often, such software also has highly contended locks. We leverage the existence of such locks by modifying the...

Wednesday, March 15, 2017 | General | Read More

Transactional Pointers: Experiences with HTM-Based Reference Counting in C++

Transactional Pointers: Experiences with HTM-Based Reference Counting in C++ by Maria Carpen-Amarie , Dave Dice, Gaƫl Thomas and Pascal Felber appeared in NETYS 2016. Apologies -- the paper is paywalled.AbstractThe most popular programming languages, such as C++ or Java, have libraries and data structures designed to automatically address concurrency hazards in order to run on multiple threads. In particular, this trend has also been adopted in the memory management domain....

Tuesday, September 20, 2016 | General | Read More

An Inflatable SeqLock

The following is inspired by some discussions with John Rose on how we might implement atomic access to value type "tuples" for subsequent JDK releases. As I understand it, the basic idea is to have something close C++'s atomic<POD> facility. Since reading (loading) a value type is expected to be the dominant mode of access, we'd like to use something like SeqLocks so that readers don't have to write to any shared locations. (See also). But at the same time we'd like writers...

Wednesday, August 10, 2016 | General | Read More

Mitigating the Java nanoTime coherence hotspot

Java's nanoTime() API guarantees a monotonic (really, non-retrograde) relative clock source. It's also expected to be causal in the following sense. Say thread A calls nanoTime(), gets value V, and then stores V into memory. Some other thread B then observes A's store of V, then calls nanoTime() itself and gets W. We expect W should be greater than or equal to V.In an ideal world the clock sources underlying nanoTime() would be causal. But that's not always the case on all...

Monday, July 18, 2016 | General | Read More

Preemption tolerant MCS locks

A simple test-and-set based spin lock is a reasonable choice when contention is nil or low. Lock handover is relatively efficient and there's no need to maintain a list of waiting threads, yielding a simple design. Under higher contention, queue-based such as MCS often provide better throughput and be a better choice. Classic MCS locks are strictly FIFO queue-based locks that use local spinning. (Clever lock implementations can try to adaptively select the best algorithm...

Saturday, July 16, 2016 | General | Read More

Fast Non-intrusive Memory Reclamation for Highly-Concurrent Data Structures

Fast Non-intrusive Memory Reclamation for Highly-Concurrent Data Structures by Dave Dice, Maurice Herlihy and Alex Kogan, in ISMM 2016Abstract:Current memory reclamation mechanisms for highly-concurrent data structures present an awkward trade-off. Techniques such as epoch-based reclamation perform well when all threads are running on dedicated processors, but the delay or failure of a single thread will prevent any other thread from reclaiming memory. Alternatives such as...

Wednesday, May 4, 2016 | General | Read More

Integrated Cloud Applications & Platform Services