When to add a membar (an example)

I was recently in a discussion on one of the OpenSolaris lists about when to use the volatile keyword, and when it is necessary to use membars.

So volatile is a hint to the compiler that every read of the variable must load it from memory and every write must store it straight back to memory, rather than keeping the value in a register. What it does not do is tell the hardware anything. So the application can perform the store, but that store may not be immediately visible to the rest of the system. Most of the time this is not a problem: so long as the store is visible to the processor on which the thread is executing, it's fine. Variability in when the store becomes visible to other processors may also be fine. There is one clear situation where the ordering of store operations could be a problem, and that's unlocking mutexes.

The problem here is best illustrated by the following scenario. I lock some data structure, then store new values into it, then unlock the structure. Immediately another thread comes along and uses the values in that structure. Not an uncommon situation. Unlocking a mutex is often just a case of storing a value (of zero) into the mutex structure. And here's the potential problem.

On some architectures with weaker memory ordering there is no guarantee that other processors see the stores in the same order that they are performed. So if you have Store A followed by Store B, it may be possible for other processors to observe the change in the value of B before they see the change in the value of A. In the case of mutex unlock, the store of B would be the action that unlocked the mutex, enabling other threads to access the variable A... and there could be problems if they see the old value of A.

The solution to this is to put a membar in before the store that unlocks the mutex. You can see this happening in the OpenSolaris code:

    /*
     * lock_clear(lp)
     *	- clear lock.
     */
    	ENTRY(_lock_clear)
    	membar	#LoadStore|#StoreStore
    	retl
    	  clrb	[%o0]
    	SET_SIZE(_lock_clear)

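The membar ensures that all earlier loads have completed (#LoadStore) and all pending stores are visible to other processors (#StoreStore) before the store that releases the lock becomes visible to them. Note that the clrb sits in the delay slot of the retl, so it still executes after the membar and before control returns to the caller.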
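For readers more at home in C than in SPARC assembly, here is a minimal sketch of the same idea using C11 atomics. It is not the OpenSolaris implementation: the byte_lock and shared_data types and the lock_try and update helpers are hypothetical names made up for illustration. The release fence plays roughly the role of the membar #LoadStore|#StoreStore, and the relaxed store of zero plays the role of the clrb.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical one-byte lock: 0 = free, 1 = held. */
typedef struct {
    atomic_uchar held;
} byte_lock;

/* Hypothetical structure protected by the lock. */
typedef struct {
    byte_lock lock;
    int value;
} shared_data;

/* Try once to take the lock. Acquire ordering keeps the accesses
 * that follow from being moved ahead of the lock acquisition. */
static bool lock_try(byte_lock *lp)
{
    return atomic_exchange_explicit(&lp->held, 1, memory_order_acquire) == 0;
}

/* Release the lock: fence first, then the store that clears the byte.
 * The release fence orders all earlier loads and stores before the
 * clearing store, which is the job the membar does in _lock_clear. */
static void lock_clear(byte_lock *lp)
{
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&lp->held, 0, memory_order_relaxed);
}

/* Store A (the data) followed by Store B (the unlock).
 * Returns false without touching the data if the lock is already held. */
static bool update(shared_data *d, int v)
{
    assert(d != 0);
    if (!lock_try(&d->lock))
        return false;
    d->value = v;          /* Store A: the protected data */
    lock_clear(&d->lock);  /* Store B: the unlock, ordered after A */
    return true;
}
```

A thread that later succeeds in lock_try on the same lock is then guaranteed to see the value written by update. In ordinary code a single atomic_store_explicit with memory_order_release would usually be written instead of the separate fence and relaxed store; the two-step form is used here only because it mirrors the membar followed by the clrb.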

Comments:

Why is clrb used instead of sub %o0, %o0?

I always thought that sub is faster?

Posted by UX-admin on March 27, 2009 at 03:44 AM PDT

The clrb is actually stb %g0,[%o0]. It's clearing the byte at that address, not clearing the register. That would be mov %g0,%o0, or as you suggest sub %o0,%o0. Generally on SPARC, there are some instructions (like clr) which are basically alternative names for particular versions of more general instructions.

Regards,

Darryl.

Posted by Darryl Gove on March 27, 2009 at 03:52 AM PDT

About

Darryl Gove is a senior engineer in the Solaris Studio team, working on optimising applications and benchmarks for current and future processors. He is also the author of the books:
Multicore Application Programming
Solaris Application Programming
The Developer's Edge