Debugging tips for threaded programs on Solaris
By Chris Quenelle on Jun 01, 2007
Phil Harmon wrote a blog entry over a year ago (Solaris Threads Tunables) where he mentioned a list of tunable parameters that you can use to fiddle around with the implementation of Solaris libthread. You can fine-tune the spin-lock timeouts and other timing details. But one of the flags that he mentioned is NOT related to tuning libthread. It's more related to debugging your program! Someone on our internal dbx-interest alias asked why their program (which had a bug) was acting differently when run under dbx, and the answer turns out to be related to a "sync tracking" flag that dbx turns on by default. That flag turns on somewhat stricter checking for mutex bugs.
Anyway, it turns out that if you set the environment variable _THREAD_ERROR_DETECTION to 1 or 2, you get an extra level of error checking inside libthread. A value of 1 produces warning messages, and 2 produces warning messages plus a core file for inspection.
The messages look like this:
    *** _THREAD_ERROR_DETECTION: lock usage error detected ***
    mutex_lock(0x8047d50): calling thread already owns the lock
    calling thread is 0xfeea2000 thread-id 1
Most of the implementation is in libc/port/threads/assfail.c.