I've been working for ages on how to resolve an issue reported as "automountd hangs when using executable automount maps". This is logged as bug 4522909.
The problem is that when automountd attempts to do a mount, it triggers a lookup on the mountpoint. This lookup is done by another thread in automountd. While we're waiting for that to complete we call auto_wait4mount(), which in turn blocks all signals by calling sigintr(). This also makes the thread unstoppable by incrementing lwp_nostop.
As this is an executable map, the "other thread" has to fork1() in order to run the map. This in turn tries to stop all threads in the process to get them into a known state before forking the one thread we care about. As the mount thread is unstoppable, this never completes.
The fix was to allow a thread to be stopped, even if lwp_nostop is set, if the thread stopping it is in the same process. However, this has shown up a couple of mistaken assumptions in NFS land, which mean that more work is needed there to allow an RFS call to be restarted (6306343).
This is obviously a bit of a pain, as the changes required to fix 6306343 may be quite large and too risky for a patch, so an alternative approach is needed.
After a good amount of discussion we concluded the lowest risk solution was to create a door server for the automountd process to talk to. This door server handles all requests to fork the process to do an exec. As the fork now happens in the door server process rather than in the multithreaded automountd that handles the lookup, nothing has to stop the blocked mount thread and the deadlock is avoided.
Now if you read the door_call man page, it talks extensively about attaching the door server's file descriptor to a file, as there is an assumption that the fd is somewhere in the file name space. If it is, then anyone with sufficient privilege can write to it and you can end up with all sorts of rubbish written in. However, when you fork(), the child inherits the fds of the parent, so the simple solution is to have the automountd process set up the door server itself before it becomes multithreaded, and then create a child to behave like the old automountd did, but with calls to the door server to get the fork()/exec() stuff to work. Hence you get two automountd processes.
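The shape of that arrangement can be sketched in portable terms. This is not the actual patch code, and it does not use doors at all: it is a minimal sketch in Python, assuming a socketpair stands in for the door and a single-threaded helper process stands in for the door server. The names ExecHelper, _helper_loop and the length-prefixed JSON protocol are all made up for illustration. What it does show is the ordering that matters: the helper is created while the process is still single-threaded, the descriptor is inherited rather than placed in the file name space, and worker threads never fork themselves.

```python
import json
import os
import socket
import struct
import subprocess
import threading


def _send_msg(sock, payload):
    """Send one message as a 4-byte length followed by the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def _recv_exact(sock, n):
    """Read exactly n bytes, or return None if the peer closed the socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None
        buf += chunk
    return buf


def _recv_msg(sock):
    """Receive one length-prefixed message, or None on end of stream."""
    header = _recv_exact(sock, 4)
    if header is None:
        return None
    (length,) = struct.unpack("!I", header)
    return _recv_exact(sock, length)


def _helper_loop(sock):
    """Single-threaded helper: the only place that forks and execs."""
    while True:
        request = _recv_msg(sock)
        if request is None:  # the client closed its end, so we are done
            break
        argv = json.loads(request.decode())
        try:
            done = subprocess.run(argv, stdout=subprocess.PIPE,
                                  stderr=subprocess.DEVNULL)
            reply = {"status": done.returncode,
                     "output": done.stdout.decode(errors="replace")}
        except OSError:  # e.g. the program does not exist
            reply = {"status": 127, "output": ""}
        _send_msg(sock, json.dumps(reply).encode())


class ExecHelper:
    """Hypothetical stand-in for the door server (assumption, not the real fix).

    Construct it BEFORE starting any threads, so the fork below happens
    while the process is still single-threaded.
    """

    def __init__(self):
        parent_end, child_end = socket.socketpair()
        pid = os.fork()
        if pid == 0:
            # Helper process: keep only its own end of the socketpair.
            parent_end.close()
            try:
                _helper_loop(child_end)
            finally:
                os._exit(0)
        child_end.close()
        self._sock = parent_end
        self._pid = pid
        self._lock = threading.Lock()  # one request on the wire at a time

    def run(self, argv):
        """Ask the helper to run argv; safe to call from any thread."""
        with self._lock:
            _send_msg(self._sock, json.dumps(argv).encode())
            reply = _recv_msg(self._sock)
        if reply is None:
            raise RuntimeError("exec helper exited unexpectedly")
        return json.loads(reply.decode())

    def close(self):
        """Shut the helper down and reap it."""
        self._sock.close()
        os.waitpid(self._pid, 0)


if __name__ == "__main__":
    helper = ExecHelper()  # set up before any threads exist
    worker = threading.Thread(
        target=lambda: print(helper.run(["sh", "-c", "echo hello"])))
    worker.start()
    worker.join()
    helper.close()
```

The lock serialises requests purely to keep the sketch short. A real door server gets a thread per call from the doors framework, so this says nothing about how the patch handles concurrent map executions.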
Unfortunately this is all in the patch releases as it isn't needed for OpenSolaris, so you'll never see my excellent code, but I still thought it was worth writing it up.
I'll fill in the patch versions as they come out. So far we have:
108994-56 SunOS 5.8_x86: LDAP2 client, libc, libthread and libnsl libraries patch
108993-56 SunOS 5.8: LDAP2 client, libc, libthread and libnsl libraries patch
117468-12 SunOS 5.9_x86: nfs patch
113318-26 SunOS 5.9: nfs patch
T118833-18 SunOS 5.10: Kernel Update
T118855-15 SunOS 5.10_x86: Kernel Update