The (Secret) Agent LWP

One of the most powerful but least understood aspects of the Solaris /proc implementation is what's known as the 'agent lwp'. The agent is a special thread that can be created on demand by external processes. There are a few limitations: only one agent thread can exist in a process, the process must be fully stopped when the agent is created, and the agent cannot perform fork(2) or exec(2) calls. See proc(4) for all the gory details. So what's its purpose?

Consider the pfiles command. It's pretty easy to get the number of file descriptors for the process, and it's pretty easy to get their path information (in Solaris 10). But there's a lot of information there that can only be found through stat(2), fcntl(2), or getsockopt(3SOCKET). In this situation, we generally have three choices:

  1. Create a new system call. System calls are fast, but there aren't many of them, and they're generally reserved for something reasonably important. Not to mention the duplicated code and hairy locking problems.
  2. Expose the necessary information through /proc. This is marginally better than the above. We still have to write a good chunk of kernel code, but we don't have to dedicate a system call for it. On the other hand, we have to expose a lot of information through /proc, which means it's a public interface and will have to be supported for eternity. This is only done when we believe the information is useful to developers at large.
  3. Using the agent lwp, execute the necessary system call in the context of the controlled process.

For debugging utilities and various tools that are not performance critical, we typically opt for the third option above. Using the agent LWP and a borrowed stack, we do the following: First, we reserve enough stack space for all the arguments and throw in the necessary syscall instructions. We use the trace facilities of /proc to set the process running and wait for it to hit the syscall entry point. We then copy in our arguments, and wait for it to hit the syscall exit point. We then extract any altered values that we may need, clean up after ourselves, and get the return value of the system call.
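Each of these /proc steps is driven by writing a control message to the target's control file. As described in proc(4), a control message is a long naming the operation, followed by the operand, if any. The following is only a minimal, portable sketch of that framing, not libproc's implementation: pack_ctl_msg() is a hypothetical helper, and DEMO_PCAGENT is a made-up placeholder value standing in for the real PCAGENT constant from <procfs.h>, whose operand is a prgregset_t holding the agent's initial registers.

```c
#include <stddef.h>
#include <string.h>

/*
 * Assumption: placeholder opcode for illustration only. On Solaris the
 * real value is the PCAGENT constant from <procfs.h>, and the operand
 * is a prgregset_t describing the agent's initial register state.
 */
#define	DEMO_PCAGENT	1000L

/*
 * Hypothetical helper: pack one /proc control message, which is a long
 * opcode followed by the raw operand bytes. Returns the total message
 * length, or 0 if the buffer of size `cap` is too small to hold it.
 */
size_t
pack_ctl_msg(unsigned char *buf, size_t cap, long op,
    const void *operand, size_t operand_len)
{
	size_t total = sizeof (long) + operand_len;

	if (total > cap)
		return (0);

	/* The opcode always comes first. */
	memcpy(buf, &op, sizeof (long));

	/* The operand, if any, follows immediately after the opcode. */
	if (operand_len > 0)
		memcpy(buf + sizeof (long), operand, operand_len);

	return (total);
}
```

On a real Solaris system the packed buffer would then be handed to write(2) on the process's ctl file, and the same framing would carry the other operations used in the sequence above (tracing syscall entry and exit, setting the agent running, waiting for it to stop). The register contents, the borrowed stack, and the error handling are where most of those 450 lines go, and none of that is shown here.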

If all of this sounds complicated, it's because it is. When you throw everything into the mix, it's about 450 lines of code to perform a basic system call, with many subtle factors to consider. To make our lives easier, we have created libproc, which includes a generic function to make any system call. libproc is extremely powerful, and provides many useful functions for dealing with ELF files and the often confusing semantics of /proc. Things like stepping over breakpoints and watchpoints can be extremely tricky when using the raw proc(4) interfaces. Unfortunately, the libproc APIs are private to Sun. Hopefully, one of my first tasks after the Solaris 10 crunch will be to clean up this library and present it to the world.

There are those among us who have other nefarious plans for the agent LWP. It's a powerful tool that I find interesting (and sometimes scary). Hopefully we can make it more accessible in the near future.

Comments:

Hi there, I've been reading quite a few posts by you and other Solaris kernel developers. Overall - quite interesting reading with a bit of injected opinion, but that's ok. One thing that strikes me is the widespread usage of non-orthogonal (dirty? ;-)) solutions by you guys. The other day I was reading a DTrace paper. And there you say that you had to teach alot of unrelated subsystems (like linker about calls to __dtrace-prefixed functions or paging about D virtual machine) about DTrace to make the whole thing work. In this post you are talking about "agent LWP", then "throw in the necessary syscall instructions" and "set the process running". This whole approach looks very fragile to me. I would really like to hear what you guys think about this. Maybe you could make a post where you tell us about orthogonality & consistency in Solaris kernel design. Thanks, -boris.

Posted by boris on June 29, 2004 at 06:26 AM PDT #

Boris - good points for sure. The answer is that it's not fragile, just difficult, and it requires a great deal of knowledge about the inner workings of the system. That's why we like to do the dirty work for you. In this particular case the dirty work (encapsulated in libproc) is still private to Sun, but we'll work on getting that public real-soon-now. Certainly some of the basic interfaces (/proc, /dev/kmem, device ioctls) are difficult to use, so we build abstractions on top of them. DTrace, for example, is an incredibly complex system, but Bryan, Mike, and Adam have put a lot of work into making sure it all "just works". You may not _need_ to know how statically defined DTrace probes actually work, but who's to say it's not interesting ;-)

Posted by Eric Schrock on June 29, 2004 at 08:53 AM PDT #

Boris, you raise two interesting points which I'd like to address briefly. You mention that DTrace, for example, has its hooks into some seemingly orthogonal parts of the system. While that's true in some limited examples, we try to minimize the pollution of unrelated subsystems. That said, we also take advantage of the fact that we _can_, for example, modify the kernel runtime linker to add some DTrace magic or add probe points to libc so you can trace user-land locking behavior (more on that in my blog soon). Your second point is about fragility. These systems, DTrace and the /proc agent, aren't fragile, but, as Eric mentions, their implementation can be complex. The agent LWP, for example, involved some subtle implementation details, but allows for literally arbitrary manipulation of user-process state from an external debugger. In this case, the complexity is far outweighed by the benefits.

Posted by Adam Leventhal on June 30, 2004 at 02:40 PM PDT #

Eric and Adam have already addressed your point more than sufficiently, but I can't help responding to one of your assertions: it's not at all accurate to say that we had to "teach alot [sic] of unrelated subsystems" about DTrace. Quite the contrary: thanks to some well placed knowledge in a few key subsystems (like the linker), we can instrument text that we've never seen before -- allowing for a much more robust system, not a more fragile one...

Posted by Bryan Cantrill on June 30, 2004 at 04:11 PM PDT #

Eric, Adam, Bryan - thanks for your replies. When I said fragile I didn't mean that the code doesn't behave robustly from the user perspective - I have no doubts that you are capable of tackling things there. I was talking more from the design perspective. One of the benefits of orthogonality and consistency in the design is that it is relatively easy to dissect and study parts in isolation. And I think that's exactly why the design principles of DTrace won't work for, let's say, the Linux kernel - cleanness (in the conceptual sense) of the code is the first priority since there are hundreds of people working on it. I think it's a good idea to start thinking about such issues since you are planning to free the Solaris code in the hope that people will help develop it.

Posted by boris on July 01, 2004 at 01:21 AM PDT #

Whoever came up with this idea of the agent LWP needs to get a raise. The only problem is deducing how one actually creates this agent LWP and accesses it. After applying RTFM to the proc(4) manpage it seems like one sends control messages (PCAGENT) to an individual lwp's lwpctl file by means of write(). So, we have an idea how to create and invoke the agent LWP, but what code will the agent be running? Is it existing code from the running process? Could you provide a bare skeleton example? I'm apparently missing a simple point here. Thanks in advance.. D.

Posted by David Lange on July 27, 2004 at 02:12 AM PDT #

Yes, the agent LWP is definitely hard to use. Internally, we use libproc to do all the dirty work, which we intend to publicize sometime post-S10 for general consumption.

In the meantime, I'll put together an example for a future blog post.

Posted by Eric Schrock on July 27, 2004 at 07:37 AM PDT #

... even if we can't get libproc out the door soon, you'll have the libproc source code in your hands pretty soon after we open source Solaris. Stay tuned.

Posted by Adam Leventhal on August 10, 2004 at 12:42 PM PDT #
