The FBT provider, OpenSolaris source, and fun with the environment

Now that OpenSolaris is out there, it's quite a bit easier for folks to use DTrace's FBT provider. FBT provides "function boundary tracing": it has probes for the entry and exit of almost every function in the kernel. This is amazingly powerful and flexible, but it also makes FBT hard to use: with over 20,000 functions on a typical Solaris box, it's very hard to know where to start, especially without access to the source code.

With OpenSolaris, the source code is available. So to illustrate how you can use this newly available information to get something done, I thought I'd use a classic question: How can I examine and muck with the environment?

To start off, we need a plan of attack. DTrace doesn't have any way of looping over data structures, so if you want to walk one, you typically find a place where the kernel is already doing so and use probes there to sneak a peek at the data as it goes by. The initial environment for a process is set up in the exec(2) system call, so let's see if we can find where the kernel reads in the data from the user process.

Looking at the source, exece() calls exec_common(), which is the main workhorse. There doesn't seem to be any direct mucking of the environment, but there is:

    	ua.fname = fname;
    	ua.argp = argp;
    	ua.envp = envp;

    	if ((error = gexec(&vp, &ua, &args, NULL, 0, &execsz,
    	    exec_file, p->p_cred)) != 0) {

Now we could continue to track all of this down, but it's much easier to just search usr/src/uts for references to the envp symbol. That leads us to stk_copyin(), the routine responsible for copying in everything which is needed on the stack. Here's the environment processing loop:

    	if (envp != NULL) {
    		for (;;) {
    			if (stk_getptr(args, envp, &sp))
    				return (EFAULT);
    			if (sp == NULL)
    				break;
    			if ((error = stk_add(args, sp, UIO_USERSPACE)) != 0)
    				return (error);
    			envp += ptrsize;
    		}
    	}
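
For readers who don't live in the kernel, here is a rough user-space analogue of that loop in C. It is only a sketch: walk_env() and its visit callback are made-up names, a plain pointer dereference stands in for stk_getptr(), and the callback stands in for stk_add().

```c
#include <stddef.h>

/*
 * User-space analogue of the kernel loop (illustration only):
 * walk a NULL-terminated array of string pointers and hand each
 * string to a callback.  Returns the number of entries visited.
 */
size_t
walk_env(char *const *envp, void (*visit)(const char *))
{
	size_t n = 0;

	if (envp == NULL)
		return (0);

	for (;;) {
		const char *sp = *envp;	/* role played by stk_getptr() */

		if (sp == NULL)
			break;
		if (visit != NULL)
			visit(sp);	/* role played by stk_add() */
		n++;
		envp++;			/* the kernel adds ptrsize instead */
	}
	return (n);
}
```

The shape is the point: one pointer fetch per entry, then one call that consumes the string. Those are the two places the script hooks.
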
Without even knowing what stk_getptr() and stk_add() do, we now have enough information to write a D script to get the environment of a process as it exec()s. Here's the basic outline:
  1. First, use fbt::stk_copyin:entry to stash a copy of the envp pointer.
  2. Second, use fbt::stk_getptr:entry to watch for reads from the stored envp address.
  3. Third, use fbt::stk_add:entry and fbt::stk_add:return to print out the environment.
  4. Lastly, use fbt::stk_copyin:return to clean up.
And here's our first script:
#!/usr/sbin/dtrace -s

#pragma D option quiet

fbt::stk_copyin:entry
{
	self->envp = (uintptr_t)args[0]->envp;
}

fbt::stk_getptr:entry
/ self->envp != 0 /
{
	/* check if we're looking at envp or envp+1 */
	self->on = ((arg1 - self->envp) <= sizeof (uint64_t));

	/* update envp if we're on */
	self->envp = self->on ? arg1 : self->envp;
}

fbt::stk_add:entry
/ self->on && args[2] == UIO_USERSPACE /
{
	self->ptr = arg1;
}

fbt::stk_add:return
/ self->ptr != 0 /
{
	printf("%.79s\n", copyinstr(self->ptr));
	self->ptr = 0;
	self->on = 0;
}

fbt::stk_copyin:return
{
	self->envp = 0;
	self->on = 0;
}
Note that we delay the copyinstr() of stk_add()'s second argument until fbt::stk_add:return. This is because DTrace cannot fault in pages from probe context; if a probe tries to copyinstr() an address which has not yet been touched, you'll get a runtime error like:
dtrace: error on enabled probe ID 4 (ID 12535: fbt:genunix:stk_add:entry):
invalid address (0x1818000) in action #1 at DIF offset 28
By waiting until the return probe, we avoid this problem; we know that the kernel just touched the page to read in its copy.
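
One more detail in the script above is the "envp or envp+1" test in the fbt::stk_getptr:entry clause. Here is a small C sketch of that comparison, assuming the subtraction is carried out in unsigned 64-bit arithmetic (which is what C's usual conversions give you when one operand is a uint64_t). The function name in_envp_window() is made up for illustration.

```c
#include <stdint.h>

/*
 * Returns 1 when addr is the tracked envp slot or at most one
 * 64-bit pointer past it, and 0 otherwise.  The subtraction is
 * unsigned, so an addr below envp wraps around to a huge value
 * and fails the test; no separate lower-bound check is needed.
 */
int
in_envp_window(uint64_t addr, uint64_t envp)
{
	return ((addr - envp) <= sizeof (uint64_t));
}
```

Using the size of a 64-bit pointer as the bound lets the same test cover both 32-bit processes (slots 4 bytes apart) and 64-bit ones (8 bytes apart).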

Now, looking at the environment is fun, but it would be even more interesting to change the environment of a process while it is being execed. This requires a bit more work, and access to the destructive action copyout(). I'm going to start with a script which requires a recent version of Solaris (snv_15+, or the OpenSolaris release), because Bryan introduced some nice string handling stuff recently. We'll adapt the script to S10 afterwards.

Let's start by saying we want to change the name of the environment variable "FOO" to "BAR", but leave the value the same. The basic idea is simple: copyin() the string in the fbt::stk_add:entry probe, and if it's the one we want to change, copyout() the changes. The kernel will then proceed to copyin() the changed string, and use it for the environment of the process. The complication is the same as before: what if the page hasn't yet been touched, or the copyout() operation fails (for example, if the string isn't writable)?

There's no simple solution, so I'm just going to check *afterwards* that we didn't miss changing it, and kill -9 the process if we did. It's vile, but effective. Here's the script:

#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option destructive

self uintptr_t ptr;

inline int interesting = (strtok(copyinstr(self->ptr), "=") == "FOO");

fbt::stk_copyin:entry
{
	self->envp = (uintptr_t)args[0]->envp;
}

fbt::stk_getptr:entry
/ self->envp != 0 /
{
	/* check if we're looking at envp or envp+1 */
	self->on = ((arg1 - self->envp) <= sizeof (uint64_t));

	/* update envp if we're on */
	self->envp = self->on ? arg1 : self->envp;
	self->ptr = 0;
}

fbt::stk_add:entry
/ self->on && args[2] == UIO_USERSPACE /
{
	self->ptr = arg1;
	self->didit = 0;
}

fbt::stk_add:entry
/ self->ptr != 0 && interesting /
{
	printf("%d: %s: changed env \"%s\"\n",
	    pid, execname, copyinstr(self->ptr));
	copyout("BAR", self->ptr, 3);		/* 3 == strlen("BAR") */
	self->didit = 1;
}

fbt::stk_add:return
/ self->ptr != 0 && interesting && !self->didit /
{
	printf("%d: %s: killed, env \"%s\" couldn't be changed\n",
	    pid, execname, copyinstr(self->ptr));
	raise(9);
}

fbt::stk_copyin:return
{
	self->envp = 0;
	self->on = 0;
	self->ptr = 0;
}
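
To see from user space whether the rename took effect, a tiny helper like the following can be dropped into a test program. This is a sketch, not part of the original script; show_var() is a made-up name, and it relies only on the standard getenv().

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Print "NAME=value" if NAME is set in our environment, or a note
 * that it is unset.  Returns 1 if the variable was set, 0 if not.
 */
int
show_var(const char *name)
{
	const char *val = getenv(name);

	if (val == NULL) {
		(void) printf("%s is unset\n", name);
		return (0);
	}
	(void) printf("%s=%s\n", name, val);
	return (1);
}
```

Call show_var("FOO") and show_var("BAR") from main(), then start the program with FOO set while the D script is running. If the rename worked, the value should show up under BAR rather than FOO.
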
The D script above works great on Solaris Nevada and OpenSolaris, but doesn't work on Solaris 10, because it uses strtok(). To use it on Solaris 10, we have to do things slightly more manually. The only things that need to change are the definition of the "interesting" inline and some extra cleanup in fbt::stk_copyin:return:
inline int interesting =
    ((self->str = copyinstr(self->ptr))[0] == 'F' &&
    self->str[1] == 'O' &&
    self->str[2] == 'O' &&
    self->str[3] == '=');
...
fbt::stk_copyin:return
{
	self->envp = 0;
	self->on = 0;
	self->ptr = 0;
	self->str = 0;
}
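
For comparison, here is the same byte-at-a-time match written as plain C, together with the three-byte overwrite that the copyout() call performs. It is a user-space sketch only; is_foo() and rename_foo() are made-up names, and the buffer is assumed to be a writable, NUL-terminated string.

```c
#include <string.h>

/*
 * Returns 1 if s starts with "FOO=".  The && chain short-circuits,
 * so we never read past the terminating NUL of a shorter string.
 */
int
is_foo(const char *s)
{
	return (s[0] == 'F' && s[1] == 'O' && s[2] == 'O' && s[3] == '=');
}

/*
 * If s is a "FOO=..." entry, overwrite the name with "BAR" in
 * place, leaving the '=' and the value alone.  Returns 1 if the
 * entry was changed, 0 otherwise.
 */
int
rename_foo(char *s)
{
	if (!is_foo(s))
		return (0);
	(void) memcpy(s, "BAR", 3);	/* 3 == strlen("BAR") */
	return (1);
}
```

Because "FOO" and "BAR" are the same length, the overwrite never has to move the rest of the string, which is what makes the in-place trick workable.
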
A final note on stability: we're using private implementation details of Solaris to make this all work, and they are subject to change without notice at any time. This particular part of Solaris isn't likely to change much, but you never know. A reasonable RFE would be for more Stable probe points in the exec(2) family, so that people can write things like this more stably.


Comments:

Folks from the DTrace discussion forum adapted the above to the recent LD_AUDIT security hole; I've made my own version, which is robust and low-overhead:
#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option destructive

inline unsigned int SUGID = 0x00002000;
inline unsigned int SNOCD = 0x10000000;
inline unsigned int PSUIDFLAGS = SUGID|SNOCD;

self char *str;

/* catches more than LD_AUDIT{,_32,_64}, but who cares? */
inline int interesting = (self->str[8] = 0) == 0 &&
    (stringof(self->str) == "LD_AUDIT");

proc:::exec
{
	self->execname = args[0];
}

fbt::stk_add:entry
/ args[2] == UIO_USERSPACE /
{
	self->ptr = arg1;
}

fbt::stk_add:return
/ self->ptr != 0 &&
  (self->str = copyinstr(self->ptr)) != NULL && interesting /
{
	self->interesting = 1;
	self->report = copyinstr(self->ptr);
}

proc:::exec-success
/ self->interesting && (curthread->t_procp->p_flag & PSUIDFLAGS) /
{
	system(
	    "logger -pauth.crit \"%d: %s execing setuid %s w/ '%s', killed\"",
	    pid, execname, self->execname, self->report);
	raise(9);
}

proc:::exec-success, proc:::exec-failure
{
	self->ptr = 0;
	self->str = 0;
	self->interesting = 0;
	self->execname = 0;
	self->report = 0;
}
It logs to syslog every time it kills something.
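
The flag test in the proc:::exec-success predicate is an ordinary bitmask check. As a small C sketch, with the two values copied from the script above and has_suid_flags() a made-up name:

```c
#include <stdint.h>

/* Flag values copied from the D script above. */
#define	SUGID		0x00002000u
#define	SNOCD		0x10000000u
#define	PSUIDFLAGS	(SUGID | SNOCD)

/* Returns 1 if either flag bit is set in p_flag, 0 otherwise. */
int
has_suid_flags(uint32_t p_flag)
{
	return ((p_flag & PSUIDFLAGS) != 0);
}
```

Either bit alone is enough to trip the predicate; any other bits in p_flag are ignored.
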

Posted by Jonathan Adams on June 30, 2005 at 05:57 PM PDT #
