Anatomy of an exploit: CVE-2010-3081

It has been an exciting week for most people running 64-bit Linux systems. Shortly after "Ac1dB1tch3z" released his or her exploit of the vulnerability known as CVE-2010-3081, we saw this exploit aggressively compromising machines, with reports of compromises all over the hosting industry and many machines using our diagnostic tool and testing positive for the backdoors left by the exploit.

The talk around the exploit has mostly been panic and mitigation, though, so now that people have had time to patch their machines and triage their compromised systems, what I'd like to do for you today is talk about how this bug worked, how the exploit worked, and what we can learn about Linux security.

The Ingredients of an Exploit

There are three basic ingredients that typically go into a kernel exploit: the bug, the target, and the payload. The exploit triggers the bug -- a flaw in the kernel -- to write evil data corrupting the target, which is some kernel data structure. Then it prods the kernel to look at that evil data and follow it to run the payload, a snippet of code that gives the exploit the run of the system.

The bug is the one ingredient that is unique to a particular vulnerability. The target and the payload may be reused by an attacker in exploits for other vulnerabilities -- if 'Ac1dB1tch3z' didn't copy them already from an earlier exploit, by himself or by someone else, he or she will probably reuse them in future exploits.

Let's look at each of these in more detail.

The Bug: CVE-2010-3081

An exploit starts with a bug, or vulnerability, some kernel flaw that allows a malicious user to make a mess -- to write onto its target in the kernel. This bug is called CVE-2010-3081, and it allows a user to write a handful of words into memory almost anywhere in the kernel.

The bug was present in Linux's 'compat' subsystem, which is used on 64-bit systems to maintain compatibility with 32-bit binaries by providing all the system calls in 32-bit form. Now Linux has over 300 different system calls, so this was a big job. The Linux developers made certain choices in order to keep the task manageable:

  • We don't want to rewrite the code that actually does the work of each system call, so instead we have a little wrapper function for compat mode.
  • The wrapper function needs to take arguments from userspace in 32-bit form, then put them in 64-bit form to pass to the code that does the system call's work. Often some arguments are structs which are laid out differently in the 32-bit and 64-bit worlds, so we have to make a new 64-bit struct based on the user's 32-bit struct.
  • The code that does the work expects to find the struct in the user's address space, so we have to put ours there. Where in userspace can we find space without stepping on toes? The compat subsystem provides a function to find it on the user's stack.
Now, here's the core problem. That allocation routine went like this:
  static inline void __user *compat_alloc_user_space(long len)
  {
          struct pt_regs *regs = task_pt_regs(current);
          return (void __user *)regs->sp - len;
  }
The way you use it looks a lot like the old familiar malloc(), or the kernel's kmalloc(), or any number of other memory-allocation routines: you pass in the number of bytes you need, and it returns a pointer where you are supposed to read and write that many bytes to your heart's content. But it comes -- came -- with a special catch, and it's a big one: before you used that memory, you had to check that it was actually OK for the user to use that memory, with the kernel's access_ok() function. If you've ever helped maintain a large piece of software, you know it's inevitable that someone will eventually be fooled by the analogy, miss the incongruence, and forget that check.

Fortunately the kernel developers are smart and careful people, and they defied that inevitability almost everywhere. Unfortunately, they missed it in at least two places. One of those is this bug. If we call getsockopt() in 32-bit fashion on the socket that represents a network connection over IP, and pass an optname of MCAST_MSFILTER, then in a 64-bit kernel we end up in compat_mc_getsockopt():

  int compat_mc_getsockopt(struct sock *sock, int level, int optname,
          char __user *optval, int __user *optlen,
          int (*getsockopt)(struct sock *,int,int,char __user *,int __user *))
  {
This function calls compat_alloc_user_space(), and it fails to check the result is OK for the user to access -- and by happenstance the struct it's making room for has a variable length, supplied by the user.

So the attacker's strategy goes like so:

  • Make an IP socket in a 32-bit process, and call getsockopt() on it with optname MCAST_MSFILTER. Pass in a giant length value, almost the full possible 2GB. Because compat_alloc_user_space() finds space by just subtracting the length from the user's stack pointer, with a giant length the address wraps around, down past zero, to where the kernel lives at the top of the address space.
  • When the bug fires, the kernel will copy the original struct, which the attacker provides, into the space it has just 'allocated', starting at that address up in kernel-land. So fill that struct with, say, an address for evil code.
  • Tune the length value so that the address where the 'new struct' lives is a particularly interesting object in the kernel, a target.
The fix for CVE-2010-3081 was to make compat_alloc_user_space() call access_ok() to check for itself.

More technical details are ably explained in the original report by security researcher Ben Hawkes, who brought the vulnerability to light.

The Target: Function Pointers Everywhere

The target is some place in the kernel where if we make the right mess, we can leverage that into the kernel running the attacker's code, the payload. Now the kernel is full of function pointers, because secretly it's object oriented. So for example the attacker may poke some userspace object like a special file to cause the kernel to invoke a certain method on it -- and before doing so will target that method's function pointer in the object's virtual method table (called an "ops struct" in kernel lingo) which says where to find all the methods, scribbling over it with the address of the payload.

A key constraint for the attacker is to pick something that will never be used in normal operation, so that nothing goes awry to catch the user's attention. This exploit uses one of three targets: the interrupt descriptor table, timer_list_fops, and the LSM subsystem.

  • The interrupt descriptor table (IDT) is morally a big table of function pointers. When an interrupt happens, the hardware looks it up in the IDT, which the kernel has set up in advance, and calls the handler function it finds there. It's more complicated than that because each entry in the table also needs some metadata to say who's allowed to invoke the interrupt, whether the handler should be called with user or kernel privileges, etc. This exploit picks interrupt number 221, higher than anybody normally uses, and carefully sets up that entry in the IDT so that its own evil code is the handler and runs in kernel mode. Then with the single instruction int $221, it makes that interrupt happen.
  • timer_list_fops is the "ops struct" or virtual method table for a special file called /proc/timer_list. Like many other special files that make up the proc filesystem, /proc/timer_list exists to provide kernel information to userspace. This exploit scribbles on the pointer for the poll method, which is normally not even provided for this file (so it inherits a generic behavior), and which nobody ever uses. Then it just opens that file and calls poll(). I believe this could just as well have been almost any file in /proc/.
  • The LSM approach attacks several different ops structs of type security_operations, the tables of methods for different 'Linux security modules'. These are gigantic structs with hundreds of function pointers; the one the exploit targets in each struct is msg_queue_msgctl, the 100th one. Then it issues a msgctl system call, which causes the kernel to check whether it's authorized by calling the msg_queue_msgctl method... which is now the exploit's code.
Why three different targets? One is enough, right? The answer is flexibility. Some kernels don't have timer_list_fops. Some kernels have it, but don't make available a symbol to find its address, and the address will vary from kernel to kernel, so it's tricky to find. Other kernels pose the same obstacle with the security_operations structs, or use a different security_operations than the ones the exploit corrupts. Different kernels offer different targets, so a widely applicable exploit has to have several targets in its repertoire. This one picks and chooses which one to use depending on what it can find.

The Payload: Steal Privileges

Finally, once the bug is used to corrupt the target and the target is triggered, the kernel runs the attacker's payload, or shellcode. A simple exploit will run the bare minimum of code inside the kernel, because it's much easier to write code that can run in userspace than in kernelspace -- so it just sets the process up to have the run of the system, and then returns.

This means setting the process's user ID to 0, root, so that everything else it does is with root privileges. A process's user ID is stored in different places in different kernel versions -- the system became more complicated in 2.6.29, and again in 2.6.30 -- so the exploit needs to have flexibility again. This one checks the version with uname and assembles the payload accordingly.

This exploit can also clear a couple of flags to turn off SELinux, with code it optionally includes in the payload -- more flexibility. Then it lets the kernel return to userspace, and starts a root shell.

In a real attack, that root shell might be used to replace key system binaries, steal data, start a botnet daemon, or install backdoors on disk to cement the attacker's control and hide their presence.

Flexibility, or, You Can't Trust a Failing Exploit

All the points of flexibility in this exploit illustrate a key lesson: you can't determine you're vulnerable just because an exploit fails. For example, on a Fedora 13 system, this exploit errors out with a message like this:
  $ ./ABftw
  Ac1dB1tCh3z VS Linux kernel 2.6 kernel 0d4y
  $$$ Kallsyms +r
  $$$ K3rn3l r3l3as3: 2.6.34.6-54.fc13.i686
  [...]
  !!! Err0r 1n s3tt1ng cr3d sh3llc0d3z 
Sometimes a system administrator sees an exploit fail like that and concludes they're safe. "Oh, Red Hat / Debian / my vendor says I'm vulnerable", they may say. "But the exploit doesn't work, so they're just making stuff up, right?"

Unfortunately, this can be a fatal mistake. In fact, the machine above is vulnerable. The error message only comes about because the exploit can't find the symbol per_cpu__current_task, whose value it needs in the payload; it's the address at which to find the kernel's main per-process data structure, the task_struct. But a skilled attacker can find the task_struct without that symbol, by following pointers from other known data structures in the kernel.

In general, there is almost infinitely much work an exploit writer could put in to make the exploit function on more and more kernels. Use a wider repertoire of targets; find missing symbols by following pointers or by pattern-matching in the kernel; find missing symbols by brute force, with a table prepared in advance; disable SELinux, as this exploit does, or grsecurity; or add special code to navigate the data structures of unusual kernels like OpenVZ. If the bug is there in a kernel but the exploit breaks, it's only a matter of work or more work to extend the exploit to function there too.

That's why the only way to know that a given kernel is not affected by a vulnerability is a careful examination of the bug against the kernel's source code and configuration, and never to rely on a failing exploit -- and even that examination can sometimes be mistakenly optimistic. In practice, for a busy system administrator this means that when the vendor recommends you update, the only safe choice is to update.

~price

Comments:

Post a Comment:
  • HTML Syntax: NOT allowed
About

Tired of rebooting to update systems? So are we -- which is why we invented Ksplice, technology that lets you update the Linux kernel without rebooting. It's currently available as part of Oracle Linux Premier Support, Fedora, and Ubuntu desktop. This blog is our place to ramble about technical topics that we (and hopefully you) think are interesting.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today