Solaris ufs bug in Month of Kernel Bugs

Just noticed that Solaris has an entry in the Month of Kernel Bugs.

While I agree that we have an issue that needs looking at, I also believe that the contributor is making much more of it than it really deserves.

First off, to paraphrase the issue:

If I give you a specially massaged filesystem and can convince someone with the appropriate privilege to mount it, it will crash the system.

I'd hardly call this a "denial of service", let alone exploitable.

To begin with, in order to perform a mount operation of a ufs filesystem, you need the sys_mount privilege. In Solaris, we currently run under the concept of "least privilege": a process is given the least amount of privilege that it needs to run. So, in order to exploit this, you need to convince someone with the appropriate level of privilege to mount your filesystem. This would also involve a bit of social engineering, which went unmentioned.

That being said, the system should not panic on this filesystem, and I will log a bug to this effect. It is a shame that the contributor did not make the crashdump files available, as that would certainly speed up any analysis.

One other thing that I should add is that anyone who tries to mount an unknown ufs filesystem without at least running "fsck -n" over it probably deserves what they get.

OK, I have copied it to a relatively current Nevada system and attached it as /dev/lofi/1. Running "fsck -n" over it, we see:

** /dev/rlofi/1 (NO WRITE)
BAD SUPERBLOCK AT BLOCK 16: BAD VALUES IN SUPER BLOCK

LOOK FOR ALTERNATE SUPERBLOCKS WITH MKFS?  no


LOOK FOR ALTERNATE SUPERBLOCKS WITH NEWFS?  no

SEARCH FOR ALTERNATE SUPERBLOCKS FAILED.

USE GENERIC SUPERBLOCK FROM MKFS?  no


USE GENERIC SUPERBLOCK FROM NEWFS?  no

SEARCH FOR ALTERNATE SUPERBLOCKS FAILED. YOU MUST USE THE -o b OPTION
TO FSCK TO SPECIFY THE LOCATION OF A VALID ALTERNATE SUPERBLOCK TO
SUPPLY NEEDED INFORMATION; SEE fsck(1M).

In the normal course of events, would you mount this filesystem? I certainly would not. This, however, is not the normal course of events, and I'm playing on a lab system.

v40z-c# uname -a
SunOS v40z-c 5.11 snv_46 i86pc i386 i86pc

Let's try a read only mount first.

v40z-c# mount -r /dev/lofi/1 /mnt
v40z-c# ls /mnt
lost+found
v40z-c# umount /mnt

OK, the read-only mount is fine. Now the read/write mount, ... Bingo.

v40z-c# mount /dev/lofi/1 /mnt

panic[cpu3]/thread=ffffffff9a6974e0: BAD TRAP: type=e (#pf Page fault) rp=fffffe8000c7f2c0 addr=fffffe80fe39d6c4

mount: #pf Page fault
Bad kernel fault at addr=0xfffffe80fe39d6c4
pid=2170, pc=0xfffffffffbb70950, sp=0xfffffe8000c7f3b0, eflags=0x10286
cr0: 8005003b cr4: 6f8
cr2: fffffe80fe39d6c4 cr3: 1ff76b000 cr8: c
	...
fffffe8000c7f1b0 unix:die+b1 ()
fffffe8000c7f2b0 unix:trap+1528 ()
fffffe8000c7f2c0 unix:_cmntrap+140 ()
fffffe8000c7f440 ufs:alloccgblk+42f ()
fffffe8000c7f4e0 ufs:alloccg+473 ()
fffffe8000c7f560 ufs:hashalloc+50 ()
fffffe8000c7f600 ufs:alloc+14f ()
fffffe8000c7f6c0 ufs:lufs_alloc+f3 ()
fffffe8000c7f770 ufs:lufs_enable+261 ()
fffffe8000c7f7e0 ufs:ufs_fiologenable+63 ()
fffffe8000c7fd60 ufs:ufs_ioctl+3e0 ()
fffffe8000c7fdc0 genunix:fop_ioctl+3b ()
fffffe8000c7fec0 genunix:ioctl+180 ()
fffffe8000c7ff10 unix:sys_syscall32+101 ()

OK, so we should now have a crashdump to look at.

While the machine is rebooting, it occurs to me that if we put this ufs image onto an external USB device, we might actually have an exploitable issue here once the new hal/rmvolmgr framework is in place (nv_51), if it tries to automatically mount ufs devices.

core file:      /var/crash/v40z-c/vmcore.0
release:        5.11 (64-bit)
version:        snv_46
machine:        i86pc
node name:      v40z-c
domain:         aus.cte.sun.com
system type:    i86pc
hostid:         69e47dae
dump_conflags:  0x10000 (DUMP_KERNEL) on /dev/dsk/c1t1d0s1(517M)
time of crash:  Sun Nov 12 13:08:18 EST 2006
age of system:  34 days 1 hours 42 minutes 34.95 seconds
panic CPU:      3 (4 CPUs, 7.56G memory)
panic string:   BAD TRAP: type=e (#pf Page fault) rp=fffffe8000c7f2c0 addr=fffffe80fe39d6c4

sanity checks: settings...vmem...sysent...clock...misc...done

-- panic trap data  type: 0xe (Page fault)
  addr: 0xfffffe80fe39d6c4  rp: 0xfffffe8000c7f2c0
  savfp 0xfffffe8000c7f440  savpc 0xfffffffffbb70950
  %rbp  0xfffffe8000c7f440  %rsp  0xfffffe8000c7f3b0
  %rip  0xfffffffffbb70950  (ufs:alloccgblk+0x42f)

  0%rdi 0xffffffff8d60b000  1%rsi 0xffffffff8930c308  2%rdx               0xb5
  3%rcx               0xb5  4%r8  0xfffffe80fe39d6c0  5%r9              0x12f0

  %rax                 0x8  %rbx          0x361005a8
  %r10                   0  %r11  0xfffffffffbcd9ff0  %r12               0x5a8
  %r13  0xffffffff8930c000  %r14  0xffffffff8d60b000  %r15  0xffffffff99656c00
  %cs       0x28 (KCS_SEL)
  %ds       0x43 (UDS_SEL)
  %es       0x43 (UDS_SEL)
  %fs          0 (KFS_SEL)
  %gs      0x1c3 (LWPGS_SEL)
  %ss       0x30 (KDS_SEL)
  trapno     0xe (Page fault)
  err        0x2 (page not present,write,supervisor)
  %rfl   0x10286 (parity|negative|interrupt enable|resume)
  fsbase 0xffffffff80000000 gsbase 0xffffffff8901c800
ufs:alloccgblk+0x42f()
ufs:alloccg+0x473()
ufs:hashalloc+0x50()
ufs:alloc+0x14f()
ufs:lufs_alloc+0xf3()
ufs:lufs_enable+0x261()
ufs:ufs_fiologenable+0x63()
ufs:ufs_ioctl+0x3e0()
genunix:fop_ioctl+0x3b()
genunix:ioctl+0x180()
unix:_syscall32_save+0xbf()
-- switch to user thread's user stack --

The trap has occurred in alloccgblk+42f() in the ufs code.

ufs:alloccgblk+0x410            call   +0xe0a4  (ufs:clrblock+0x0)
ufs:alloccgblk+0x415            decl   0x1c(%r13)	; cgp->cg_cs.cs_nbfree--
ufs:alloccgblk+0x419            decl   0xc4(%r14)	; fs->fs_cstotal.cs_nbfree--
ufs:alloccgblk+0x420            movslq 0xc(%r13),%r8	; %r8 <- cgp->cg_cgx
ufs:alloccgblk+0x428            addq   0x2d8(%r14),%r8	; %r8 <- fs->fs_u.fs_csp[%r8]
ufs:alloccgblk+0x42f            decl   0x4(%r8) <-- panic here

We've just made a call to ufs:clrblock() and are decrementing something after a chain of pointer dereferences. We only call clrblock() once in this routine, so that puts us at:

   428  #define fs_cs(fs, indx) fs_u.fs_csp[(indx)]

  1238          clrblock(fs, blksfree, (long)blkno);
  1239          /*
  1240           * the other cg/sb/si fields are TRANS'ed by the caller
  1241           */
  1242          cgp->cg_cs.cs_nbfree--;
  1243          fs->fs_cstotal.cs_nbfree--;
  1244          fs->fs_cs(fs, cgp->cg_cgx).cs_nbfree--;

We are panicking on line 1244.

I should note at this point that the source I am quoting is from the same source tree as OpenSolaris.

After the macro expansion, it becomes

  1244          fs->fs_u.fs_csp[cgp->cg_cgx].cs_nbfree--;

So what is cgp->cg_cgx?

SolarisCAT(vmcore.0/11X)> sdump 0xffffffff8930c000 cg cg_cgx
   cg_cgx = 0x6f0000

This is probably a trifle on the largish side, which would explain how we have ended up in unmapped memory.

The address we end up with for the (struct csum *) is 0xfffffe80fe39d6c0.

If we go back and look at fs->fs_ncg, we see that there were only two cylinder groups allocated. We have an obvious inconsistency.

Also, interestingly, this is not dying in the mount(2) system call. It's dying in a subsequent ioctl(2). This ioctl appears to be the one enabling ufs logging.

So how might we handle this?

Now, as the filesystem is already mounted and we are in a subsequent ioctl(), we can't fail the mount. Could we fail in alloccgblk() and have the error propagate back up to the process making the ioctl()?

Walking back up the stack, we see that alloccg() handles alloccgblk() returning 0. Now if alloccg() returns 0 to hashalloc(), in this instance we'll first try a quadratic rehash, which will also fail, so we'll fall through to the brute force search. As this starts at cylinder group 2 and there are only 2 cylinder groups, it will call alloccg() once and fail, falling through to return 0 to alloc(). Note that no matter how many times we end up in alloccgblk(), it is called with the same arguments, so it would fail the same way.

In alloc(), a zero return is taken to mean that some other thread grabbed the last block; it treats the whole thing as if we ran out of space and returns ENOSPC to lufs_alloc(). lufs_alloc() catches this, frees up everything and returns the error (ENOSPC) to lufs_enable(), which in turn catches it, cleans up, and returns it to ufs_fiologenable(), from where the error is eventually passed back to user space. While not exactly the error we would have hoped for, the end result would be that logging is not turned on and the system does not panic due to this corrupted filesystem.

I'll log this as a bug against ufs for Solaris 10 and Nevada.

Update

I have logged CR 6492771 against this issue. The link for the CR should work some time in the next 24 hours, but the content as logged is pretty much a cut and paste from this blog entry.

Update 2

The bug I logged has been closed as a duplicate of

4732193 ufs will attempt to mount filesystem with blatantly-bad superblock


About

Alan is a kernel and performance engineer in the Solaris and Network Domain, Technical Support Centre, based in Australia, who tends to have the nasty calls gravitate towards him.
