Friday Apr 25, 2008

Why I run the latest Solaris build on my laptop

While on holiday I was blissfully disconnected from the internet, but I still needed my laptop on hand to empty my camera. That let me trip over a bug, which I filed on my return as 6691387 and which is already destined to be fixed in build 89. Result.

Tuesday Feb 05, 2008

Good Morning Build 81, or not.

I did not even get a chance to log in to the Sun Ray server running build 82 before it had crashed twice, so all was not well. A bit of digging suggested a problem somewhere in portfs, with kmem corruption. Since the problem was easily reproducible (boot the system, log in, and use it for a few hours), I got the lab staff to set kmem_flags to 0xf in /etc/system and boot again.
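
For anyone wanting to do the same, the /etc/system entry looks roughly like the fragment below. This is a sketch rather than a copy of the lab machine's file; the comment lines are mine, and the setting only takes effect at the next boot.

```
* Enable kernel memory allocator debugging:
* 0x1 audit, 0x2 deadbeef, 0x4 redzone, 0x8 contents
set kmem_flags=0xf
```

With these flags the allocator keeps a transaction log for each buffer, which is what lets it name the previous free when it catches a duplicate.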

Sure enough this morning there were two more crash dumps with variations of this in the message buffers:

kernel memory allocator: 
duplicate free: buffer freed twice
buffer=60063bfed60  bufctl=300f08886b8  cache: kmem_alloc_32
previous transaction on buffer 60063bfed60:
thread=300f43dac60  time=T-0.000269600  slab=300f08761e0  cache: kmem_alloc_32
kmem_cache_free+30
port_pcache_remove_fop+44
port_pfp_setup+198
port_associate_fop+2b8
portfs+2c8

panic[cpu512]/thread=300f43dac60: 
kernel heap corruption detected

> $c
vpanic(12ac480, 5, 2c8, 1, 18de000, 12ac400)
kmem_error+0x4e8(18de000, 3000005ae08, 60063bfed60, 12ac400, 12ac478, 
2afdfbc8220)
port_associate_fop+0x408(16, 7, 4a330, 16, 4a330, 2a10424d968)
portfs+0x2c8(1, 0, 7, 2a0, 0, 4a330)
syscall_trap32+0xcc(1, a, 7, 4a330, 10000006, 4a330)
> 

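Looking at the code, it appears that if port_pfp_setup encounters an error, the same kernel memory is freed twice. Specifically, the memory pointed to by the cname local variable in port_associate_fop is freed once in port_pcache_remove_fop (called from port_pfp_setup, as the previous transaction above shows) and then again on the error path in port_associate_fop. Hence the random panics. The diffs for the fix are: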


*** port_fop.c  Fri Oct 26 08:58:01 2007
--- /tmp/cg13442/port_fop.c     Tue Feb  5 14:04:21 2008
***************
*** 1306,1311 ****
--- 1306,1312 ----
                if (error = port_pfp_setup(&pfp, pp, vp, pfcp, object,
                    events, user, cname, clen, dvp)) {
                        mutex_exit(&pfcp->pfc_lock);
+                       cname = NULL;
                        goto errout;
                }

I have just filed this bug:

6659309: port_associate_fop frees a buffer twice if port_pfp_setup returns an error.

What I don't know is why we suddenly started seeing the bug. Is it that build 82 exercises event ports more, or that the bug has been revealed by some other change? Either way it makes me nervous for my home server running, you guessed it, build 82! At least next time someone asks why we bother running a Sun Ray server on the latest, greatest nevada bits I have a ready-made place to send them. It is here.

Wednesday Jun 13, 2007

Home server back to build 65

My home server has been taking a bit of a battering of late. I keep tripping over bug 6566921, which I can work around by not running my zfs_backup script locally. I have an updated version which sends the snapshots over an ssh pipe to a remote system, which in my case is my laptop. Obviously this just moves the panic from my server to the laptop, but that is a much better state of affairs. I'm currently building a fixed zfs module which I will test later.

However the final straw that has had me revert to build 65 is that smbd keeps core dumping. Having no reliable access to their data caused the family more distress than you would expect. This turns out to be bug 6563383 which should be fixed in build 67.

About

This is the old blog of Chris Gerhard. It has mostly moved to http://chrisgerhard.wordpress.com
