Tuesday Jun 17, 2008

Trap for the unwary

I did a BIOS upgrade on my laptop the other day, from A05 to A08. I thought nothing of it until I re-installed the beast with build 91 to get some ZFS root goodness. (Note that currently you have to use the text-mode installer to do this.)

xVM told me, none too politely, that it couldn't find any virtualization capabilities in my CPUs, so it wasn't going to be my friend any more.

I logged 6714698 ("snv_91 xVM spurious failure on VT-enabled hardware") and provided what I thought was enough info (prtpicl -v and prtconf -v output). It turns out I should have also provided the output from xm info and xm dmesg. When I did, I noticed these lines:
...
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p 
...

and
(xVM) Processor #0 6:15 APIC version 20
(xVM) Processor #1 6:15 APIC version 20
(xVM) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
(xVM) Enabling APIC mode:  Flat.  Using 1 I/O APICs
(xVM) Using scheduler: SMP Credit Scheduler (credit)
(xVM) Detected 2194.558 MHz processor.
(xVM) VMX disabled by Feature Control MSR.
(xVM) CPU0: Intel(R) Core(TM)2 Duo CPU     T7500  @ 2.20GHz stepping 0b
(xVM) Booting processor 1/1 eip 90000
(xVM) VMX disabled by Feature Control MSR.
(xVM) CPU1: Intel(R) Core(TM)2 Duo CPU     T7500  @ 2.20GHz stepping 0b
(xVM) Total of 2 processors activated.


What the...?


A quick jump into the BIOS revealed that there was a new option: Virtualization support. It was, of course, turned off by default. Turning it on and booting the xVM kernel showed me some much nicer output from those commands:
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p
                             hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64 

and
(xVM) Processor #0 6:15 APIC version 20
(xVM) Processor #1 6:15 APIC version 20
(xVM) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
(xVM) Enabling APIC mode:  Flat.  Using 1 I/O APICs
(xVM) Using scheduler: SMP Credit Scheduler (credit)
(xVM) Detected 2194.555 MHz processor.
(xVM) HVM: VMX enabled
(xVM) VMX: MSR intercept bitmap enabled
(xVM) CPU0: Intel(R) Core(TM)2 Duo CPU     T7500  @ 2.20GHz stepping 0b
(xVM) Booting processor 1/1 eip 90000
(xVM) CPU1: Intel(R) Core(TM)2 Duo CPU     T7500  @ 2.20GHz stepping 0b
(xVM) Total of 2 processors activated.
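
If you'd rather not eyeball the output, here is a minimal sketch of the same check in Python. It assumes you have already captured the text of xm info and xm dmesg into strings; the function names hvm_capable and vmx_disabled_cpus are made up for illustration and are not part of xVM. It only looks for the strings shown in the listings above, so treat it as a rough check rather than a definitive test.

```python
def hvm_capable(xm_info_text):
    """Return True if the xen_caps entry lists at least one hvm-* capability."""
    caps = []
    collecting = False
    for line in xm_info_text.splitlines():
        if line.startswith("xen_caps"):
            # "xen_caps : cap1 cap2 ..." starts the entry.
            caps.extend(line.split(":", 1)[1].split())
            collecting = True
        elif collecting and line[:1].isspace() and line.strip():
            # Indented continuation line, as in the second listing above.
            caps.extend(line.split())
        else:
            collecting = False
    return any(cap.startswith("hvm-") for cap in caps)


def vmx_disabled_cpus(xm_dmesg_text):
    """Count the 'VMX disabled by Feature Control MSR' lines in xm dmesg output."""
    marker = "VMX disabled by Feature Control MSR"
    return sum(1 for line in xm_dmesg_text.splitlines() if marker in line)
```

If hvm_capable comes back False and vmx_disabled_cpus is non-zero, the BIOS setting is the first place I'd look.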


Now as soon as I get a spare cycle or three, I can go and see about building an S10 domU for backport builds. That'll be fun!

Friday Dec 07, 2007

Bios bugs annoy the heck out of me

In the last few days I've been kinda-sorta prevented from successfully LiveUpgrading due to a freakin' annoying bug in my Ultra20-M2 system BIOS:

6636511 u20m2 bios version 1.45.1 still can't distinguish disks on the same sata channel

(It's in a closed prod/cat/subcat, sorry).

The gist of the bug is that I've got two identical Seagate 320GB disks (ST3320620AS, 320072933376 bytes) in my system, providing /, /zroot (for my zones, it's UFS), and sink, my zpool. No matter which two SATA ports I plug those two disks into, Shidokht's /sbin/biosdev util cannot do anything but report either no disks found, or (if run with -d) that the matchcount for the devices is greater than 1.

This means that /usr/lib/lu/lumkboot, which is called as part of lucreate and friends, cannot do the needful. Hence LU fails.

Yesterday I finally cracked and went off to purchase two new 320GB disks (one Western Digital, the other a Samsung) in order to see how deep the bug goes. This became particularly important after JanD attempted to reproduce

6628268 u20 and u20m2 + snv_75a with non-global zones refuses to allow LU (lucreate)

with a u20m2 and two identical Hitachi 250GB disks. He wasn't able to, despite having the same model of disk with the same firmware version in each slot.

At the moment my box is having a grand old time, 1hr10 into a zpool replace:


farnarkle:jmcp $ zpool status sink
  pool: sink
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 67.52% done, 0h28m to go
config:

        NAME                STATE     READ WRITE CKSUM
        sink                DEGRADED     0     0     0
          mirror            DEGRADED     0     0     0
            c2t0d0s7        ONLINE       0     0     0
            replacing       DEGRADED     0     0     0
              c3t0d0s7/old  FAULTED      0     0     0  corrupted data
              c3t0d0s7      ONLINE       0     0     0

errors: No known data errors

To get to the point where zpool could replace the device, I made sure the slices on the new disk were in order, then ran zpool replace sink c3t0d0s7. That's it - it's really nifty.
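
If you want to keep an eye on the resilver from a script rather than re-running zpool status by hand, here is a rough Python sketch that pulls the percentage and time remaining out of the scrub line. It assumes the line is worded exactly as in the output above (other releases may word it differently), and the name resilver_progress is my own invention, not anything ZFS provides.

```python
import re

# Matches the scrub line as shown in the status output above, e.g.
# "scrub: resilver in progress, 67.52% done, 0h28m to go"
SCRUB_RE = re.compile(
    r"resilver in progress,\s*(?P<pct>\d+(?:\.\d+)?)% done,"
    r"\s*(?P<hours>\d+)h(?P<mins>\d+)m to go"
)


def resilver_progress(status_text):
    """Return (percent_done, minutes_remaining), or None if no resilver line is found."""
    match = SCRUB_RE.search(status_text)
    if match is None:
        return None
    minutes = int(match.group("hours")) * 60 + int(match.group("mins"))
    return float(match.group("pct")), minutes
```

Feed it the captured text of zpool status sink and poll until it returns None.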

I've got one more thing to try (swapping the cables around for c3t0 and c3t1), which I think I'll have a go at in about 40 minutes. Whatever the results of that test, it's not looking good for the BIOS when it's got Seagate-branded disks attached.

About

I work at Oracle in the Solaris group. The opinions expressed here are entirely my own, and neither Oracle nor any other party necessarily agrees with them.
