Bringing up Solaris domain 0 on Xen

Bringing up Solaris domain 0 (dom0) on Xen was surprisingly easy. Mostly because all of the hard work was already done by other people. The hard work which remained, was also done by other people :-)

I apologize in advance for giving credit to the wrong folks or for taking credit for something I didn't do. This was such a blur, it all tends to blend together...

Obviously, this won't cover everything. I tried to talk about some of the more interesting parts. Well, interesting is relative of course :-)

To start with, first you need to be able build xen on Solaris. You could actually cheat and start with a xen image and skip all the user apps to manage domUs. But that seems kind of pointless unless you have tons of bodies to throw at the effort, which we don't, thankfully.

John L and Dave already had Xen building, so all I had to do was ask them what I needed to do to build it.. The first thing you need are changes to gcc and binutils that's shipped in /usr/sfw. Which is why you need to download unofficially updated SUNWgcc, SUNWgccruntime, and SUNWbinutils packages in order to build the xen sources on Solaris (they will be officially updated at some point in the future).

There were two things that John L fixed. The first one was a bug in how we build gcc (can't find it's own ld scripts). See this bug.

The second fix was to add a -divide to the binutils gas to not treat / as a comment. John got this change back to to binutil cvs repository, but it hasn't made it out in a release yet (as far as I know).

Of course, Dave and John L had to change stuff in the xen.hg gate to get it to compile too. If you look at the source, you'll notice there are a few things we don't try and compile current, e.g. hvm related support. Then, of course, you need to test it to make sure the xen binary worked (user apps would have to wait until Solaris dom0 was up). Not sure if it just worked or they had to debug it, but it was working by the time I got to it :-)

So after I built my xen gate, put xen.gz in /boot (starting with 32-bit dom0), and tried to boot a i86xen (vs i86pc) version of the kernel debugger (kmdb). Again, I was following footsteps here. John L had done a ton of work getting kmdb to work in domU (since we already had Solaris domU running on a Linux dom0). And Todd and/or John L had already debugged kmdb on a Solaris dom0. So I was at kmdb prompt ready to venture into unknown territory.

So before I could boot my Solaris dom0, I had to build one. Up to this point, we only had the driver changes we needed for domU. Before xen, we only had one x86 "platform", i86pc.

This is unlike SPARC, which usually gets a new "platform" or every major architecture change (e.g. sun4m, sun4u, sun4v). On SPARC, you'll also see machine specific platmod's and platform directories to provide additional functionality and modules which are specific to a given machine (e.g. /platform/SUNW,Sun-Fire-880).

For xen (on x86), we have a new "platform", i86xen. For Solaris dom0, we we're missing all of the drivers which were in i86pc (i.e. they did not show up in i86xen). The vast majority of these drivers aren't platform specific and can go into intel, i.e. doesn't have any platform specific code (which today is i86pc and i86xen). So I had to try to move each driver over to intel and see if it had platform specific code or not. Since there was only one intel "platform" in the past, the lines we're a little gray at times. But I finally got through it and ended up moving around 40 drivers in src/uts and a little over 15 in closed/uts, to intel from i86pc. For the rest, I need to create makefile in i86xen to build a platform specific version of these drivers.

Now I had a Solaris dom0 kernel to boot. I setup my cap-eye install kernel, rebooted into kmdb, and :c'd into a new world. The majority of the hard work was already done bringing up domU. The CPU and VM code for domU, done by Tim, Todd, and Joe just worked for domain 0. That made life very simple.

The first problem I ran into was the internal pci config access setup in mlsetup. It was initially shutoff for domU, I had added it back in for dom0. However, this requires a call to the BIOS, which xen doesn't allow. So I changed the code to default to PCI_MECHANISM_1 for i86xen dom0.

From there, the next problem I ran into was ins/outs weren't working.. That was fixed with a HYPERVISOR_physdev_op (PHYSDEVOP_SET_IOPL), which ended up being slightly wrong and fixed by Todd before we released.

Now I was at the point where we are attaching drivers and the drivers are trying to map in their registers. Joe had done a bunch of work in the VM getting the infrastructure ready for foreign PFNs, which are basically PFN's which are tagged to mark then as containing the real MFN, instead of being present in the mfn_list. Since this was the first time trying that code out, I ran into a couple of minor bugs. The more interesting problem was that Xen was using one of the software PTE bits in a debug version of Xen which conflicted with the bit we we're using to mark the page as a foreign. I commented out that feature and rebuilt Xen and continued on while Joe worked on changing the PTE software bits to be encoded instead of individual flags to avoid bit 2 int PTE software field.

I had already changed the code in rootnex to convert the MFN (device register access) to a foreign PFN during ddi_regs_map_setup(). So once the PTE software bits were cleared we were sailing through the driver reading its device registers and on to mapping memory for device DMA.

I had also modified the rootnex dma bind routine. When we're building dma cookies, we need to put MFNs in the cookies instead of PFNs. I had a couple of bugs in that code, fixed that up, then ran into the contig alloc code path. I hadn't coded up the contig alloc code path changes yet (were we want to allocate physically contiguous memory). So I cheated and temporarily took out all the drivers which required contig alloc, and did the contig alloc code at a later time (my boot device didn't need it :-) )

Now I was up to vfs_mountroot(). This is where the Solaris drivers start taking over disk activity and stop using the BIOS to load blocks. This is also where we first start noticing problems if interrupts don't work.

This is where I handed off the Stu :-). This was the last of the hard problems. Stu had been busily working on Solaris dom0 interrupt support. A mix of event channels, pcplusmp, ACPI, and APICs. Something I would never wish on anyone. Stu got it up and working remarkably fast (something he should talk about :-)) and I was back and running up to the console handover.

The console config code is a little bit messy in solaris. I waded through that for a little bit. All of the code was originally in the common intel part of the code. I moved the platform specific code to i86pc and i86xen then have a different implementation in i86xen which basically always sends the Solaris console to the Xen console. Not sure if it will stay that way in the end, but that makes the most sense IMO.

And from there, I was at the multi-user prompt..

Some other interesting problems I ran into during the bringup. I had to have isa fail to attach on a Solaris domU. The ISA leaf drivers assume the device is present and bad things happen. There were a couple places in the kernel where they have hard coded physical address which it tries to map in (e.g. psm_map_phys_new; the lower 1M of memory, used for BIOS tables, etc.; and xsvc used by Xorg/Xsun). And we found out the hard way that Xen's low mem alloc implementation is linux specific. Only allocates memory < 4G && > 2G. We need to redo our first pass at implementing memory constrained allocs.

As far as booting 64-bit Solaris dom0, it booted up the first time.

We'll that enough for now.. I'll save the bringup of domUs on a Solaris dom0 for the next post. That was a little more challenging...


Post a Comment:
Comments are closed for this entry.



« April 2014