Monday Sep 22, 2008

In praise of the blogosphere - Upgrading OpenSolaris past build 91

I can't believe it's 4 months since I last blogged. Oh well, it's been a hectic time. This is just a quick note, really for my own memory-jogging purposes. I have been having a pain of a time upgrading from OpenSolaris build 91. I've been away a lot (fantastic holiday, thanks very much) and had been working on my laptop under VirtualBox. Anyway, I would do my usual

$ pfexec pkg refresh --full
$ pfexec pkg install SUNWipkg
$ pfexec pkg image-update

However, at the end of downloading all the packages it would give me an error like

pkg: attempt to mount opensolaris-2 failed.
pkg: image-update cannot be done on live image

I hadn't really given it much thought for a while but when I did google the message it took me straight to this page

http://louisbotterill.blogspot.com/2008/07/open-solaris-to-b93.html

Which pointed out the missing information (can't work out how I missed it - maybe it was while I was away)

http://defect.opensolaris.org/bz/show_bug.cgi?id=2387

So anyway, upgrading on a copy of the BE (boot environment) really helped and I am now on the latest bits. Thanks to Louis Botterill for the tips.
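A minimal sketch of what "upgrading on a copy of the BE" can look like. The exact steps are in the post linked above; the sequence here is only an assumption about the general shape (beadm to create and mount a new BE, pkg -R to update the mounted image rather than the live one), and the BE name and mount point are made up for illustration. By default the function only prints the commands.

```sh
#!/bin/sh
# Sketch only: update a *copy* of the boot environment instead of the live image.
# The BE name and mount point are hypothetical defaults, not taken from the post.
# RUN defaults to "echo" so commands are printed; set RUN= (empty) to execute them.
upgrade_be_copy() {
    new_be=${1:-opensolaris-upgrade}
    mnt=${2:-/mnt}
    run=${RUN-echo}
    $run pfexec beadm create "$new_be" &&
    $run pfexec beadm mount "$new_be" "$mnt" &&
    $run pfexec pkg -R "$mnt" image-update &&
    $run pfexec beadm unmount "$new_be" &&
    $run pfexec beadm activate "$new_be"
}
```

Calling upgrade_be_copy with no arguments shows the five commands it would run; a reboot into the newly activated BE would follow.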

Thursday Feb 14, 2008

Debugging sparc really (and I do mean really) early boot problems

For some work I've been doing I've had to work out how to debug the sparc boot process, before you can get to kmdb. And yes you can do it, it's just not that easy. So I thought I'd put it on my blog, in case I lose the notes I made in a mail to myself, and it might be of interest to some of you.

First off get as much of the diagnostics available from the OBP as possible 

{1} ok setenv fcode-debug? true
fcode-debug? =          true
{1} ok
{1} ok setenv diag-switch? true
diag-switch? =          true
{1} ok reset-all

The reset-all is important as it saves the options to the NVRAM.

Now we try and boot it up - before anything is loaded. Note this requires a debug kernel, but if you're playing in this space and you're on sparc then you probably know that already

{1} ok boot disk0 -F kernel/unix -H

You will see the boot fail like this

Rebooting with command: boot disk0 -F kernel/unix -H                 
Boot device: /pci@1c,600000/scsi@2/disk@0,0  File and args: -F kernel/unix -H
ufs-file-system
Halted with -H flag.
Warning: Fcode sequence resulted in a net stack depth change of 1

The file just loaded does not appear to be executable.

This is expected, and it is how we get to start playing with breakpoints really early on. Note the unix module is not yet loaded, so we now have to load it ourselves. To do this we look at the boot Forth code (using see) and copy what it does

{1} ok see do-boot
: do-boot  
   parse-bootargs halt? l->n if   
      " Halted with -H flag. " type cr exit
   then  get-bootdev load-pkg mount-root zflag? nested? invert and
   l->n if   
      fs-name$ open-zfs-fs
   then  load-file setup-props exec-file
;

So by copying what do-boot does we can intercept the boot process

{1} ok get-bootdev load-pkg mount-root
{1} ok load-file setup-props
Loading: /platform/SUNW,Sun-Fire-V240/kernel/unix
Loading: /platform/sun4u/kernel/unix

{1} ok 

Now we can start some more magic. A DEBUG kernel will check the stop-me property in kobj_start(). This is something we have to populate in the boot properties, which is why we've done all this messing around to get to this point

{1} ok cd /chosen
{1} ok 00 0 " stop-me" property
{1} ok .properties
stop-me                 
fs-package               ufs-file-system
whoami                   /platform/sun4u/kernel/unix
impl-arch-name           SUNW,Sun-Fire-V240
elfheader-length         001c55c0
elfheader-address        51000000
bootfs                   fed85a80
fstype                   ufs
bootargs                 -F kernel/unix -H
bootpath                 /pci@1c,600000/scsi@2/disk@0,0:a
mmu                      fff74080
memory                   fff74290
stdout                   fed97b90
stdin                    fed97ea8
stdout-#lines            ffffffff
name                     chosen

We can now start the boot process using exec-file. It will stop immediately because of the stop-me property (ctrace gives me the stacktrace)

{1} ok exec-file
Type  'go' to resume
{1} ok ctrace
PC: 0000.0000.f004.81e4
Last leaf: jmpl  0000.0000.f005.d274   from 0000.0000.0100.8aec client_handler+70 
     0 w  %o0-%o7: (f0000000 16 f0000000 6d 73 6 fedcb441 1008aec )

call 0000.0000.0106.bea8 p1275_sparc_cif_handler        from 0000.0000.0106.7de8 prom_enter_mon+24 
     1 w  %o0-%o7: (f005d274 fedcbda8 1839400 106af00 185fc00 f005d274 fedcb4f1 1067de8 )

call 0000.0000.0106.7dc4 prom_enter_mon        from 0000.0000.0101.9ed4 kobj_start+30 
     2 w  %o0-%o7: (0 10bdaf0 f002d224 1 1817700 1821dd8 fedcb5c1 1019ed4 )

call 0000.0000.0101.9ea4 kobj_start        from 0000.0000.0100.7ac8 _start+10 
     3 w  %o0-%o7: (f005d274 0 0 0 10bd800 181fc00 fedcb701 1007ac8 )

From this point we have access to the unix symbols and can start setting break points. For example

{1} ok load_primary +bp
{1} ok go
0000.0000.010a.c7b0 load_primary         save        %o6, ffffffffffffff30, %o6
{1} ok ctrace
PC: 0000.0000.010a.c7b0 load_primary    
Last leaf: call 0000.0000.010a.c7b0 load_primary        from 0000.0000.010a.b46c kobj_init+d8 
     0 w  %o0-%o7: (1879400 0 fedcbe78 184f000 1879340 181ac00 fedcb111 10ab46c )

call 0000.0000.010a.b394 kobj_init        from 0000.0000.0101.9fd0 kobj_start+12c
     1 w  %o0-%o7: (f005d274 185c800 184f000 fedcbe78 184f3f8 184e400 fedcb5c1 1019fd0 )

call 0000.0000.0101.9ea4 kobj_start        from 0000.0000.0100.7ac8 _start+10 
     2 w  %o0-%o7: (f005d274 7 0 51000040 51000000 51000040 fedcb701 1007ac8 )

I'm interested in getting some more module loading debug info out, so let's set moddebug to 0xf

{1} ok moddebug l?
0

(displays current value of a long)

{1} ok F moddebug l!
{1} ok moddebug l?
f
{1} ok

(set the long to be F then display it again)

Now let's see what additional info I get

{1} ok go
/kernel/fs/sparcv9/specfs symbol _info multiply defined
/kernel/fs/sparcv9/specfs symbol _init multiply defined
Returned from _info, retval = 1
init_stubs: couldn't find symbol in module fs/specfs
(Can't load specfs) Program terminated

OK, that doesn't tell me much more, but you get the idea. You can access the symbols, set breakpoints and set variables. In addition you can dump out memory with dump, single step with step, and do loads of other things that you might want to do, but this at least will act as a memory jogger for me.

Let me know if you found this useful.

Chris

Thursday Nov 01, 2007

Installing Indiana/Opensolaris

For a few days recently I have been looking at the future of packaging, pkg(5) or IPS. IPS looks really powerful and quite simple. It will allow us to generate fixes and deliver them much more simply. What I've been thinking about is how and when we will generate fixes using this mechanism.

Anyway, as a result I've signed up for pkg-discuss-AT-opensolaris-DOT-org and indiana-discuss-AT-opensolaris.org. Both of these are very active and full of interesting discussions (and arguments) and ideas. It's not surprising there has been so much activity recently: today indiana-discuss announced the launch of the developer preview of the opensolaris binary distribution. So I tried it out on a couple of machines. My laptop first, an Acer Ferrari 4005. Everything just worked. The LiveCD booted up really quickly; well done to the team for getting the performance up so well. Even wireless worked, though that's probably because I've already swapped the Broadcom wireless miniPCI card for an Atheros one. Unfortunately I have no spare slices available on the laptop, so I moved on to my next machine.

This is my home PC, usually running Windows XP for the kids. It has never successfully run Solaris, for reasons that will become apparent. I have just upgraded the hard drive so there's a 60GB partition free for me to do some damage.

Booting the livecd failed, or rather Xorg failed to display anything. My machine is an old Athlon XP2600 with an AGP Radeon X1600 Pro graphics card. Great for games, but unfortunately the Solaris/OpenSolaris Radeon driver doesn't support it. Fortunately Stephen Hahn blogged about how to get Xorg to use the vesa driver from the livecd. With that in place I got the gnome gui up and gave the install a go.

The installer uses dwarf-caiman, a cut down slim line installer which is nice and easy to navigate. The install itself was really quick - there's only a CD's worth installed. The rest should be added later over the web from the IPS repository. Unfortunately that is where my old machine creaked too much. The onboard ethernet is an nforce2 gigabit ethernet. It should work with the nge driver but I think it's just too old. I tried adding an alias for it using 

# add_drv '"pci10de,66"' nge

 
But even though I could plumb it there was no traffic going through it :-( I guess I'll have to find another ethernet card.

The install claimed it failed, but it did come up fine after a reboot, though I had to add a user again at single user because the useradd hadn't worked. Warning here. root is just a role that users can take on - so you can't log in as root as you might expect from a "normal" solaris system.

I'm pretty impressed. Nice installer, lightweight liveCD to get you started, zfs root, and pkg(5) to add new stuff (or it will when I get a new ethernet adapter; I wonder if I can get one of my old USB wireless sticks to work :-). Plus the machine seems to be more responsive under OpenSolaris than Windows XP.

Do give it a go, it is one vision of the future of opensolaris.

 

Chris 

Monday Sep 24, 2007

On the road again

So I'm on the road again. It's the Sun Tech Days: this time I'm in Rome, and Milan is later in the week. I've just talked about "What is OpenSolaris" and "OpenSolaris Virtualization".

It's great to connect with real people who use or want to use OpenSolaris and are interested in our xVM and Zones based technologies.

Please take a look at the link and the presentations should be uploaded in the next few days.

Anyway - great to be in Rome, just wish I was closer to the center. I made the trek in to see the Colosseum. Some say it's not as impressive as they were expecting. I have to say I had no expectations and was mightily impressed. I will post a link to some photos when I've uploaded and checked them.

Thursday Jul 19, 2007

Starting out with Solaris on Xen

As you may have seen from the announcement and John's blog we have a new set of Solaris on Xen bits available for download. A lot has changed in the (almost) year since the last drop. Certainly things are a lot easier to set up than they were back then.

First big difference I notice is that you can install these bits straight from the DVD which means no mucking around with bfu.

Once it is installed you also have the joys of much newer Solaris builds, including improvements to networking and removable media (but that isn't the point of this post).

Of course the thing you really want to do is run multiple operating systems so, while there are documents here, I always think it's nice to see people's use cases and find out how they got things working.

I'm going to use zfs for storage so I made sure I had a large amount of space available for a zpool

# zpool create guests c2d0s7

First gotcha. After install the default boot entry in the grub menu.lst is for solaris on metal (ie not booting under Xen). You can change that before rebooting or select Solaris dom0 from the grub menu.

Check you are running under Xen by looking at uname -i

dominion# uname -i
i86xpv

(dominion is the name of my host)

If that says i86pc then you're not booted under Xen, i86xpv is the new platform modified to run on Xen.
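The check above is easy to script. A small sketch, assuming only the two platform strings mentioned here; the function name and messages are made up for illustration:

```sh
#!/bin/sh
# Classify the platform string printed by `uname -i`.
# Returns 0 under Xen (i86xpv), 1 on bare metal (i86pc), 2 for anything else.
xen_platform_status() {
    case "$1" in
        i86xpv) echo "booted under Xen"; return 0 ;;
        i86pc)  echo "booted on metal, not under Xen"; return 1 ;;
        *)      echo "unrecognised platform: $1"; return 2 ;;
    esac
}
```

On the dom0 you would call it as xen_platform_status "$(uname -i)".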

I found that I accidentally booted on metal first time, and when I then booted under Xen the services weren't enabled. I had to manually enable them. (If you boot straight in to Dom 0 they start.)

dominion# svcs -a | grep xctl
online         10:51:04 svc:/system/xctl/store:default
online         10:51:11 svc:/system/xctl/xend:default
online         10:51:11 svc:/system/xctl/console:default
online         10:51:16 svc:/system/xctl/domains:default

If it says anything other than online, enable them with

# svcadm enable "service name"
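If several services are down, the enable commands can be generated from the svcs listing. A sketch, assuming the three-column STATE/STIME/FMRI layout shown above; the function name is made up, and it only prints the commands so you can review them first:

```sh
#!/bin/sh
# Read `svcs -a` output on stdin and print a "svcadm enable" line for every
# svc:/system/xctl/ service whose state is anything other than "online".
xctl_enable_cmds() {
    awk '$3 ~ /^svc:\/system\/xctl\// && $1 != "online" { print "svcadm enable " $3 }'
}
```

Typical use would be svcs -a | xctl_enable_cmds, piping the result to sh once it looks right.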

I use a zpool to create my disk devices for my domains. This has huge advantages, such as the ability to quickly snapshot a domain (say after install) so you can always return to that state. Also you can clone a snapshot so if you want to have many similar domains (say multiple solaris development environments) you can clone an install and then only the changes between the domains are stored (zfs being copy on write).

To set this up you need to create a zvol on your zpool

# zfs create -V 10G guests/solaris-pv

This creates a 10G zvol. Note that unless you add -s to make it a sparse volume, zfs reserves the full 10G in the pool up front; with -s, unused space stays free for other users of the pool to allocate.

You can access the device for this zvol using

/dev/zvol/dsk/guests/solaris-pv

So that's simple - how do we install a Solaris domain? First off I create an install python config file. (Soon there will be a tool to manage the install for you but that's not really ready yet).

This python file describes some simple things about the domain, like where the disk and cdrom are.

dominion# cat /guests/configs/solaris-pv-install.py 
name = "solaris-pv-install"
memory = "1024"
disk = [ 'file:/guests/isos/66-0613-nd.iso,6:cdrom,r', 'phy:/dev/zvol/dsk/guests/solaris-pv,0,w' ]
vif = [ '' ]
on_shutdown = 'destroy'
on_reboot = 'destroy'
on_crash = 'destroy'

Name is obvious, and I've copied the iso image to be a file to speed up install.

You can kick off the install just by starting the domain

dominion# xm create -c /guests/configs/solaris-pv-install.py

This says start the domain and give me serial console access to it. You then do a normal Solaris install. Once complete you should create a second python file to boot off the zvol, but first I'm going to snapshot it so I can quickly duplicate it (though I really should sys-unconfig it first to make me input the hostname and ip info again).

dominion# zfs snapshot guests/solaris-pv@install
dominion# cat solaris-pv.py 
name = "solaris-pv"
memory = "1024"
root = "/dev/dsk/c0d0s0"
disk = [ 'phy:/dev/zvol/dsk/guests/solaris-pv,0,w' ]
vif = [ '' ]
on_shutdown = 'destroy'
on_reboot = 'destroy'
on_crash = 'destroy'

and create it with

# xm create -c solaris-pv.py

This then comes up as per a normal solaris boot. If you've given it an ip address during the install, or set it to use dhcp, you should be able to log in to it using ssh. The networking is effectively bridged; that is to say, you need a real IP address for each domain, on the same network as the Dom0.
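The snapshot taken earlier is what makes a second, similar domain cheap. A sketch of cloning it and producing a matching config, reusing the pool, snapshot and settings from this post; the function names and the new domain name are made up, and the zfs command is only printed unless RUN is set to empty:

```sh
#!/bin/sh
# Sketch: create another domain from the guests/solaris-pv@install snapshot.
# RUN defaults to "echo" so the zfs command is printed; set RUN= to execute it.
clone_domain() {
    new=$1
    run=${RUN-echo}
    # A clone shares blocks with the snapshot; only later changes use new space.
    $run zfs clone guests/solaris-pv@install "guests/$new"
}

# Print an xm config for the cloned domain, mirroring solaris-pv.py.
domain_config() {
    new=$1
    cat <<EOF
name = "$new"
memory = "1024"
root = "/dev/dsk/c0d0s0"
disk = [ 'phy:/dev/zvol/dsk/guests/$new,0,w' ]
vif = [ '' ]
on_shutdown = 'destroy'
on_reboot = 'destroy'
on_crash = 'destroy'
EOF
}
```

For example: clone_domain dev1, then domain_config dev1 > dev1.py, then xm create -c dev1.py. The clone still carries the original hostname and ip settings, which is where the sys-unconfig mentioned earlier would help.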

So the next question I always get is "Can I run windows as a domU". And the answer is "maybe". What we have done up till now is use a paravirtualised domU. That is one that has been modified to run on Xen. Anything that would trigger a privileged operation (interrupt, privileged instruction etc) is modified to be a call to the hypervisor. This is nice and fast, but some operating systems haven't had this treatment.

However with the advent of the intel core2duo and Rev F Opteron/Athlon64 (socket AM2) processors, some hardware support for virtualisation has been built in to the chip. This detects these privileged operations and redirects control back to the hypervisor to do "the right thing"

With Xen these are referred to as HVM domains.

Russ is going to be blogging more about these so I won't go in to too much detail, but if you want to know if your system is HVM capable, I wrote this simple program to tell you

dominion# cat hvm-capable.c 
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <stdio.h>

static const char devname[] = "/dev/cpu/self/cpuid";

/*ARGSUSED*/
int
main(int argc, char *argv[])
{
        /* Register values returned by a read of the cpuid device */
        struct {
                uint32_t r_eax, r_ebx, r_ecx, r_edx;
        } _r, *rp = &_r;
        int d;
        char *s;

        if ((d = open(devname, O_RDONLY)) == -1) {
                perror(devname);
                return (1);
        }

        /* Function 0: the vendor string is spread over EBX, ECX and EDX */
        if (pread(d, rp, sizeof (*rp), 0) != sizeof (*rp)) {
                perror(devname);
                goto fail;
        }

        s = (char *)&rp->r_ebx;
        if (strncmp(s, "Auth" "cAMD" "enti", 12) == 0) {
                if (pread(d, rp, sizeof (*rp), 0x80000001) == sizeof (*rp)) {
                        (void) printf("processor is AMD ");
                        /*
                         * Read secure virtual machine bit
                         * (bit 2 of ECX feature ID)
                         */
                        if ((rp->r_ecx >> 2) & 1) {
                                (void) printf("and processor supports SVM\n");
                                (void) close(d);
                                return (0);
                        }
                        (void) printf("and does not support SVM\n");
                } else {
                        (void) printf("error reading features register\n");
                }
        } else if (strncmp(s, "Genu" "ntel" "ineI", 12) == 0) {
                if (pread(d, rp, sizeof (*rp), 0x00000001) == sizeof (*rp)) {
                        (void) printf("processor is Intel ");
                        /*
                         * Read VMXE feature bit
                         * (bit 5 of ECX feature ID)
                         */
                        if ((rp->r_ecx >> 5) & 1) {
                                (void) printf("and processor supports VMX\n");
                                (void) close(d);
                                return (0);
                        }
                        (void) printf("and does not support VMX\n");
                } else {
                        (void) printf("error reading features register\n");
                }
        }
        /* Unsupported, unreadable or unknown vendor: close once and fail */
fail:
        (void) close(d);
        return (1);
}

SVM is AMD's implementation of HVM while VMX is Intel's.
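The bit tests in the program are just a shift and a mask, so they can be tried out on any ECX value by hand. A sketch using shell arithmetic; the function names are made up and the values you pass in are whatever you have read for your own CPU (function 0x80000001 for AMD, function 1 for Intel):

```sh
#!/bin/sh
# has_bit VALUE BIT: succeed if bit BIT is set in VALUE (hex such as 0x20 is fine).
has_bit() {
    [ $(( ($1 >> $2) & 1 )) -eq 1 ]
}

# hvm_support VENDOR ECX: apply the same test as hvm-capable.c.
# VENDOR is "amd" (SVM, bit 2) or "intel" (VMX, bit 5).
hvm_support() {
    case "$1" in
        amd)   bit=2; name=SVM ;;
        intel) bit=5; name=VMX ;;
        *)     echo "unknown vendor: $1"; return 2 ;;
    esac
    if has_bit "$2" "$bit"; then
        echo "$name supported"
    else
        echo "$name not supported"
        return 1
    fi
}
```

For instance, hvm_support intel 0x20 reports VMX as supported because bit 5 is the only bit set in 0x20.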

And just a teaser of what you can expect. (right click - view image to see it full size)

Here you see a solaris paravirtualized vm being installed and a windows vista hvm domain. In the top left corner you can see the virtual machine manager, a new management gui that will help manage domains.

Sorry, this is going to be pretty hard to see unless you view the image in its original size (1600x1200; yes, virtualisation helps you use up those wasted resources, including screen real estate).

Monday Jan 22, 2007

Using the OpenSolaris Mercurial repository


I finally decided to have a look at how you get OpenSolaris from the Mercurial repository. I've been put off by the fact that at the office we're behind a firewall, so you have to pull it through a SOCKS proxy.

It turns out it's absolutely trivial to do (the instructions are all on the OpenSolaris website).

If you're behind a firewall that requires you to have a proxy for ssh, the first thing you need to do is set up ssh to use a proxy. I can't help you with that other than to say add the following lines to your ssh config for opensolaris.org

$ cat ~/.ssh/config
Host *.opensolaris.org
ProxyCommand /usr/lib/ssh/ssh-socks5-proxy-connect -h [Proxy IP address] %h %p
Compression yes

Put in the IP address of your SOCKS proxy in place of [Proxy IP address].
(Thanks to Erik and Stephen for pointing out the compression option.)

Now you can clone the repository

$ hg clone ssh://anon@hg.opensolaris.org/hg/onnv/onnv-gate
adding changesets
adding manifests
adding file changes
added 3487 changesets with 67524 changes to 43099 files
39742 files updated, 0 files merged, 0 files removed, 0 files unresolved


It took only 37 minutes to my home machine which is much faster than a full bringover in teamware

Now to start playing with some of the build tools

Sunday Sep 24, 2006

EuroOSCON 2006

EuroOSCON 06

This week (18th - 21st September) I've had the opportunity to attend and be involved in EuroOSCON06. This was primarily to increase my understanding of opensource, but also to promote OpenSolaris. This was my first OpenSource conference, so first I'll make a few general observations before moving on to details about the sessions.

EuroOSCON06 was in Brussels this year, a city I'd not visited before. It's a surprisingly small city for the self-proclaimed capital of Europe. There are some very beautiful parts as well as some rather seedy parts, and being small the seedy and wonderful nestle uncomfortably together.

There were some other Sun employees there (Martin Man, Peter Dennis, Patrick Finch, Darren Kenney and Gary Pennington). I'd met a few of them before, but we're from very different backgrounds so we had different reasons for wanting to attend EuroOSCON and promote OpenSolaris.

So I took the EuroStar from London (I was booking the trip just as the security scare happened last month so thought this would be easiest). Met with Peter Dennis on the train and worked through some demos we could show to people. We had a BOF and a Booth on Wednesday so thought we'd try and show some cool stuff.

The Demos

We decided we'd show how easy it was to set up and build OpenSolaris. I had a media kit with me on the train and by the time I was in Brussels had installed a build machine environment on my laptop and was happily building code, cool! We also wanted to show some zfs features, and some zones features. There is a new facility in OpenSolaris to allow you to create a zone on one system (preferably on a zfs file system) and then take a copy of it to create a new zone. If you're using zfs it will snapshot the filesystems rather than copying the data, which makes it rather quick. This is done with zoneadm clone -s

You can then detach that zone from your current system (using zoneadm detach), and as the zpool was on an external disk we moved the USB disk to another laptop, imported the pool (zpool import ) and attached the zone to the new laptop (zoneadm -z attach -F -n ). It then just works as it did on the old system. I was amazed and can see how useful this is going to be.
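The move between laptops can be sketched as two small functions, one per machine. This is only an outline under assumptions: the zone, pool and zonepath names are placeholders you supply, the zpool export and the zonecfg create -a step are not mentioned above and are assumed here, and the commands are only printed unless RUN is set to empty.

```sh
#!/bin/sh
# Sketch only: move a zone to another machine via a pool on an external disk.
# RUN defaults to "echo" (print only); set RUN= to execute the commands.

# On the old machine: the zone must be halted before it can be detached.
zone_move_source() {
    zone=$1; pool=$2
    run=${RUN-echo}
    $run zoneadm -z "$zone" detach &&
    $run zpool export "$pool"
}

# On the new machine: import the pool, recreate the zone configuration from
# the detached zonepath (assumed step), then attach it.
zone_move_target() {
    zone=$1; pool=$2; zonepath=$3
    run=${RUN-echo}
    $run zpool import "$pool" &&
    $run zonecfg -z "$zone" create -a "$zonepath" &&
    $run zoneadm -z "$zone" attach -F
}
```

For example, zone_move_source demo usbpool on the first laptop and zone_move_target demo usbpool /usbpool/zones/demo on the second, where all three names are hypothetical.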

Observations

Some things struck me. First - everyone uses a Mac there. This surprised me as although it is based on an opensource OS, it is far from an OpenSource product. Second - there are a lot of people on the conference gravy train; they obviously go to a lot of these. That's fine, but it did distort the audience a bit. Finally - licensing, both of projects and of data, is still a big worry for this community.

It was well worth going this year to promote OpenSolaris. People were queuing up at the booth to talk to us, so on to some more detail.

The Booth and the BOF

Wednesday had us setting up the booth for OpenSolaris and manning it. We had intended to take turns, but we were so busy that we were all talking to various interested people pretty much all morning. We had some "OpenSolaris Starter Kits" containing the install media, the source and the compilers, along with a couple of livedvd images. They went like hot cakes; everyone seemed excited by OpenSolaris. I even persuaded a guy from Google to take a look. The afternoon was a little quieter so we could go off to a few sessions, but there was still a steady stream of interested people.

I gave more demos than I can remember - people were excited about how easy it was to use ZFS and how simple building opensolaris was. Dtrace is still a big thing, and now it's in macOS the community can see an example of why they should be involved with OpenSolaris.

The BOF was at the same time as one from Ubuntu's Mark Shuttleworth, so we weren't expecting many people. We had a few, though not as many as I would have hoped. We talked through SMF, Zones, dtrace and ZFS among other things, and the people seemed interested. The funny thing was that Mark Shuttleworth failed to turn up and the Ubuntu BOF got rather heated.

The Sessions

KeyNote Speakers: Tim O'Reilly, Tor Norretranders

Tim O'Reilly spoke about data and licensing (as you'll see, licensing seems to be a hotly discussed topic at the conference). It's not something that has ever interested me but it is clearly important. Tim was commenting that if you put something on MySpace but can't take it anywhere else, is it really yours? The same goes for applications that run on a web server rather than on your own system: what is to stop the owner of the web server doing whatever they want? I had the feeling he was blowing his own trumpet, saying he'd pointed this out in 1998 or something but only now are people taking him seriously. He was trying to say (I think) that in the Web 2.0 world, where everyone contributes online, you may not own the rights to the things you put in your blog or on someone else's collaboration site.

Tor gave a very funny speech about the motivations for participating in opensource. He stated it was about

Glow -> The nice feeling we get when interacting with someone else
Show -> Showing off and helping others gets you noticed (or laid as he put it)
Flow -> You are constantly changing.
2.0 -> We're getting back to a bartering type economy.

Certainly I enjoyed it and the first two points are clearly right, the rest felt a little forced, but then it was well made.

Industrial strength Email and Calendar: Florian von Kurnatowski

Without realising it I'd wandered in to the Products and Services track: basically, opensource-friendly companies promoting their products. That said, he didn't push his company Scalix too much. What he observed was that you need a true replacement for Outlook before people will be able to move away from Microsoft. It seems Outlook is very closely tied to all the other Microsoft apps, and if you remove them you lose a lot of functionality (and he said Outlook is 50% of the license fee too).

Until OpenOffice can provide that or has an equivalent, it will not be considered by many people. Also, 90% of admins have never done a migration of mail systems so they're scared of it; there need to be good migration tools.

Final point was that Calendar services do not have any standards which is why Calendar infrastructure is even harder to do than email.

Channeling OpenSourced in Europe: Ranga Tangachari

Back in the OpenSource world I was interested by this session. I'd assumed that this would be about getting the most out of OpenSource in Europe, but instead it was a talk about how his company made money in opensource by encouraging the Channel (resellers).

His assertion was that communities provide innovations and companies provide products (more than just projects: fully tested and supported things). In the middle is the Channel, which adds value through things like localization and training, and by being a pool of deployment experts.

You need to encourage the Channel by giving them what they care about which is

1) Margin
2) Professional Services Opportunities
3) Maintenance (recurring revenue)

Think beyond downloads, which only mean one click; find examples of happy customers.

Big Data and the Open Warehouse: Roger Magoulas

This was a disappointing presentation about what O'Reilly do about data storage and data mining. It was unfortunately simply a rundown of tips, tricks, products and techniques used by the O'Reilly guys in their data centre. There were some interesting things mentioned, though, which I will go and look at.

SecondLife and Opensource: Jim Purbrick

I'm intrigued by SecondLife, it's a game where the whole purpose is to make "Stuff" and "Hang Out" and generally share or sell what you do. I have looked at it and it is cool, but I haven't got my head round Why? yet.

SecondLife is not (yet) opensource, but the guy from LindenLab was explaining that the big difference between SecondLife and other MMORPG games is that the players create the world. LindenLab couldn't have provided enough content to keep people interested, but because it is created by the game community they reckon they get ~6500 man years of content development per year! (Not even EA could manage that for one game, I think.) All of this is of course up for sale or copy, depending on the desires of the community member.

There are interesting aspects to the way LindenLab have architected their set up, like each new plot of land requiring a new server, so they're adding new servers at a huge rate.

Over all the crowd were excited by SecondLife. I didn't see them as engaged in any other session

Afternoon Keynotes: Steve Coast, Adrian Holovaty

Steve Coast: Open Data: An interesting view of "good enough" data. He uses GPS units carried by volunteers as they go about their daily business to build up a plan of cities, making it available via OpenStreetMap. Most map data in the world is either government owned (like the Ordnance Survey) or not of great quality. Creating good enough Open Data which can be shared will be enough for some, and will consequently bring down the price of Closed Data.

Adrian Holovaty: Journalism through Programming:

Fascinating and slightly scary view on providing access to the raw data used by journalists via web applications. He works for the Washington Post and is involved in a few projects, one of which is Faces of the Fallen. Another one he quoted, TheyWorkForYou, lets you look at your MP's voting record. Now I can see the point of this, and will certainly be checking on what my MP is doing, but I am slightly worried that this requires balance. We're asking the public to draw conclusions from only a small amount of the data (as the records published online are incomplete), whereas journalism is all about weighing up all the information and providing a balanced summary for consumption.

Open Usability: Jan Muehlig

A talk to encourage open source projects to include usability engineering in the project, to make the user experience of open source products as good as or better than closed source, something we have often complained about. His group OpenUsability is working to promote this, and it seems to boil down to publishing best practices, which no one does yet.

Making It Work: Louis Suarez-Potts

A talk about how to build successful OpenSource projects. He comes from OpenOffice, which is both a successful project and a really useful product, so he should know what he's doing.

He talked about the two different approaches: the organic, where a few friends start up with a common goal, and the sponsored, where a corporate entity is driving towards a set of goals. Either way you need to do the following

Pick the right license
Have a neutral environment (ie safe to contribute)
Have transparent governance and processes
Make decisions in public
Have clear decision paths
Use good communications tools (everyone likes IM these days)
Have immediate gratification (easy and fast contribution)
Market your project
Have the right Product

I found this quite encouraging as I felt OpenSolaris has it about right

OpenSource and Freedom: Why Open Standards are crucial to protecting your linux investment: Jim Zemlin

This talk was aimed at promoting the LSB(Linux Standards Base To make sure applications will run on the largest number of Distros. LSB dictates the minimum number of components available within the Distro so your application can rely on them. This is to encourage growth over Microsoft. He quoted what happend in the Unix world when the standards fragmented and he is absolutely right

KeyNote: Florian Muller and Roml Lefkowitz

Florian Muller spoke about lobbying in the European Parliament to limit the changes to patent law, which some companies are trying to tighten up to protect their IP, while Open Source advocates are trying to go the other way. I was left slightly disconcerted that someone with such a one-sided view was having an effect on our laws.

Roml Lefkowitz spoke about the need to internationalize and localize the source code and languages used in Open Source projects. Nice pie-in-the-sky thinking, but it misses the point that the source should not be the documentation; we need documentation before we can worry about such things.

Xgl and Compiz - New X11 features and the OpenGL Accelerated Desktop: Matthias Hopf

Fascinating talk about the future of the desktop from SUSE. At last, a talk with lots of technical details and a neat demo at the end, showing the desktop mapped onto a 3D cube running two movies and Quake 3 at the same time on different faces of the cube. All of this should soon also be possible in Solaris and I think it's vital we do it.

The End

If you got this far then well done :) It was a lot to read.

Technorati Tags: Solaris OpenSolaris EuroOSCON EuroOSCON06

Tuesday Sep 12, 2006

Building Opensolaris

This weekend I did something I've been meaning to do for a while. I've been putting it off due to lack of time to think about how to approach it without breaking anything.

Anyway, I finally tried downloading and installing Solaris Express Community Edition (build 46) and downloaded all the build tools. I was amazed how easy it was. Within an hour of finishing the downloads, it was building OpenSolaris.

The main shock for me was the lack of an SCM (Source Code Management) system. Being fully entrenched in the world of ON (OS and Networking) for the last 10 years, Teamware (the SCM we use) is just *what we do*. So I had to rethink how I'll manage the build. But then using opensolaris.sh from usr/src/tools/env did a good job.

So I now have my own build, built from OpenSolaris, running on my system without anything from within Sun. Cool. Give it a go: it was easy and gives you the chance to play with how things work.

Next step for me - build the Xen bits from outside Sun :)


==== Nightly distributed build started: Mon Sep 11 11:27:18 BST 2006 ====
==== Nightly distributed build completed: Tue Sep 12 03:22:02 BST 2006 ====

==== Total build time ====

real 15:54:44

==== Nightly argument issues ====

Warning: the N option (do not run protocmp) is set; it probably shouldn't be

==== Build environment ====

/usr/bin/uname
SunOS osol-bld 5.11 snv_46 i86pc i386 i86pc

/opt/onbld/bin/nightly myopensolaris.sh
nightly.sh version 1.104 2006/08/29

/opt/SUNWspro/bin/dmake
dmake: Sun Distributed Make 7.7 2005/10/13
number of concurrent jobs = 4

32-bit compiler
/opt/onbld/bin/i386/cw -_cc
cw version 1.20
primary: /opt/SUNWspro/bin/cc
cc: Sun C 5.8 Patch 121016-02 2006/03/31
shadow: /usr/sfw/bin/gcc
gcc (GCC) 3.4.3 (csl-sol210-3_4-20050802)

64-bit compiler
/opt/onbld/bin/i386/cw -_cc
cw version 1.20
primary: /opt/SUNWspro/bin/cc
cc: Sun C 5.8 Patch 121016-02 2006/03/31
shadow: /usr/sfw/bin/gcc
gcc (GCC) 3.4.3 (csl-sol210-3_4-20050802)

/usr/java/bin/javac
java full version "1.5.0_08-b03"

/usr/ccs/bin/as
as: Sun Compiler Common 10 snv_46 08/03/2006

/usr/ccs/bin/ld
ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.545

Build project: group.staff
Build taskid: 62

==== Build version ====

ws.opensolaris

==== Make clobber ERRORS ====

==== Make tools clobber ERRORS ====

==== Tools build errors ====

==== SCCS Noise (DEBUG) ====

==== Build errors (DEBUG) ====

==== Build warnings (DEBUG) ====

==== Elapsed build time (DEBUG) ====

real 11:42:57.3
user 4:41:00.3
sys 2:40:32.5




Technorati Tags: Solaris OpenSolaris Xen

Monday Jul 17, 2006

Xen for dummies - part1


I've been helping out the Xen team to try out the bits they've put on OpenSolaris over the last week or two. I've been impressed with how much they've got working so far, but as an experienced Solaris user who is new to Xen I've found it quite hard to get my head around what Xen does and how it works.

First off install it using the instructions on OpenSolaris and then you need to get some domains set up.

So boot up under Xen. You'll see from the grub menu (/boot/grub/menu.lst) that some things have changed. Instead of booting a Solaris kernel, you boot Xen, which then loads a Solaris kernel from the module line.

#Solaris on Xen 64bit
title Solaris on Xen 64-bit
kernel /boot/amd64/xen.gz dom0_mem=524288 console=com1 com1=9600,8n1
module /platform/i86xen/kernel/amd64/unix /platform/i86xen/kernel/amd64/unix -k
module /platform/i86pc/boot_archive


Note the option dom0_mem=524288. This assigns 512MB to your Dom0 at startup.
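As a quick check on that figure: 524288 matches 512MB if the number is read as kilobytes. That reading is an assumption made here from the arithmetic, and the helper name dom0_mem_kb is made up purely for illustration. A minimal sketch:

```python
KB_PER_MB = 1024  # assumption: dom0_mem without a suffix is counted in kilobytes


def dom0_mem_kb(megabytes):
    """Return the dom0_mem value (in KB) for a Dom0 of the given size in MB."""
    return megabytes * KB_PER_MB


print(dom0_mem_kb(512))  # 524288
```

So for a 1GB Dom0 under the same assumption you would expect dom0_mem=1048576.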

Another thing to note is that even when booting on metal (ie not booting the Xen hypervisor first) we no longer use multiboot, but can boot our unix directly.

#---------- ADDED BY BOOTADM - DO NOT EDIT ----------
title Solaris Nevada snv_41 X86
kernel /platform/i86pc/kernel/amd64/unix
module /platform/i86pc/boot_archive
#---------------------END BOOTADM--------------------


Read Joe Bonasera's blog about why this is the case.

OK, so you've got Solaris booted on Xen. This is referred to as Dom0. You now need to set up some more instances of Solaris (or whatever Xen-compatible OS you happen to want), and each of those is referred to as a DomU.

Each DomU has its own full OS install, so for Solaris we set up a flar archive and use the vbdcfg script to help us convert that into an OS instance Xen can boot as a DomU. This is all described here.

A couple of things to think about. Networking is bridged (ie it appears to be directly connected to the outside world), so you're going to need to give it a real IP address or use DHCP (this can be decided as the DomU boots up). Also you need to give it an ethernet address using the -e flag to vbdcfg. As this is a made-up ethernet address I don't know how you're supposed to create it; Dave Edmondson suggested making the first octet 0xaa and encoding the IP address in the rest of it.
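The ethernet address suggestion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an IPv4 address fills four of the five remaining octets, so the sketch pads with a zero octet straight after 0xaa (the padding position and value, and the function name mac_from_ip, are invented here and are not part of the original suggestion). One thing that is certain is that 0xaa has the "locally administered" bit set and the multicast bit clear, which is what you want for a made-up unicast address.

```python
import ipaddress


def mac_from_ip(ip, first_octet=0xAA, pad_octet=0x00):
    """Build a made-up MAC: first_octet, a padding octet, then the 4 IPv4 octets."""
    ip_octets = ipaddress.IPv4Address(ip).packed  # 4 bytes, network order
    octets = bytes([first_octet, pad_octet]) + ip_octets
    return ":".join(f"{octet:02x}" for octet in octets)


print(mac_from_ip("192.168.1.10"))  # aa:00:c0:a8:01:0a
```

Because the IP address is embedded, two DomUs with different addresses cannot end up with the same made-up MAC, which is presumably the point of the scheme.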

So when you've got your DomU setup using vbdcfg what do you get?

Well in /export/xc/xvm you'll have a directory for your DomU domain

$ ls -l /export/xc/xvm/mydomU
total 10
-rw-r--r-- 1 root root 379 Jul 12 09:57 mydomU-64.py
-rw-r--r-- 1 root root 367 Jul 12 09:57 mydomU.py
drwxr-xr-x 4 root root 512 Jul 12 09:48 platform
-rw-r--r-- 1 root root 19 Jul 12 09:48 root.dev
drwxr-xr-x 2 root root 512 Jul 12 09:48 vmnt


The .py files are Python scripts used to start the domain, platform is the directory where the kernel to boot is, root.dev contains the name of the root device, and vmnt is where the domain can be mounted using

$ vbdcfg mountdomU domU-name


But you don't want to do that now. You want to start up your domain, so this is how you do it:

$ xm create mydomU-64.py
(this starts mydomU in 64bit mode)
$ xm console mydomU
(puts you on the console)


or alternatively

$ xm create -c mydomU-64.py


which starts the domU and puts you directly on the console - useful sometimes to find out why your domain isn't coming up.

So at that point you are on the console of your domain and it's coming up just as if it's a fresh install of Solaris.

In future "Xen for Dummies" installments I'll show how to configure the network, add disks, create more cpus in the domU than you have in the real box! and I'm sure I can think of more later.

Xen

Technorati Tags: Solaris OpenSolaris Xen

Thursday Jul 06, 2006

Another Dtrace Customer presentation

I had another opportunity to talk to a customer (this time a law enforcement agency) about Dtrace last week. This time it was only a half-hour slot and we had already overrun by quite a way, so the Presentation was a lot shorter. You'll notice the similarity with the previous presentation (well, there is no point in reinventing the wheel).

In this case the customer was very interested in virtualization. They have a large farm of PCs which are only ~15% utilized. They felt something like Xen or VMware might be able to help bring that up. I'm working with the Xen team right now, just getting up to speed on the technology. It looks pretty impressive. Go and check out the Xen OpenSolaris community; it's pretty interesting now and I'm sure it'll have a load more interesting stuff on it soon.

Technorati Tags: Solaris OpenSolaris dtrace Xen

Wednesday Jun 14, 2006

dtrace presentation

I had the pleasant opportunity of visiting an existing Sun customer last week who was interested in dtrace and containers. I've been using dtrace since before it was in Solaris 10, using the bits the development team were working on. In my line of work it's a great boon, and I think the customer could see the benefits of the tool and appreciated the insights of someone who uses it day in, day out.

Obviously in a presentation of an hour or so you can't cover anything in great detail, but I wanted to whet their appetites as to the things dtrace could do for them. So I walked them through the reasons why it's a good solution, and the architecture. Then a few examples and demos. Then we talked about how it could solve some of the problems they've experienced. Overall a very good experience, and I believe they found it useful.

Anyway, I thought I'd put up the presentation here to remind me that I did it, and so someone else might find it useful, especially now there is such a community on opensolaris.org.

Presentation

Technorati Tags: Solaris OpenSolaris dtrace

Tuesday May 30, 2006

What does ISSIG(FORREAL) do.

After my last blog I thought I could expand a bit on what ISSIG(FORREAL) does. That is:

o stop for some other thread to do fork1(); return 0 when it's done.
o stop because a job control stopping signal is present; return 0 after the process is continued with SIGCONT.
o stop because /proc requested the thread or process to stop; return 0 when /proc continues the thread/process.
o stop for watchpoint manipulation; return 0 when that's done.
o stop because of a non-blocked pending signal that /proc cares about; return 0 if /proc cancels the signal when continuing the thread.
o discard pending but ignored signals; return 0 when that's done.
o Otherwise return 1 if there is still a signal or 0 if not.

If ISSIG(FORREAL) returns non-zero, then there really is a non-ignored signal present, so you really must take appropriate action (usually returning back to userland with EINTR).

Technorati Tags: Solaris

Friday May 26, 2006

ISSIG() How do I use it?

As indicated in this blog entry, a fix I did for a bug showed up some interesting and rather hard to diagnose problems. These appeared as EACCES (permission denied) errors when trying to cd into or read an automounted directory.

Using truss and dtrace it was possible to see that the program getting EACCES was doing an open on the directory, which ended up in auto_wait4mount(). It is this call that returns EACCES. At around the same time the automountd process gets an interrupted system call (through a variety of calling paths, but usually from nfs4_mount(), eventually calling nfs4secinfo_otw() and then nfs4_rfscall()), which returns EINTR, indicating it's getting something like a signal. "Something like a signal" includes watchpoint activity and fork1 requests. In this case it is a fork1 request, so the thread requesting the stop is in the same process, and ISSIG(JUSTLOOKING) returns true even if lwp_nostop is set (this was the fix for 4522909).

If we look in nfs4_rfscall() at the following section:
        /*
         * If there is a current signal, then don't bother
         * even trying to send out the request because we
         * won't be able to block waiting for the response.
         * Simply assume RPC_INTR and get on with it.
         */
        if (ttolwp(curthread) != NULL && ISSIG(curthread, JUSTLOOKING))
                status = RPC_INTR;
        else {
                status = CLNT_CALL(client, which, xdrargs, argsp,
                    xdrres, resp, wait);
        }
Here we look to see if there is a signal pending, using ISSIG(curthread, JUSTLOOKING) to optimise out a CLNT_CALL() if it's not needed. If so we return RPC_INTR (a little further down). The assumption is that if you have any signal-like activity you need to return to userland to handle the signal. This is not the case for fork1(): you can simply wait till you start running again and carry on.

ISSIG(t, FORREAL) could be used to check if there is a real need to return to userland. The trouble is you need to drop all your locks before calling it, so you then have to reacquire the locks later. This may require you to restart the rfscall operation. Also, if you do a forkall() (ie a normal fork() system call) you do need to return to userland with EINTR; the same goes for some /proc activity. So it's probably worth checking for that prior to calling issig(FORREAL). A good example of how to do this is in cv_wait_sig().

So an example of how to correctly check for signal delivery in a system call (say, if you are going to do something that takes a long time and don't want to waste that activity if it's going to be interrupted) would be:
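        /* Assumed context: t = curthread, lwp = ttolwp(t), p = ttoproc(t). */
        if (lwp != NULL &&
            (ISSIG(t, JUSTLOOKING) || MUSTRETURN(p, t))) {
                /* ... drop all of your locks! ... */
                if (ISSIG(t, FORREAL) ||
                    lwp->lwp_sysabort ||
                    MUSTRETURN(p, t)) {
                        /* A real signal (or forkall/proc request): go back to userland. */
                        lwp->lwp_sysabort = 0;
                        return (set_errno(EINTR));
                }
                /* Only signal-like activity (eg fork1): the call can be restarted. */
                return (set_errno(ERESTART));
        }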
I'll be applying this approach to nfs shortly.

Technorati Tags: Solaris

Wednesday May 24, 2006

Why have I got two automountd processes?

I've been working for ages on how to resolve an issue reported as automountd hangs when using executable automount maps. This is logged as bug 4522909.

The problem is that when the automountd attempts to do a mount, it triggers a lookup on the mountpoint. This is done by another thread in the automountd. While we're waiting for that to complete we call auto_wait4mount, which in turn blocks all signals by calling sigintr(). This also makes the thread unstoppable by incrementing lwp_nostop. As this is an executable map, the "other thread" has to fork1() in order to run the map. This in turn tries to STOP all threads in the process to get them into a known state before forking the one thread we care about. As the mount thread is unstoppable, this never completes.

The fix was to allow a thread to be stopped, even if lwp_nostop is set, if the thread stopping it is in the same process. However this has shown up a couple of mistaken assumptions in NFS land, which mean that more work is needed there to allow an RFS call to be restarted (6306343). This is obviously a bit of a pain, as the changes required to fix 6306343 may be quite large and too risky for a patch, so an alternative approach is needed.

After a good amount of discussion we concluded the lowest risk solution was to create a door server for the automountd process to talk to. This door server would handle all requests to fork the process to do an exec. Obviously, as the lookup is handled in the main automountd process, the deadlock is avoided.

Now if you read the door_call man page it talks extensively about attaching to the door_server file descriptor, as there is an assumption that the fd is somewhere in the file name space. If it is, then anyone with sufficient privilege can write to it and you can end up with all sorts of rubbish written in. However, if you fork(), a child inherits the fds of the parent, so the simple solution is to have the automountd process set up the door_server() itself before it becomes multithreaded and then create a child to behave like the old automountd did, but with calls to the door_server to get the fork()/exec() stuff to work. Hence you get two automountd processes.

Unfortunately this is all in the patch releases, as it isn't needed for OpenSolaris, so you'll never see my excellent code, but I still thought it was worth writing it up. I'll fill in the patch versions as they come out. So far we have:

Solaris 8
108994-56 SunOS 5.8_x86: LDAP2 client, libc, libthread and libnsl libraries patch
108993-56 SunOS 5.8: LDAP2 client, libc, libthread and libnsl libraries patch

Solaris 9
117468-12 SunOS 5.9_x86: nfs patch
113318-26 SunOS 5.9: nfs patch

Solaris 10
T118833-18 SunOS 5.10: Kernel Update
T118855-15 SunOS 5.10_x86: Kernel Update
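The two-process idea described above can be illustrated with a small Python sketch. It is emphatically not the automountd code: doors are Solaris-specific, so a socketpair stands in for the door, and every name here (start_exec_helper, serve_exec_requests, run_via_helper, the JSON line protocol) is invented for illustration. The roles are also swapped to keep it short: here the forked child is the single-threaded exec helper, whereas in the description above the parent keeps the door server and the child carries on as the daemon. What it does show is the key trick: the channel is created before fork(), inherited across it, and never appears in the file name space.

```python
import json
import os
import socket
import subprocess


def serve_exec_requests(sock):
    """Helper loop: read one JSON argv per line, run it, reply with one JSON line."""
    reader = sock.makefile("r")
    writer = sock.makefile("w")
    for line in reader:
        argv = json.loads(line)
        done = subprocess.run(argv, capture_output=True, text=True)
        reply = {"status": done.returncode, "stdout": done.stdout}
        writer.write(json.dumps(reply) + "\n")
        writer.flush()


def start_exec_helper():
    """Fork the helper while this process is still single-threaded."""
    main_end, helper_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: it inherited helper_end across fork() and never returns to the caller.
        main_end.close()
        try:
            serve_exec_requests(helper_end)
        finally:
            os._exit(0)
    helper_end.close()
    return main_end, pid


def run_via_helper(main_end, argv):
    """Ask the helper to fork/exec argv so the calling process never forks itself."""
    main_end.sendall((json.dumps(argv) + "\n").encode())
    reply = b""
    while not reply.endswith(b"\n"):
        chunk = main_end.recv(4096)
        if not chunk:
            raise ConnectionError("helper went away")
        reply += chunk
    return json.loads(reply)


if __name__ == "__main__":
    channel, helper_pid = start_exec_helper()
    print(run_via_helper(channel, ["echo", "hello"]))
    channel.close()  # EOF on the socket makes the helper loop exit
    os.waitpid(helper_pid, 0)
```

Once the helper exists, the main process can create as many threads as it likes: none of them ever needs to fork, so nothing has to stop an unstoppable thread.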
Technorati Tag:

Tuesday Oct 18, 2005

So what the heck is anonymous memory

As part of some work I've been doing I've had to talk a lot about anonymous memory. So I thought I'd write it down while I remember what it's all about.

Anonymous memory is memory that is managed by segvn but is not really directly associated with a file. It's used for things like process stacks, heap, or COW (copy on write) pages. A good example of this is if you fork a process. All the addresses in the second process actually map back to the same bits of physical memory (the same pages). However, if your child process was then to do something different with the memory (eg the child went off and manipulated an array in memory), the VM subsystem would copy those pages and change the mappings in the child process to point to the new pages. This new memory would be anonymous memory, and the child process would merrily make the changes to the array, unaware it now had new "physical" memory it was talking to.

In a bit more detail, you've probably heard people talking about "anon_maps" and "vnode, offsets". Or at least, if you're interested in VM and have been trawling the code, you probably have. So I'll try and put a simplistic view of what they mean.

Typically when you mmap a file, you give it a vnode, offset and length. Segvn manages mmapped files and stores this in its private data structure. So when you walk through the address space of the process looking for a virtual address, you'll see that the segment contains that address and be able to map it to the vnode, and to where in that file, using the offset.

As mentioned above, anonymous memory is not associated with files, but with perhaps swap (you'd get to use anon memory if you mmapped /dev/zero, BTW). So the purpose of anonymous memory, or the anon layer, is to fake up a vnode and offset for segvn to find the data on a swap device (actually by going through swapfs, which I haven't yet looked at).

The core of the anon layer is the anon_map structure. This is stored in the segvn_data structure and points you to the anon_hdr for this segment. Each anon_hdr is linked to an array of anon structures, which are the swapfs implementation of how to find a vnode and offset on a swap file system (or it might be on a real swap device if you bypass swapfs; I think I'll need another blog entry for swapfs after this). So after some swapfs magic we can find the backing store for this anon_map.

You might have noticed a bit of handwaving in the middle there. What was that about an array of anon structures? Why do we need one of them? Well, each anon structure represents a page of memory. Our segment may be more than one page in size (try pmap $$ and you may well see some anon segments of more than 8K on sparc), so it would need more than one of these anon structures to describe it. So we have this array.
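The copy-on-write behaviour after fork described above is visible from userland. The following is a minimal Python sketch, not anything from the kernel: it maps one page of private anonymous memory (the moral equivalent of mmapping /dev/zero), forks, and lets the child scribble on it. The function name cow_demo and the pipe used to report the child's view are just scaffolding for the demonstration.

```python
import mmap
import os


def cow_demo():
    """Show that a child's write to private anonymous memory is invisible to the parent."""
    # fileno -1 asks for an anonymous mapping; MAP_PRIVATE makes it copy-on-write.
    page = mmap.mmap(-1, mmap.PAGESIZE, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    page[:6] = b"parent"

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write triggers the copy, so only the child's page changes.
        os.close(read_fd)
        page[:6] = b"child!"
        os.write(write_fd, page[:6])
        os._exit(0)

    os.close(write_fd)
    child_view = os.read(read_fd, 6)
    os.close(read_fd)
    os.waitpid(pid, 0)

    parent_view = page[:6]  # still the original contents
    page.close()
    return parent_view, child_view


if __name__ == "__main__":
    print(cow_demo())  # (b'parent', b'child!')
```

Both processes use the same virtual address throughout; only the child's mapping was quietly redirected to a fresh page when it wrote.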
Technorati Tag:
About

Chris W Beal
