Tuesday Jun 16, 2009

Back to Parallel Patching for Solaris 10

In my previous entry, Parallel Patching in Solaris 10, I mentioned that the patches for this would be released before the end of June. These should be available on SunSolve from tomorrow (June 17th); the feature is contained in the latest Solaris 10 patch utilities patch, 119254-66 (SPARC) and 119255-66 (x86).

This is available for use on all Solaris 10 systems. 

Simply install this patch, set the maximum number of non-global zones to be patched in parallel in the config file /etc/patch/pdo.conf, and away you go.
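
For reference, the file is a simple settings file; here is an illustrative sketch only (the value 8 is just an example, and the shipped file carries its own explanatory comments, so read those before changing it):

# /etc/patch/pdo.conf -- illustrative sketch only; see the comments
# delivered in the file itself before editing.
# Patch up to 8 non-global zones in parallel (patchadd caps this at
# 1.5 times the number of on-line CPUs).
num_proc=8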

Prior to this feature, each non-global zone was patched sequentially, leading to unnecessarily long patching times for zones systems.

With this feature invoked, the global zone continues to be patched first, but then the non-global zones can be patched in parallel, leading to significant performance gains in patching operations on Zones systems.

While the performance gain depends on a number of factors, including the number of non-global zones, the number of on-line CPUs, the speed of the system, the I/O configuration of the system, etc., a performance gain of ca. 300% can typically be expected for patching the non-global zones, e.g. on a T2000 with 5 sparse root non-global zones.

Here's the relevant note from the patch README file:

NOTE 10: 119255-66 is the first revision of the patch utilities to deliver "zones parallel patching". 

         This new functionality allows multiple non-global zones to be patched in parallel by patchadd.   Prior to revision 66, patchadd would patch all applicable non-global zones sequentially, that is one after another. With zones parallel patching, a sysadmin can now set the number of zones to patch in parallel in a new configuration file for patchadd called /etc/patch/pdo.conf.

         The two factors that affect the number of non-global zones that can be patched in parallel are

         1. Number of on-line CPUs
         2. The value of num_proc in /etc/patch/pdo.conf

          If the value of num_proc is less than or equal to 1.5 times the number of on-line CPUs, then patchadd limits the maximum number of non-global zones that will be patched in parallel to num_proc. If the value of num_proc is greater than 1.5 times the number of on-line CPUs, then patchadd limits the maximum number of non-global zones that will be patched in parallel to 1.5 times the number of on-line CPUs.  Note that patchadd will patch all applicable non-global zones on a system; the above description outlines only how patchadd determines the maximum number of job slots to be used during parallel patching of non-global zones.

          An example of this in operation would be where:
          num_proc=10
          number of non-global zones=5
          and the number of on-line CPUs is 32 (assume a T2000 here)

          In this case the upper limit on parallel patching would be 48 (1.5 x 32), but as only 5 non-global zones are installed, only 5 will be patched in parallel.

          Please see comments in /etc/patch/pdo.conf for more details on setting num_proc.

Bigger than 1TB spindles

Seems a strange title for a blog entry, but I'll explain. As many of you are aware, the current Solaris 10 installer is getting somewhat old and painful to use; in fact it is really beyond its useful life.

This is one of the many reasons why in OpenSolaris you have not just IPS but a new installer as well, far more akin to what you see on other operating systems today.

As I discussed in my blog entry entitled Parallel Patching for Solaris 10, the work we do in this space is highly targeted because it is high risk.

Solaris 10 Update 8, when it ships later this year, will have some more enhancements in this space, as I've already mentioned. We recently made some more changes in this area, specifically to allow Solaris to work with boot disks larger than 1TB.

Now this may not sound like such a big deal, but in the context of when the Solaris installer was designed, its significance should not be underestimated.

So what have we done and why?

The current disk labeling scheme in Solaris (a VTOC inside fdisk on x86) breaks down past 1TB for bootable disks. With Solaris 10 Update 8, not only have the OS-level changes been made to allow x86 and SPARC based systems to boot from disks of up to 2TB, but installer changes have also been made to allow the Solaris 10 installer to handle these disks.
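
If you are curious which labeling scheme a given disk is carrying today, prtvtoc and format give a quick view. The snippet below is purely illustrative and the device name is an example, not anything from my setup:

# Illustrative check of a disk's current label (device name is an example):
prtvtoc /dev/rdsk/c1t0d0s2
# format -e lets you relabel a disk interactively, offering a choice
# between an SMI (VTOC) label and an EFI label:
format -e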



Monday Jun 15, 2009

Configuring an auto install client / server setup using VirtualBox

During one of my presentations @ CommunityOne I demonstrated on my laptop an auto install setup in VirtualBox. I had an auto install server running OpenSolaris 2009.06 and used that to "jumpstart" (to use a Solaris 10 familiar term) a client.

I was asked to write this up and committed to doing so asap after CommunityOne, so here goes...

1. Install 2009.06 in VirtualBox

2. Create a 2nd network adapter for the virtual machine: under Settings, select Network, then Adapter 2, enable it, and where it says "Attached to" select "Internal Network" (you'll need to shut down the virtual machine to do this if you didn't do it before installing).

3. Inside the virtual machine install the autoinstaller package:

pkg install SUNWinstalladm-tools

NOTE:

You need to have root privileges to do everything from step 3 onwards; the easiest way is to run pfexec bash

4. Now configure the network connections on the virtual machine you just installed

Adapter 1 is e1000g0 and will be dhcp by default; leave that alone

Adapter 2 is e1000g1 and will need to be configured, specifically:

Netmask: 255.255.255.0
IP:      192.168.2.50
Broadcast: 192.168.2.255

So first off edit /etc/hosts, adding in:

192.168.2.50    name_of_the_machine
192.168.2.60    aiclient0
192.168.2.61    aiclient1
192.168.2.62    aiclient2
192.168.2.63    aiclient3

Add a line entry aiclient<number> for every machine you'd like to auto install. You can pick whatever name you want here for each one; I just used aiclient to make it easy to understand.

name_of_the_machine is the name of the machine you installed in step 1; by default this is "opensolaris".

Also in /etc/hosts you have the line:

127.0.0.1 opensolaris opensolaris.local localhost loghost

Delete the first opensolaris entry so the line looks like:

127.0.0.1 opensolaris.local localhost loghost

Now edit /etc/hostname.e1000g1; assuming you used 192.168.2.50 as the internal network address of your guest earlier, enter that address into this file.
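
The file ends up containing just that one address on a line by itself:

192.168.2.50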

Now edit /etc/netmasks, adding in the line:

192.168.2.0    255.255.255.0

Now check and modify the status of your network/physical smf services:

guest@opensolaris:~# svcs -a | grep network/physical
disabled       12:36:58 svc:/network/physical:default
online         12:37:02 svc:/network/physical:nwam
guest@opensolaris:~# svcadm disable /network/physical:nwam
guest@opensolaris:~# svcadm enable /network/physical:default
guest@opensolaris:~# svcs -a | grep network/physical
disabled       13:20:57 svc:/network/physical:nwam
online         13:21:20 svc:/network/physical:default

Configure the e1000g1 interface:

ifconfig e1000g1 inet 192.168.2.50 netmask 255.255.255.0 broadcast 192.168.2.255
ifconfig e1000g1 up

Configure the e1000g0 interface:

ifconfig e1000g0 dhcp
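
At this point a quick ifconfig -a is a useful sanity check (an extra verification step, not part of the original recipe): e1000g0 should show its DHCP lease and e1000g1 should be up on 192.168.2.50.

# Verify both interfaces: e1000g0 with its DHCP-assigned address,
# e1000g1 up on 192.168.2.50:
ifconfig -a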

5. Start the auto install service by running the command:

installadm create-service -n 0906x86 -i 192.168.2.60 -c 4 -s /images/osol-0906-111b2-ai-x86.iso /export/aiserver/osol-0906-ai-x86

NOTE:

This assumes you've put the iso image in /images and named it osol-0906-111b2-ai-x86.iso
The -c option is the number of clients you've configured in /etc/hosts; in this example it is 4
The -i option is the address of the first client you've configured

You should see:

Setting up the target image at /export/aiserver/osol-0906-ai-x86 ...
Registering the service 0906x86._OSInstall._tcp.local
Creating DHCP Server
Created DHCP configuration file.
Created dhcptab.
Added "Locale" macro to dhcptab.
Added server macro to dhcptab - opensolaris.
DHCP server started.
Unable to determine the proper default router
or gateway for the 192.168.2.0 subnet. The default
router or gateway for this subnet will need to
be provided later using the following command:
   /usr/sbin/dhtadm -M -m 192.168.2.0 -e  Router=<address> -g
Added network macro to dhcptab - 192.168.2.0.
Created network table.
adding tftp to /etc/inetd.conf
Converting /etc/inetd.conf
copying boot file to /tftpboot/pxegrub.I86PC.OpenSolaris-1
Service discovery fallback mechanism set up

Verify the service creation has worked:

Start up Firefox and go to http://localhost:5555; it should show an 'Index of /' page

6. Now set up dhcpmgr and IPv4

run dhcpmgr

Select Macros and double click on dhcp_macro_0906x86 to bring up the macro window, then add the following options to it:

Router with the value of the ip address of this machine (192.168.2.50)

DNSserv with the value of the IP address in /etc/resolv.conf (or one of them if more than one is listed)

NOTE:

DNSserv will need to be *changed* to the current value in /etc/resolv.conf when you move the system around
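
If you prefer the command line to the dhcpmgr GUI, the same values can be added with dhtadm, following the form the create-service output suggests above. This is just a sketch: the macro name comes from this example setup, and the DNS address is a placeholder for whatever is in your /etc/resolv.conf.

# Command-line alternative to the dhcpmgr steps (sketch only; macro name
# and addresses are from this example setup):
dhtadm -M -m dhcp_macro_0906x86 -e Router=192.168.2.50 -g
dhtadm -M -m dhcp_macro_0906x86 -e DNSserv=<address_from_resolv.conf> -g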

7. Fix up the server to allow for IPv4 forwarding:

routeadm -e ipv4-forwarding
routeadm -u
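
Running routeadm on its own afterwards is a quick way to confirm the change took effect; it prints the current and persistent state of each option (just a verification step, not part of the original recipe).

# Confirm ipv4-forwarding now shows as enabled:
routeadm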

8. Create your client machine in VirtualBox

* Give it a hard disk of 16G or more
* Set the boot order to be network first (Settings->General->Advanced)
* Set the network to be internal (in the same way as you set up the e1000g1 interface earlier, using Settings->General->Network)

9. Start up the client

You'll see the PXE boot, a DHCP address being acquired, then the GRUB menu for 2009.06 with a single entry, and then the screen shown in the screenshot below:



10. When it is installed

You'll get a successful completion message, and if you look at the /tmp/install_log file the end will look like this:



To use your newly installed image, shut it down, then go and change your network boot priority or deselect it completely (Settings->General->Advanced).

Once you have done that you can boot your newly installed machine.

The install log can be found in /var/sadm/system/logs/install_log on the installed client machine.

Note: Thx goes to Pete Dennis on my team for working with me on this.

Sunday May 31, 2009

CommunityOne, distro constructor and laptop migration

I'm on the road again. Actually, this is my second trip since the one I wrote about from China a few weeks ago; since then I've done a week in California and a couple of weeks @ home, and I'm now back on the VS19, this time for CommunityOne and JavaOne and the launch of OpenSolaris 2009.06 on Monday at CommunityOne.


I've spent the last couple of weeks writing slides (off and on) and setting up a laptop for a demo at CommunityOne that I'm planning to give on Tuesday. The talk is entitled "Deploying OpenSolaris in your DataCentre", and as part of it I'm planning to demo at least one of the features that makes OpenSolaris so effective in a datacentre environment.

The distro constructor, put simply, allows you to build a custom image (very much like the OpenSolaris livecd) as either an iso or a usb image. Then, with the new automated installer technology in OpenSolaris 2009.06, you can take that iso image and "jumpstart" it (for those of you familiar with earlier versions of Solaris) onto machines across your enterprise, knowing that each one of them is installed absolutely the same.


It's a really easy process, in fact it is so easy an executive can do it... the example below is one I ran on my workstation on Friday.

cove(5.11-64i)$ pfexec pkg install SUNWdistro-const
DOWNLOAD                                    PKGS       FILES     XFER (MB)
Completed                                    1/1       75/75     0.19/0.19

PHASE                                        ACTIONS
Install Phase                                104/104
PHASE                                          ITEMS
Reading Existing Index                           8/8
Indexing Packages                                1/1
cove(5.11-64i)$ cd /usr/share/distro_const/slim_cd
cove(5.11-64i)$ distro_const build ./slim_cd_x86.xml
cove(5.11-64i)$ pfexec distro_const build ./slim_cd_x86.xml
/usr/share/distro_const/DC-manifest.defval.xml validates
/tmp/slim_cd_x86_temp_6104.xml validates
Simple Log: /rpool/dc/logs/simple-log-2009-05-29-14-41-39
Detail Log: /rpool/dc/logs/detail-log-2009-05-29-14-41-39
Build started Fri May 29 14:41:39 2009

Two hours later

==== usb: USB image creation
/dev/rlofi/2:    1723200 sectors in 2872 cylinders of 1 tracks, 600 sectors
  841.4MB in 180 cyl groups (16 c/g, 4.69MB/g, 2240 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 9632, 19232, 28832, 38432, 48032, 57632, 67232, 76832, 86432,
1632032, 1641632, 1651232, 1660832, 1670432, 1680032, 1689632, 1699232,
1708832, 1718432
1435680 blocks
Build completed Fri May 29 17:04:47 2009
Build is successful.

Other stuff to know about...

This will result in two image files:

/rpool/dc/media/OpenSolaris.iso
/rpool/dc/media/OpenSolaris.usb

The manifest is an XML file that describes what to build and comes with two examples:

/usr/share/distro_const/slim_cd/*.xml
/usr/share/distro_const/auto_install/*.xml

Now auto-installing it is something I'm still coming to terms with. I have a recipe, but I ran out of time to set it up before I boarded the plane, so that'll be one to try when I get to San Francisco later today. Assuming all goes well you can come and see me demo autoinstalling from one VirtualBox guest to another on my Toshiba R600 running OpenSolaris 2009.06; if not, it'll be just the distro constructor. Either way, as soon as I crack auto install I'll post a blog about it.

Of course VirtualBox 2.2.4 is also now available, just released over this weekend, which gives me something else to do: upgrade my 2.2.2 image before Tuesday. (Whilst putting this blog into the editing tool and loading up the pictures I let the upgrade run in the background: 16 minutes including a 74MB download, the wonders of IPS.)

Just to top it all, this last week I managed to "break" my R600 :-( Well, I bent the power socket on the laptop so that when I plugged the mains supply in it wobbled, and it was only a matter of time before the connector itself came off the motherboard, so I also had to migrate from one R600 to another.

Laptop migration, particularly when you are running multiple guests, used to be a hugely painful experience; in an OpenSolaris 2009.06 world it has got a lot simpler (OK, with a bit of help from VirtualBox as well :-).

Firstly, just copying your home environment off to something and then back used to be a problem: USB sticks are slow, and Solaris interop with something like a D-Link NAS box also used to be difficult. Well, not anymore. At home I have a number of NAS boxes, primarily for backing up the computers in the house which my family use, but also to serve out DVD iso images, music, video and recorded TV, and using them has now become a whole lot easier.

For those of you with OpenSolaris, go to Places -> Computer -> Network; it'll then show a Windows network (which is how these commodity NAS boxes appear), and you can simply connect to the appropriate volume and, "bingo", you have a store to copy your home environment onto.
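
If you prefer the command line, the same share can be mounted with the CIFS client. A minimal sketch follows; the NAS host, user, share and mount point are example names, not anything from my setup:

# Command-line alternative to the Places -> Computer -> Network route
# (sketch; NAS host, user, share and mount point are example names):
pfexec mkdir -p /mnt/nas
pfexec mount -F smbfs //guest@dlink-nas/backup /mnt/nas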



So just for my home environment (without any virtual machines) I was looking at at least 8G of data; a copy off one laptop to the D-Link box and back to my new laptop took under 20 minutes. I'm sure having a gig network helped, but even so it was really "drag and drop" style simple.

Then on to VirtualBox. 2.2.2 has a great feature, File -> Export Appliance and File -> Import Appliance, which basically allows you to save off a virtual machine and import it somewhere else. The great part is it takes care of all the config file work which used to make this painful, and of course copying the appliances between machines, which effectively backs them up at the same time, is simple as well.
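
The same export and import can be scripted with VBoxManage if you'd rather not click through the GUI. A rough sketch, where the VM name and file path are just examples:

# Scripted equivalent of File -> Export/Import Appliance
# (sketch; VM name and paths are example values):
VBoxManage export "osol-0906-guest" -o /mnt/nas/osol-0906-guest.ovf
VBoxManage import /mnt/nas/osol-0906-guest.ovf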



All in all a painless experience and so simple my kids can do it as well.

Wednesday May 27, 2009

Parallel Patching for Solaris 10

One of the things I said I'd try and write about is the various features you will see coming... and I've kind of touched on this one before in the "What, no tornados?" entry: specifically, Parallel Patching.

This functionality went into build 1 of what will become Solaris 10 Update 8 later this year. So what was the problem statement, and what does the project do?

When zones were introduced in Solaris 10 they stretched the patching of a system to its limits. The original patching system had been designed to patch the global zone and then all non-global zones sequentially. Given all the performance overheads, patching a system with a large number of zones could take 10+ hours to complete, and a full patching window could often stretch to 30+ hours depending on the number of zones installed on a system.

The solution was to remove the sequential nature of patching. Under the "Parallel Patch Project", the global zone is patched first and the non-global zones are then patched in parallel. The degree of parallelization is determined by a new configuration file. The overall performance gains we saw in testing took one run from around 20 hours down to 6.5 hours, with a value of 14 for the number of parallel patch invocations. The value 14 was based on the 14 non-global zones the system had installed.

Now the really good news is that the project team managed to work their magic, and you'll be able to get this functionality in the patchadd patch, which you can apply to an existing Solaris 10 installation. It'll be available in the late June timeframe, allowing you to take advantage of this now as opposed to waiting for the update release to ship towards the end of this year.

Whilst this may not seem like a big deal, the customer impact of the current limitations should not be underestimated. I remember having a discussion with one customer after they had taken a 23-hour outage and the 200+ zones they had were still not patched.

On top of this, changes like this have to go into what we call the "install" consolidation, which contains the installer and the patching tools (to name but two), a lot of which dates back to SVR4 and often earlier. The code itself is fragile and any changes have to be very carefully managed, as the risk of breakage and regressions is high. That is one of the reasons I've got a long list of changes I'd like my team to make to improve the customer experience in this space, and also why it takes us time to get them out the door. It is yet another reason why OpenSolaris and Solaris.Next will have a new installer and IPS.

Solaris 8 Vintage Support

As well as providing the sustaining engineering for the newer versions of Sun's products, we also get to manage them once development really stops and they eventually go into EOL and finally EOSL. Solaris 8 recently went from what is known as phase 1 into phase 2 support, in fact within the last two months, on April 1st.

When Solaris reaches this phase in its lifecycle we offer what is called a "Vintage Patch Service". I often get asked what this stage in the lifecycle means and why we do it this way.

Solaris 8 first shipped in February 2000 and had been in the planning for a few years before that. Sun has released two versions of Solaris since then, and you can already see where Solaris is heading by using OpenSolaris (which, incidentally, is what I'm running on the laptop I'm typing this on, but that is another blog entry).

Like all things, Solaris 8 is starting to reach its limits, and in some cases is being pushed beyond its design limits, which results in all sorts of performance impacts in the field. It also does not support our latest generation of HW platforms, such as the recently announced Nehalem chipset from Intel, or our Niagara architecture in any form.

The hardware which Solaris 8 was designed to run on is also approaching EOL to varying degrees, and in these economic times Sun has hardware, like the aforementioned Niagara based machines, that brings many benefits.

It is kind of like owning a car: you keep it and keep it, and you keep spending money on it, but eventually you get to a point where you need to make that major upgrade, and all of a sudden you have a new car with a manufacturer's warranty and you're saving money on maintenance, running costs, etc.

Sun recognises the impact of such major upgrades and provides many things to make everyone's life easier: Solaris 8 Containers, for example, and our Application Binary Guarantee, which seems to be one of our best kept secrets. In simple terms, it says that if you follow the published stable interfaces and write your application to them, then once you have compiled it on one release of Solaris you can pick up your application and run it on later releases with no recompilation necessary. The legal boys will now tell you it has some caveats, and it does, but I've seen great success where people have "just done it" and I cannot think of a case where it has failed when people have followed the rules.

I've also seen a number of our large accounts use the Solaris 8 Containers technology to take everything from a Solaris 8 machine, pick it up, and put it onto a newer platform running Solaris 10 under the hood, in many cases with multiple machines now consolidated onto one newer platform.
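
For the curious, the move itself is, at a very high level, a capture-and-install exercise. The following is only a rough sketch of the shape of it, assuming the Solaris 8 Containers packages are already on the Solaris 10 host; the zone name, paths and archive location are all invented for illustration.

# Rough sketch of a Solaris 8 to Solaris 8 Container move (all names,
# paths and the archive location are examples):
# On the Solaris 8 machine, capture the system as a flash archive:
flarcreate -n s8-system /net/nas/s8-system.flar
# On the Solaris 10 host, configure and install the branded zone:
zonecfg -z s8zone "create -t SUNWsolaris8; set zonepath=/zones/s8zone; commit"
zoneadm -z s8zone install -u -a /net/nas/s8-system.flar
zoneadm -z s8zone boot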

The cost benefit in doing this was described to me by one customer like this: "I need to spend $ on new applications but I have an ever shrinking IT budget. By using Solaris 8 containers I was able to consolidate my existing estate (a lot of E450 type machines) onto the T series platforms and with the savings I made on HVAC plus maintenance cost reductions I had the $ I needed to re-invest in application development and deployment." Which is great, especially as that was one of the reasons we designed the product, and it does, as they say, "exactly what it says on the tin".

I also get asked why you have to pay extra to keep getting engineering support on Solaris 8 in phase 2. It is simple really: Solaris keeps marching on, and otherwise we'd move that resource onto developing the next generation of products. Again, back to the car analogy: as parts become in short supply and demand continues, the price goes up.

I've worked with a lot of customers, and a lot of account teams, over the last 9 months since we announced this programme; the business arguments for migrating are compelling even for an engineering guy, and even more compelling in these economic times.


Friday May 22, 2009

CommunityOne an easy way to register for free deep dives

We've had a number of requests on how to make it easy to register for the free deep dives I mentioned in my previous entry, so here you go. We look forward to seeing everyone the week of June 1st, so go on and register, it is free.

Online Event Registration - Powered by www.eventbrite.com

Whilst you're signing up for this, why not also register for the OpenSolaris Ignite Newsletter.

CommunityOne

The week of June 1st is CommunityOne, or CommunityOne West to give it its full title. It starts on Monday at Moscone and runs in parallel with JavaOne on Tuesday and Wednesday; CommunityOne itself moves to the InterContinental from Tuesday while JavaOne runs at Moscone. Just in case you could not get enough of all this great technology, the Open HA Cluster Summit starts on the Sunday at the Marriott; the registration link for that part is available on the agenda page as well.

The full agenda for CommunityOne is here. The OpenSolaris deep dives are FREE: when you register for CommunityOne, use the promotional code OSDDT and you will not be charged. This code will NOT get you into the other deep dive tracks. For those of you that have already registered, to add a deep dive session to your registration please call the CommunityOne Hotline at 1-866-405-2517 (U.S. and Canada only) or +1-650-226-0831 (international).

On the Tuesday you'll see John Fowler, Executive Vice President of Systems for Sun, officially launch OpenSolaris 2009.06. I spent some time earlier this week installing the final release candidate on 3 machines on the metal (my office and home workstations and my R600 laptop, which is what I'll be using to present from at CommunityOne), plus I did a couple of installs in a virtual world using the latest version of VirtualBox running on Vista 64-bit (with a 64-bit OpenSolaris guest). Then of course I gave the CD to my children and told them to install it :-)

More to come on 2009.06 the week of CommunityOne; I may even sneak in some screenshots and other points of note before then, assuming of course that marketing are not reading my blog...




About

Chris is the Senior Director of Solaris Revenue Product Engineering
