
March 2009 Archives

March 4, 2009

Need shared storage fast? Use the Linux Target Framework


For anyone who needs (shared) iSCSI storage for test or education purposes and does not want to install something like OpenFiler (which is still a great solution), there is now the Linux Target Framework (tgt).

In short, tgt consists of a daemon and utilities that allow you to quickly set up (shared) storage.
tgt can do more than that, but my example focuses purely on setting up shared iSCSI storage.

First, install the tgt software. The package is available in Oracle Enterprise Linux 5.

[root@gridnode05 tmp]# rpm -i scsi-target-utils-0.0-0.20070620snap.el5.i386.rpm 


Then, start the tgtd daemon.

[root@gridnode05 tmp]# service tgtd start
Starting SCSI target daemon: [  OK  ]


Export a new iSCSI target:

[root@gridnode05 tmp]# tgtadm --lld iscsi --op new \
--mode target --tid 2 -T 192.168.200.173:rkvol


Create the storage to export. Let's make it 100 MB in size.
This is the actual storage that the initiator will see.
In a normal situation you would use a real block device or an LVM logical volume instead of a file.

[root@gridnode05 tmp]# dd if=/dev/zero of=/scratch/rk.vol bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.367602 seconds, 285 MB/s


Add the "storage volume" to the target:


[root@gridnode05 tmp]# tgtadm --lld iscsi --op new --mode logicalunit \
--tid 2 --lun 1 -b /scratch/rk.vol


Allow all initiator clients to use the target:


[root@gridnode05 tmp]# tgtadm --lld iscsi --op bind --mode target --tid 2 -I ALL



On the client install the iSCSI initiator:


[root@gridnode03 ~]# rpm -i /tmp/iscsi-initiator-utils-6.2.0.868-0.7.el5.i386.rpm


After installation, enable and start the iscsi service:

[root@gridnode03 ~]# chkconfig iscsi on

[root@gridnode03 ~]# service iscsi start
iscsid is stopped
Turning off network shutdown. Starting iSCSI daemon: [  OK  ]
[  OK  ]
Setting up iSCSI targets: iscsiadm: No records found!
[  OK  ]


Discover the iscsi device:


[root@gridnode03 ~]# iscsiadm -m discovery -t sendtargets -p 192.168.200.175
192.168.200.175:3260,1 192.168.200.173:rkvol


Restart the iscsi service and notice the target coming in:


[root@gridnode03 ~]# service iscsi restart
Stopping iSCSI daemon: /etc/init.d/iscsi: line 33: 29176 Killed /etc/init.d/iscsid stop
iscsid dead but pid file exists
Turning off network shutdown. Starting iSCSI daemon: [ OK ]
[ OK ]
Setting up iSCSI targets: Logging in to [iface: default,
target: 192.168.200.173:rkvol, portal: 192.168.200.175,3260]
Login to [iface: default, target: 192.168.200.173:rkvol,
portal: 192.168.200.175,3260]: successful
[ OK ]



Watch the block device come in, in the messages file:


[root@gridnode03 ~]# tail -f /var/log/messages
Mar 4 13:24:19 gridnode03 last message repeated 2 times
Mar 4 13:24:19 gridnode03 iscsid: connection1:0 is operational now
Mar 4 13:24:19 gridnode03 kernel: SCSI device sda: 204800 512-byte hdwr sectors (105 MB)
Mar 4 13:24:19 gridnode03 kernel: sda: Write Protect is off
Mar 4 13:24:19 gridnode03 kernel: SCSI device sda: drive cache: write back
Mar 4 13:24:19 gridnode03 kernel: SCSI device sda: 204800 512-byte hdwr sectors (105 MB)
Mar 4 13:24:19 gridnode03 kernel: sda: Write Protect is off
Mar 4 13:24:19 gridnode03 kernel: SCSI device sda: drive cache: write back
Mar 4 13:24:19 gridnode03 kernel: sda: unknown partition table
Mar 4 13:24:19 gridnode03 kernel: sd 0:0:0:1: Attached scsi disk sda



Verify the size of the block device

[root@gridnode03 ~]# fdisk -l /dev/sda

Disk /dev/sda: 104 MB, 104857600 bytes
4 heads, 50 sectors/track, 1024 cylinders
Units = cylinders of 200 * 512 = 102400 bytes

Disk /dev/sda doesn't contain a valid partition table
[root@gridnode03 ~]#
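The numbers in the output above are consistent with each other, and that can be checked with plain shell arithmetic. The following is a small sketch (the variable names are mine): 100 blocks of 1 MiB should give the byte count that dd and fdisk report, and dividing by 512 should give the sector count from the kernel messages.

```shell
# Size of the backing file created with dd: 100 blocks of 1 MiB each
count=100
block_bytes=$((1024 * 1024))
total_bytes=$((count * block_bytes))

# Number of 512-byte sectors the initiator should report for that size
sectors=$((total_bytes / 512))

echo "bytes=$total_bytes sectors=$sectors"
```

If the initiator reports a different size, the wrong backing file was probably attached to the LUN.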

This seems a very neat utility for obtaining shared storage for education or testing purposes.

Rene Kundersma
Oracle Expert Services, The Netherlands

March 11, 2009

Provisioning your GRID with Oracle VM Templates

Introduction (Chapter 1)

Linux node installation and configuration (virtualized or not) for an Oracle Grid environment can be done in various ways. Of course, one could do this all manually, but for larger environments that is not feasible.

Also, you want to make sure each installation has the same specifications, and you want human errors during the installation to be kept to a minimum.

This blog entry consists of chapters that describe all details of an automated Oracle VM cloning process.

The setup described below is used to prepare education environments. It will also work for proof of concept environments, and most parts of it may even be usable in your own Grid deployment strategy.

The setup allows you to set up Grid environments that students can use to learn (for instance) how to install RAC, configure Data Guard or work with Enterprise Manager Grid Control. It can also be used to teach students how to work with Swingbench or FCF, all within their own infrastructure.

This virtualized solution helps to quickly set up, repair, catch up, restore and adapt the environment. It will save your IT department costs on hardware and storage, and it will save you lots of time.

The pictures on this page are best viewed with Firefox.

Bare metal provisioning

Within the Oracle Grid, Oracle Enterprise Manager Grid Control release 10.2.0.4 with kickstart and PXE boot is used more and more these days to do a so-called "bare metal" installation of the OS: kix.gif

After this bare metal installation, "post configuration scripts" take care of the node-specific settings.

Even with Oracle virtual machines on top of such a node, the kickstart procedure can still be used; without too much effort a PXE boot configuration for virtualized guests can be set up.

This way of "bare metal installation", or better "virtual metal installation", by PXE boot for VM guests is a nice solution, which I will describe one day. But why would one do a complete installation for each VM when the VMs differ in only a couple of configuration files?

This blog entry explains how to use an Oracle VM template to provision virtual guest operating systems for use in a Grid.

For educational purposes, where classes with a lot of students each have to work with their own Grid environment, a procedure was worked out to provision a blade system with operating systems and software, Grid ready, all based on Oracle templates.

As said, more options are possible. This is how my solution works; it may work for you as well:
1. An example OS configuration is provided (node-specific configuration files). From these template files a VM guest specific configuration is generated automatically. This configuration describes settings for hostname, IP numbers, etc.
2. A VM template (image) is provided.

By automating the two steps above, one can easily and quickly set up virtualized Oracle Linux nodes, ready for RAC.

The next chapter is about the configuration templates and the cloning process.

The process (Chapter 2)

With the configuration templates described earlier, "configuration clones" can be made. In this example I am using HP blade technology. On each blade six VMs will be running. For each blade, and for each VM running on top of it, the configuration files are generated.

It makes sense to define configuration templates. With scripts you can use these templates to generate the configuration files for each specific VM.

With a VM template in one hand and an automatically generated set of configuration files in the other, you can quickly build or rebuild the infrastructure over and over again.

Even if you need to make changes that affect all VMs, they can be rolled out quite quickly.

As said, this solution is extremely useful for education purposes, or situations where you have to provide lots of VM guests ready to be used instantly. Possible other uses are in proof of concept environments.

In short, the workflow of the cloning process looks like this:
1. A default virtual machine image is copied over.
2. Configuration files for the VM are generated, based upon the blade number, the VM number and the purpose of the VM.
3. The VM image is "mounted" and configuration files are overwritten with the generated configuration files. Binaries (other programs) are also put in place.
4. The VM image is unmounted and, if needed, "file based shared storage" is created.
5. The VM boots for the first time, ready to use immediately, totally pre-configured.


The concept itself can of course also be used for the Linux provisioning of your virtualized infrastructure as an alternative to bare metal provisioning.

The next chapter will describe the hardware used and the chosen storage solutions for this example.

Hardware used (Chapter 3)

As discussed in the previous chapter, this project is built on HP blade technology.

The solution described is of course independent of the hardware chosen.

However, in order to describe the complete setup this chapter is here to describe the hardware used.

blade01.JPG

This blade enclosure (C3000) has eight blades, each blade has:
- two NICs (Broadcom)
- two HBAs (QLogic)
- 16 GB of RAM
- two quad core Intel Xeon processors


Storage is made available to the blades by NFS and Fibre Channel.

The NFS share is used to provide the VM template that will be used as source.

The same NFS share is also available to the VM guests in order to provide the guests the option to install software from a shared location.

The SAN storage comes from an HP MSA. The MSA devices are used for OCFS2. This is where the VM image files will be placed.

Each blade is available by a public network interface.

A private network is also set up as the interconnect network for OCFS2 between the blades.

For each blade the architecture is equal to the diagram below.

blade02.jpg

VM distribution (Chapter 4)

As said in an earlier chapter, each blade has 16GB RAM, so this is enough to run at least 6 VMs of 2GB RAM each.

The purpose is to have:
- 3 vms for Real Application Clusters (RAC) (11.1.0.7 CRS/ASM/RDBMS)
- 1 vm for Dataguard (11.1.0.7 ASM/RDBMS)
- 1 vm to run swingbench and demo applications
- 1 vm to run Enterprise Manager grid Control (EMGC).


This looks as follows:
blade03.jpg

As each blade has 146 GB of local storage, there is room to keep some VMs on local disks. Since there is no intention to live migrate these nodes, they can be put on a non-shared location.

VM number six (EMGC) is too big to fit next to the other VMs on local storage. For that reason a shared OCFS2 mount is made.

Each VM uses the location Oracle VM provides for VMs (/OVS/running_pool). With symbolic links the storage for the EMGC VM is brought to the OCFS2 shared disk: GRIDNODE09 -> /OVS_shared_large/running_pool/oemgc/nlhpblade07

By default OCFS2 allows four nodes to mount an OCFS2 filesystem concurrently. In order to mount the filesystem on all blades concurrently, you have to pass the -N X argument to mkfs.ocfs2, where X is the maximum number of nodes that will ever mount the filesystem concurrently.

mkfs.ocfs2 -b 4K -C 32K -N 8 -L ovmsdisk /dev/sdb1


PV Templates (Chapter 5)

Before doing any specific VM changes, first a template is chosen, in this case Oracle Enterprise Linux 5 update 2 (OEL5U2).

This is an Oracle VM template downloaded from OTN.

Our template is a para-virtualized template, based on a 32bit architecture.

As a reminder, this is what the para-virtualized architecture looks like: blade04.jpg

By the way, para-virtualized guests often run faster than hardware-virtualized guests.

Please see this link for more information on hardware-virtualized vs. para-virtualized guests.

As part of the procedure described, the template will be copied over six times to each blade. In order to use the VMs on a specific blade for a specific purpose configuration files must be made. The next chapter describes how this works.

VM Specific files and clone procedure (Chapter 6)

Each virtualized guest has a small set of configuration files that are specific to that OS. Typically these files exist both outside of the guest (vm.cfg) and inside the guest.

Specific files inside the vm:
- /etc/sysconfig/network-scripts/ifcfg-eth*
- /etc/sysconfig/network
- ssh configuration files

Specific files outside the vm:
- vm.cfg

For VMs running on the same blade (and being part of the same 'grid') there are also files in common:
- nsswitch.conf
- resolv.conf
- sudoers
- sysctl.conf
- hosts

The files mentioned above need to be changed because each machine needs its own NICs with specific MAC addresses and its own IP numbers.

Of course, within a grid (on a blade) each VM has to have a unique name.

In order to make sure unique MAC addresses are generated, one has to set up standards.

For the MAC addresses, the following formula is used: 00:16:3E:XD:0Y:0Z, where:
X: the number of the blade
Y: the number of the VM
Z: the number of the NIC within that VM

For example, the MAC address for the second NIC on the third VM on blade 7 would look like: HWADDR=00:16:3E:7D:03:02

Host names are used multiple times (but not within the same grid). The only things that need to change are the corresponding IP numbers; these must be unique across the grids.

The same strategy is used to determine the IP numbers:
- For the public network 192.168.200.1XY is used.
- For the internal network 10.0.0.1XY is used.
- For the VIP 192.168.200.XY is used.

Where:
X: the number of the Blade
Y: the number of the VM

For example:
- the public IP number of node 3 on blade 7 would be: 192.168.200.173
- the private IP number of node 3 on blade 7 would be: 10.0.0.173
- the virtual IP number of node 3 on blade 7 would be: 192.168.200.73
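The naming standard above is simple enough to express as a few shell functions. The following is a minimal sketch, not the actual clone_conf.sh; the function names are hypothetical, and it assumes blade, VM and NIC numbers are single digits, as they are in this setup.

```shell
# Hypothetical helpers (not the author's clone_conf.sh) that apply the
# MAC and IP standards described above. All arguments are single digits.

mac_for() {   # usage: mac_for BLADE VM NIC
    printf '00:16:3E:%sD:0%s:0%s\n' "$1" "$2" "$3"
}

public_ip_for()  { printf '192.168.200.1%s%s\n' "$1" "$2"; }   # BLADE VM
private_ip_for() { printf '10.0.0.1%s%s\n' "$1" "$2"; }        # BLADE VM
vip_for()        { printf '192.168.200.%s%s\n' "$1" "$2"; }    # BLADE VM

# Third VM on blade 7, second NIC
mac_for 7 3 2
public_ip_for 7 3
private_ip_for 7 3
vip_for 7 3
```

The four calls at the end reproduce the examples given in the text for the third VM on blade 7.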

So, as long as you know for which blade and for which VM you are generating the configuration, you can script it:
[root@nlhpblade07 tools]# ./clone_conf.sh nlhpblade01
Copying config files from /OVS_shared_large/conf/nlhpblade07 to /OVS_shared_large/conf/nlhpblade01...
Performing config changes specific to the blade and the VM...
# nlhpblade01 - GRIDNODE01 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE01/ifcfg-eth0
# nlhpblade01 - GRIDNODE01 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE01/ifcfg-eth1
# nlhpblade01 - GRIDNODE01 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE01/network
# nlhpblade01 - GRIDNODE01 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE01/vm.cfg
# nlhpblade01 - GRIDNODE02 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE02/ifcfg-eth0
# nlhpblade01 - GRIDNODE02 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE02/ifcfg-eth1
# nlhpblade01 - GRIDNODE02 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE02/network
# nlhpblade01 - GRIDNODE02 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE02/vm.cfg
# nlhpblade01 - GRIDNODE03 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE03/ifcfg-eth0
# nlhpblade01 - GRIDNODE03 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE03/ifcfg-eth1
# nlhpblade01 - GRIDNODE03 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE03/network
# nlhpblade01 - GRIDNODE03 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE03/vm.cfg
# nlhpblade01 - GRIDNODE04 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE04/ifcfg-eth0
# nlhpblade01 - GRIDNODE04 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE04/ifcfg-eth1
# nlhpblade01 - GRIDNODE04 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE04/network
# nlhpblade01 - GRIDNODE04 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE04/vm.cfg
# nlhpblade01 - GRIDNODE05 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE05/ifcfg-eth0
# nlhpblade01 - GRIDNODE05 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE05/ifcfg-eth1
# nlhpblade01 - GRIDNODE05 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE05/network
# nlhpblade01 - GRIDNODE05 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE05/vm.cfg
# nlhpblade01 - GRIDNODE09 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE09/ifcfg-eth0
# nlhpblade01 - GRIDNODE09 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE09/network
# nlhpblade01 - GRIDNODE09 - /OVS_shared_large/conf/nlhpblade01/GRIDNODE09/vm.cfg
Performing node common changes for the configuration files...
# nlhpblade01 - GRIDNODE09 - /OVS_shared_large/conf/nlhpblade01/common/cluster.conf
# nlhpblade01 - GRIDNODE09 - /OVS_shared_large/conf/nlhpblade01/common/hosts
[root@nlhpblade07 tools]# 

'mounting a vm' (Chapter 7)

Now that we have generated the node-specific configuration files and copied the basic template, we are ready to modify the OS before even booting it. After 'mounting' the VM image file, the generated configuration is copied into the VM.

As said, at this moment the VM is an image file, for example /OVS/running_pool/GRIDNODE01/system.img. Xen will set up a loop device in order to boot the OS from that image.

We do more or less the same in order to change the OS before we boot it:

First, the losetup command is used to associate a loop device with the file. A loop device is a pseudo-device that makes a file accessible as a block device.
[root@nlhpblade07 GRIDNODE03]#  losetup /dev/loop9 system.img
Now that we have mapped the image file to a block device, we want to see the partitions on it. For this we use the command kpartx. kpartx creates device maps from partition tables. kpartx is part of device-mapper multipath.
[root@nlhpblade07 GRIDNODE03]# kpartx -a /dev/loop9
So, let's see what partitions device-mapper has for us:
[root@nlhpblade07 GRIDNODE03]# ls /dev/mapper/loop9*
/dev/mapper/loop9p1  /dev/mapper/loop9p2  /dev/mapper/loop9p3
kpartx found three partitions and told DM there are three partitions available. Let's see if we can identify the types:
[root@nlhpblade07 GRIDNODE03]# file -s /dev/mapper/loop9p1
/dev/mapper/loop9p1: Linux rev 1.0 ext3 filesystem data
This is probably the /boot partition of the vm.
[root@nlhpblade07 GRIDNODE03]# file -s /dev/mapper/loop9p2
/dev/mapper/loop9p2: LVM2 (Linux Logical Volume Manager) , UUID: t2SAm03KoxfUcCOS3OYmsXf9ubqcy9q
This may be the root or the swap partition.
[root@nlhpblade07 GRIDNODE03]# file -s /dev/mapper/loop9p3
/dev/mapper/loop9p3: LVM2 (Linux Logical Volume Manager) , UUID: j2U7KUWen1ePjDvm4hTclZvA5YJyvl9
This may also be the root or the swap partition.

So, in order to make a better guess in finding the root partition, let's see what the sizes are:
[root@nlhpblade07 GRIDNODE03]# fdisk -l /dev/mapper/loop9p2

Disk /dev/mapper/loop9p2: 13.8 GB, 13851371520 bytes
255 heads, 63 sectors/track, 1684 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/mapper/loop9p2 doesn't contain a valid partition table

[root@nlhpblade07 GRIDNODE03]# fdisk -l /dev/mapper/loop9p3

Disk /dev/mapper/loop9p3: 5362 MB, 5362882560 bytes
255 heads, 63 sectors/track, 652 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/mapper/loop9p3 doesn't contain a valid partition table
As we can see, one partition is about 5 GB and the other about 13 GB. The best guess is that the 5 GB partition is the swap and the 13 GB partition the OS.

With the command vgscan we can scan the newly 'discovered' 'disks' and search for volume groups on them:
[root@nlhpblade07 GRIDNODE03]# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "VolGroup00" using metadata type lvm2
vgdisplay says we have one volume group (VolGroup00):
[root@nlhpblade07 GRIDNODE03]# vgdisplay
  --- Volume group ---
  VG Name               VolGroup00
  System ID             
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  5
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               0
  Max PV                0
  Cur PV                2
  Act PV                2
  VG Size               17.84 GB
  PE Size               32.00 MB
  Total PE              571
  Alloc PE / Size       571 / 17.84 GB
  Free  PE / Size       0 / 0   
  VG UUID               kmhYBm-Mpbv-usx2-vDur-rEVb-uP4i-kcP4fc
With the command vgchange -a y we can make the logical volumes available to the kernel:
[root@nlhpblade07 GRIDNODE03]# vgchange -a y VolGroup00
  2 logical volume(s) in volume group "VolGroup00" now active
lvdisplay can be used to see the attributes of a logical volume:
[root@nlhpblade07 GRIDNODE03]# lvdisplay
  --- Logical volume ---
  LV Name                /dev/VolGroup00/LogVol00
  VG Name                VolGroup00
  LV UUID                B13hk3-f5qY-3gDY-Ackt-13gK-DZDc-cTWx3V
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                14.72 GB
  Current LE             471
  Segments               2
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:3
   
  --- Logical volume ---
  LV Name                /dev/VolGroup00/LogVol01
  VG Name                VolGroup00
  LV UUID                iEO4oG-XPMU-syWF-qupo-811i-G6Gg-QZEw5f
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                3.12 GB
  Current LE             100
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:4
So now we have found (and activated) the logical volume that holds the root filesystem of the VM. Now we can mount it:
[root@nlhpblade07 GRIDNODE03]# mkdir guest_local_LogVol00
[root@nlhpblade07 GRIDNODE03]# mount /dev/VolGroup00/LogVol00 guest_local_LogVol00 
See the contents of the filesystem:
[root@nlhpblade07 GRIDNODE03]# cd guest_local_LogVol00/

[root@nlhpblade07 guest_local_LogVol00]# ls -la
total 224
drwxr-xr-x 26 root root  4096 Jan 14  2009 .
drwxr-xr-x  3 root root  4096 Oct 22 22:30 ..
-rw-r--r--  1 root root     0 Jul 24 05:02 .autorelabel
drwxr-xr-x  2 root root  4096 Dec 20  2008 bin
drwxr-xr-x  2 root root  4096 Jun  6 11:26 boot
drwxr-xr-x  4 root root  4096 Jun  6 11:26 dev
drwxr-xr-x 94 root root 12288 Jan 14  2009 etc
drwxr-xr-x  3 root root  4096 Jun  6 11:50 home
drwxr-xr-x 14 root root  4096 Dec 20  2008 lib
drwx------  2 root root 16384 Jun  6 11:26 lost+found
drwxr-xr-x  2 root root  4096 Apr 21  2008 media
drwxr-xr-x  2 root root  4096 May 22 09:51 misc
drwxr-xr-x  3 root root  4096 Dec 20  2008 mnt
dr-xr-xr-x  2 root root  4096 Jun 10 11:11 net
drwxr-xr-x  3 root root  4096 Aug 21 04:11 opt
-rw-r--r--  1 root root     0 Jan 14  2009 poweroff
drwxr-xr-x  2 root root  4096 Jun  6 11:26 proc
drwxr-x--- 17 root root  4096 Jan 13  2009 root
drwxr-xr-x  2 root root 12288 Dec 20  2008 sbin
drwxr-xr-x  4  500  500  4096 Jan 14  2009 scratch
drwxr-xr-x  2 root root  4096 Jun  6 11:26 selinux
drwxr-xr-x  2 root root  4096 Apr 21  2008 srv
drwxr-xr-x  2 root root  4096 Jun  6 11:26 sys
drwxr-xr-x  3 root root  4096 Jun  6 11:33 tftpboot
drwxrwxrwt  9 root root  4096 Jan 14  2009 tmp
drwxr-xr-x  3 root root  4096 Dec 20  2008 u01
drwxr-xr-x 14 root root  4096 Jun  6 11:31 usr
drwxr-xr-x 21 root root  4096 Jun  6 11:37 var
Although this is a fairly easy way to mount a VM image file, it is still not something you want to do by hand for 40 VM images.

For this reason, the described solution is scripted and called mount_vm.sh. This is how it works:
[root@nlhpblade07 GRIDNODE05]# mount_vm.sh GRIDNODE05
Starting mount...

contents /etc/sysconfig/network -file of mounted node:
NETWORKING=yes

NETWORKING_IPV6=no
HOSTNAME=gridnode05.nl.oracle.com

Generating unmount script
To unmount your image run  /tmp/umount_GRIDNODE05.30992.sh as root

Mounting finished...
As you can see, the image is mounted by a script, and a script to unmount it is generated automatically. To verify that the right image file is mounted, the contents of the file /etc/sysconfig/network are shown.
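The manual steps from this chapter can be collected into a small sketch. This is not the mount_vm.sh used here (that script is not shown); it only prints the commands in order, so it can be reviewed before anything is run as root. The function names are hypothetical, the unmount sequence is my assumption of the reverse order, and the loop device, volume group and logical volume names are taken from the example above.

```shell
# Sketch only: prints the commands from this chapter, it does not run them.
# VolGroup00 and LogVol00 come from the example above and may differ.

print_mount_steps() {   # usage: print_mount_steps IMAGE LOOPDEV MOUNTPOINT
    image=$1; loopdev=$2; mnt=$3
    echo "losetup $loopdev $image"
    echo "kpartx -a $loopdev"
    echo "vgscan"
    echo "vgchange -a y VolGroup00"
    echo "mkdir -p $mnt"
    echo "mount /dev/VolGroup00/LogVol00 $mnt"
}

# Assumed reverse sequence for the generated unmount script
print_umount_steps() {  # usage: print_umount_steps LOOPDEV MOUNTPOINT
    loopdev=$1; mnt=$2
    echo "umount $mnt"
    echo "vgchange -a n VolGroup00"
    echo "kpartx -d $loopdev"
    echo "losetup -d $loopdev"
}

print_mount_steps system.img /dev/loop9 guest_local_LogVol00
print_umount_steps /dev/loop9 guest_local_LogVol00
```

One thing to watch: if the host itself already has a volume group named VolGroup00, the guest's volume group will clash with it, so this approach assumes the names are distinct.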

'changing a vm' (Chapter 8)

Now that the VM image is mounted on the filesystem, we can go back to the generated config files. From here it is easy to copy all specific configuration files into the VM. Better still is to have a script do this, which is what this solution does:
[root@nlhpblade07 GRIDNODE05]# change_vm.sh GRIDNODE05
If you are sure, hit Y or y to continue
Y
Continuing...
Starting config change for VM GRIDNODE05 on nlhpblade07...
Copying swingbench...
Changing ownership of swingbench files...
Copying FCF-Java Demo...
Changing ownership of  FCF-Java Demo files...
This vm requires pre-build file /OVS/sharedDisk/rdbms_home_11r1_01_ocfs.img as shared Oracle RDBMS HOME
Finished changing config...
Now that the VM is modified internally and still mounted, it has to be unmounted. This is done by running the unmount script that was generated during the mount.
[root@nlhpblade07 GRIDNODE05]# /tmp/umount_GRIDNODE05.30992.sh

Unmount finished
If the unmount succeeded, you can remove this file

rm: remove regular file `/tmp/umount_GRIDNODE05.30992.sh'? y


All Together (Chapter 9)

In essence, the procedure described above should be repeated for each VM you want to clone and change on each blade. This may already save you hours of work and reduce the chance of mistakes, but it may still seem like a lot of steps. Repeating it for each blade and for each VM is, from here, just a matter of scripting.

So, you could make a script, that for each blade, would do the following:

Pseudo:
for each blade in blade list
do
    stop all VMs first
    while VMs still running
    do
        wait 10 seconds
    done
    restore all machines (from NFS)
    clone conf
    mount and change all VMs
    start all
done
For all 42 VM images the implemented version of the script above runs for about four hours. After this a complete 42-node education environment is set up.
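The "wait until all VMs are stopped" part of the pseudo code is the only piece with real control flow, so here is a minimal sketch of it. It is not the implemented script. The function wait_until_stopped is a hypothetical name, and the command that lists running VMs is passed in as an argument; on an Oracle VM server that could be a filter around "xm list" (an assumption), while the demo below uses a stub so it can be run anywhere.

```shell
# Sketch only: poll a "list running VMs" command until it prints nothing.

wait_until_stopped() {   # usage: wait_until_stopped INTERVAL COMMAND [ARGS...]
    interval=$1; shift
    while [ -n "$("$@")" ]; do
        sleep "$interval"
    done
}

# Stub for the demo: reports one running VM on the first two calls, then none.
# The call count is kept in a file because the stub runs in a subshell.
calls_file=$(mktemp)
list_running_stub() {
    echo x >> "$calls_file"
    n=$(wc -l < "$calls_file" | tr -d ' ')
    [ "$n" -le 2 ] && echo GRIDNODE01
    return 0
}

# The pseudo code waits 10 seconds between polls; 0 keeps the demo fast
wait_until_stopped 0 list_running_stub
polls=$(wc -l < "$calls_file" | tr -d ' ')
rm -f "$calls_file"
echo "polls needed: $polls"
```

Passing the list command as an argument keeps the loop testable and independent of the hypervisor tooling.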
NLHPBLADE%20TA3-1.jpg

Extra Options (Chapter 10)

Besides changing configuration settings on a VM, as described in the 'changing a vm' chapter, other activities can also be done.

For this solution the following options are also implemented:
- configure vnc
- configure ocfs2 within the guest, use a shared Oracle home to save space
- copy and configure software (like an oracle home or swingbench)
- create ASM disk files
- create OCR and Voting disks
- configure sudoers
- provide software for EM Agent Deployment
- configure ssh

Rene Kundersma
Oracle Expert Services, The Netherlands

March 13, 2009

Presentation on OEL, Oracle VM, Storage and RAC at Dutch Planboard Symposium

I will be doing a presentation May 26th at Planboard's Oracle DBA Symposium.

The presentation will be about all infrastructural aspects you come across when setting up an Oracle Database Grid.

I will be handling Oracle Enterprise Linux, Storage, Stretched RAC Clusters, Oracle VM and HA in general.

Here is the link (yes it is Dutch)

Rene Kundersma
Oracle Expert Services, The Netherlands

March 14, 2009

Enterprise Manager 10.2.0.5 has Oracle VM Manager onboard

Today, I have tested the new Oracle VM Management capabilities that Enterprise Manager Grid Control (EMGC) 10.2.0.5 has.

The existing Oracle VM Manager is rather quick and lightweight, so you will probably wonder why we need EMGC for that.

Well, think about the following advantages:
- All in one management console
- EMGC also runs on Windows, so now you can manage Oracle VM from Windows, which was not possible before
- You can set thresholds
- Use the Single Sign On that EMGC has
- Use the security features EMGC has
- Set notifications and alerts so that admins can manage on an exception basis
- Context based flows (even if manual currently). For example, change the vCPU on para-virtualized Linux guests if database diagnostics (ADDM) suggests additional CPU
- Specific EM features like Configuration compare, search, policies for host and hypervisor. The policies can span multiple tiers and can be user defined. Example, do not run mixed workloads (database and middleware) on the same Hypervisor.
- Guest patching with integration with Unbreakable Linux Network (ULN)

Some quick requirements had to be met before I could begin:
- install a one-off patch to the OMS (patch 8244731)
- install VnCViewer into $ORACLE_HOME/j2ee/OC4J_EM/applications/em/em
- make sure the Oracle VM Agent is updated to version 2.2-70 or higher.

This all went smoothly. Please note that it makes sense to install the patch and the VnCViewer at the same time, while the OMS is down.

Please read note 781879.1 for full details (as information may change and there is only one single source of truth).

First I had to create a server pool. After creating a server pool, it was just a matter of making the Oracle VM Server aware of the VM images I already had on disk.

There is an option for exactly this, which you can see in the capture below.

Snap221.jpg


This is the place where you can see all the VM Images discovered in your virtual server pool.
After this, it is just a matter of selecting the images and "importing" them into your vm "admin" page.

From here you can do the usual; start, stop, pause, suspend, clone and get to the console.

Snappie.jpg

To me this seems a new step for EMGC towards being the complete management framework.

Rene Kundersma
Oracle Expert Services, The Netherlands

March 24, 2009

'Virtual Metal' Provisioning with Oracle VM and PXE

As mentioned in an earlier blog entry, the basis for Bare Metal Provisioning (BMP) in EMGC 10.2.0.5 is PXE boot.

snap-rac-vm00042.jpg
This blog entry describes how to set up PXE boot (TFTP and DHCP) for para-virtualised guests.
This allows you to automatically install virtualised guests from a kickstart file.

By the way, in this setup I am on OEL 5U2 x86. If you want to reproduce this for, say, x86_64, you may need other packages.

Below are my notes of the setup:
- install dhcp-3.0.5-18.el5
- install tftp-0.42-3.1.0.1 (we need this one later as a required package for pypxeboot)
- install tftp-server-0.42-3.1.0.1

After installation of these packages, we begin with the configuration of DHCP in /etc/dhcpd.conf.
As this is just a test, I am not using all options for DHCP.
Be careful if you test this; DHCP may work too well on a network where other machines are listening...
#
# DHCP Server Configuration file.
#   see /usr/share/doc/dhcp*/dhcpd.conf.sample  
#
ddns-update-style none;
allow booting; 
allow bootp;   

subnet 192.168.200.0 netmask 255.255.255.0 {
    option routers             192.168.200.1;
    option subnet-mask         255.255.255.0;
    option nis-domain          "nl.oracle.com";
    option domain-name         "nl.oracle.com";
    option domain-name-servers 192.135.82.60;

    default-lease-time 60;
    max-lease-time 60;
 
    next-server 192.168.200.173;
    filename "/pxelinux.0";

    host RK{
    hardware ethernet 00:16:3e:62:39:d3;
    fixed-address 192.168.200.177;
    }
}
As you can see I specified subnet, netmask, domain-name and details for the host called "RK".
Details are: name, mac and ip address.

The purpose of the "next-server" is to specify the name (or ip) of the tftp-server.
It makes sense to put DHCP and TFTP server on the same box.

In order to (re)start dhcp:
service dhcpd restart 
After setting up DHCP, TFTP needs to be set up. This is just a matter of enabling the service in xinetd.
Set disable = no in the file /etc/xinetd.d/tftp. After this, restart the xinetd service.
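Editing the file by hand is fine for one server; for repeated setups the change can be scripted. The following is a minimal sketch under a few assumptions: the function name enable_xinetd_service is hypothetical, the file is expected to contain a line of the form "disable = yes", and the demo runs against a scratch file rather than the real /etc/xinetd.d/tftp.

```shell
# Sketch: flip "disable = yes" to "disable = no" in an xinetd service file.
# Indentation and spacing are kept; the file is rewritten in place so its
# ownership and permissions stay as they were.

enable_xinetd_service() {   # usage: enable_xinetd_service FILE
    tmp=$(mktemp) || return 1
    sed 's/^\([[:space:]]*disable[[:space:]]*=[[:space:]]*\)yes/\1no/' "$1" > "$tmp" &&
        cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Demo on a scratch file that mimics the relevant line of the tftp service file
demo=$(mktemp)
printf 'service tftp\n{\n\tdisable\t\t= yes\n}\n' > "$demo"
enable_xinetd_service "$demo"
grep disable "$demo"
```

On the real server the call would be enable_xinetd_service /etc/xinetd.d/tftp as root, followed by service xinetd restart.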

Pxeboot files need to be copied to /tftpboot on the tftp-server:
cp /usr/lib/syslinux/pxelinux.0 /tftpboot/
cp /usr/lib/syslinux/mboot.c32 /tftpboot/
From your OEL distribution, copy the boot-installation files:
cp $MOUNT_OEL_DISTR/images/xen/* /tftpboot/
Create a PXE configuration file for the guest you want to start:
[root@gridnode03 pxelinux.cfg]# gethostip -x 192.168.200.177
C0A8C8B1
So for a guest with ip-number 192.168.200.177 we need to put the details for the PV-PXE installation into /tftpboot/pxelinux.cfg/C0A8C8B1
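If gethostip is not at hand, the same file name can be derived with printf. The following is a small sketch; ip_to_hex is a hypothetical helper name, and unlike gethostip it only handles numeric dotted-quad addresses, not host names.

```shell
# Sketch: convert a dotted-quad IP address to the eight-digit uppercase
# hexadecimal form used as the file name under /tftpboot/pxelinux.cfg/.

ip_to_hex() {   # usage: ip_to_hex A.B.C.D
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into four positional parameters
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

ip_to_hex 192.168.200.177
```

For the guest in this example the result matches the gethostip output shown above.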
[root@gridnode03 ~]# cat /tftpboot/pxelinux.cfg/C0A8C8B1 
default linux
prompt 1
timeout 120
label linux
  kernel vmlinuz
  append initrd=initrd.img lang=en_US keymap=us ks=nfs:192.168.200.200:/vol/vol1/distrib/linux32/workshop-ovs/oel/OEL5U2/ks.cfg ksdevice=eth0 ip=dhcp
You can see:
- my OEL kickstart file is on NFS (as is my installation source)
- the IP number is obtained by DHCP using eth0

I created my kickstart file from an existing OEL installation.
With the help of the command system-config-kickstart --generate I re-generated it.

After this, I had to modify some bits about the installation media (from cdrom to nfs).
Specifics for my kickstart file here

See the Red Hat site for all kickstart options.

Before I could start a VM guest, I also had to:
- install pypxeboot and
- install udhcp-0.9.8-1usermac

Then I created a VM configuration file:
[root@nlhpblade07 pxe]# cat rk.cfg 
name = "RK"
memory = "1024"
disk = [ 'file:/OVS/running_pool/pxe/system.img,xvda,w',]
vif = [ 'mac=00:16:3e:62:39:d3,bridge=xenbr0', '', ]
vfb = ["type=vnc,vncunused=1,vnclisten=0.0.0.0"]
#bootloader="/usr/bin/pygrub"
bootloader="/usr/bin/pypxeboot"
bootargs=vif[0]
vcpus=1
on_reboot   = 'restart'
on_crash    = 'restart'
Before I could start the VM, the 'disk' (image) had to be in place:
[root@nlhpblade07 pxe]# dd if=/dev/zero of=system.img bs=1M count=8000
8000+0 records in
8000+0 records out
8388608000 bytes (8.4 GB) copied, 165.725 seconds, 50.6 MB/s
[root@nlhpblade07 pxe]# 
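Writing 8000 MB of zeros took close to three minutes here. Where that matters, a sparse file is a possible alternative: dd can seek to the desired size and copy nothing. This is a sketch of a different technique than the one used above, it assumes the underlying filesystem supports sparse files, and the space is not reserved up front, so the disk can fill up later while the guest is writing. The function name make_sparse_image is hypothetical.

```shell
# Sketch: create an image file of SIZE_MB megabytes without writing the
# zeros, by seeking to the end and copying nothing (count=0).

make_sparse_image() {   # usage: make_sparse_image FILE SIZE_MB
    dd if=/dev/zero of="$1" bs=1048576 count=0 seek="$2" 2>/dev/null
}

# Demo with a small 8 MB file in a scratch directory
scratch=$(mktemp -d)
make_sparse_image "$scratch/system.img" 8
image_bytes=$(wc -c < "$scratch/system.img" | tr -d ' ')
echo "apparent size: $image_bytes bytes"
```

For an image the size of the one above, the call would be make_sparse_image system.img 8000.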
So, after starting, remember that the third console of the installation lets you see what is going on while the anaconda installation procedure runs:
snap-rac-vm00037.jpg
After installation and before the reboot, the vm-config file had to be modified so that it looks like this:
[root@nlhpblade07 pxe]# cat rk.cfg
name = "RK"
memory = "1024"
disk = [ 'file:/OVS/running_pool/pxe/system.img,xvda,w',]
vif = [ 'mac=00:16:3e:62:39:d3,bridge=xenbr0', '', ]
vfb = ["type=vnc,vncunused=1,vnclisten=0.0.0.0"]
bootloader="/usr/bin/pygrub"
vcpus=1
on_reboot   = 'restart'
on_crash    = 'restart'
snap-rac-vm00040.jpg
After a successful installation the OS is set up and ready to be used:
snap-rac-vm00041.jpg

Rene Kundersma
Oracle Expert Services, The Netherlands

About March 2009

This page contains all entries posted to Oracle XPS The Netherlands On HA in March 2009. They are listed from oldest to newest.
