Tuesday Nov 18, 2008

Sun HPC Linux Stack on VirtualBOX

Our team recently released Sun HPC Software, Linux Edition 1.1. For those who are interested in trying out the stack but have no access to much hardware, Sun xVM VirtualBox can turn your laptop into an HPC development platform. VirtualBox 2.0.4 supports 64-bit virtualization on top of a 64-bit host OS. Unfortunately, Mac OS users will have to wait for a later release for 64-bit guest OS support.

Here are the specs of my laptop and a list of the software I used.

My Laptop - Lenovo T61

Hardware Specification
- Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz
- 250G hard disk

Software Configuration
- Ubuntu 8.10
- Sun xVM VirtualBox 2.0.4 amd64 edition

Note: Some motherboards ship with Intel VT or AMD-V disabled. These options must be enabled in the BIOS in order to support 64-bit virtualization. Thanks to Liu Lei for pointing this out! (Nov 22 2008)
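Before creating 64-bit guests, it may be worth confirming from a Linux host that the CPU actually exposes these extensions. A minimal check (Linux-specific, since it reads /proc/cpuinfo):

```shell
# Look for hardware virtualization flags in the CPU feature list:
# "vmx" means Intel VT-x, "svm" means AMD-V. Note the flag can be
# present while the feature is still disabled in the BIOS.
if grep -qE 'vmx|svm' /proc/cpuinfo; then
    echo "virtualization flags found"
else
    echo "no vmx/svm flags - check BIOS settings"
fi
```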

1. Set up the testing platform

1. Get the Sun HPC 1.1 ISO and CentOS-5.2-x86_64-bin-DVD.iso

2. Create a virtual machine "CentOS-mgmt1" in VirtualBox. I assigned 512MB of RAM and created a 20G virtual disk image for it. Since I chose a dynamically expanding disk, it does not actually occupy that much disk space.

3. Mount the CentOS-5.2-x86_64-bin-DVD.iso image on the CD/DVD drive

4. Tick the "VT-x/AMD-V" feature

5. Enable two network adapters. Adapter 1 is attached to NAT and Adapter 2 is attached to the internal network named "intnet".

6. Start the virtual machine and install CentOS 5.2 on it. The installation should be pretty straightforward with the wizard; I basically clicked Next all the way, except for setting the host name to "mgmt1".

7. During the post-installation configuration after reboot, I disabled both the firewall and SELinux, since I am not interested in testing security features.

8. Restart the virtual machine. By default, network adapter 2, "eth1", is likely to be deactivated.

[root@mgmt1 ~]# uname -r
[root@mgmt1 ~]# ifconfig -a
eth0      Link encap:Ethernet  HWaddr 08:00:27:63:5A:4F 
          inet addr:  Bcast:  Mask:
          inet6 addr: fe80::a00:27ff:fe63:5a4f/64 Scope:Link
          RX packets:89 errors:0 dropped:0 overruns:0 frame:0
          TX packets:100 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:9008 (8.7 KiB)  TX bytes:15225 (14.8 KiB)
          Interrupt:177 Base address:0xc020

eth1      Link encap:Ethernet  HWaddr 08:00:27:95:5E:AE 
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:185 Base address:0xc060

We need to assign a static IP address to eth1 in order to activate it. We can use the Network Configuration tool in GNOME.

Select "eth1" and click "Edit". Then type in a static IP address as below.

Tick the option "Activate device when computer starts".

After that, it can be a bit tricky to find the "OK" button because of the screen resolution. The trick is to hold down the ALT key, which lets you drag the window to a new position.

Click "Activate" to activate eth1.
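If you would rather skip the GUI, the same can be done by editing the interface config file directly. This is only a sketch: the address below is a placeholder, and should be replaced with whatever range you picked for the internal network:

```
# /etc/sysconfig/network-scripts/ifcfg-eth1
# Placeholder address - use your own internal-network range.
DEVICE=eth1
BOOTPROTO=static
IPADDR=192.168.100.1
NETMASK=255.255.255.0
ONBOOT=yes
```

Then restart networking with "service network restart" to bring eth1 up.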

Now we should have both Internet access through eth0 and an internal network on eth1.

[root@mgmt1 ~]# wget www.sun.com
--12:40:24--  http://www.sun.com/
Resolving www.sun.com...
Connecting to www.sun.com||:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `index.html.1'

    [       <=>                              ] 30,441      10.2K/s   in 2.9s  

12:40:34 (10.2 KB/s) - `index.html.1' saved [30441]

9. Enable ssh access to mgmt1

I personally find it very convenient to be able to connect to the virtual machine through SSH. In particular, I can copy and paste long commands into the ssh terminal before installing the Guest Additions.

Open a terminal on the host machine and enter the following commands:

zhiqi@tao:~$ VBoxManage setextradata "CentOS-mgmt1" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guestssh/Protocol" TCP
VirtualBox Command Line Management Interface Version 2.0.4
(C) 2005-2008 Sun Microsystems, Inc.
All rights reserved.

zhiqi@tao:~$ VBoxManage setextradata "CentOS-mgmt1" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guestssh/GuestPort" 22
VirtualBox Command Line Management Interface Version 2.0.4
(C) 2005-2008 Sun Microsystems, Inc.
All rights reserved.

zhiqi@tao:~$ VBoxManage setextradata "CentOS-mgmt1" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guestssh/HostPort" 2301
VirtualBox Command Line Management Interface Version 2.0.4
(C) 2005-2008 Sun Microsystems, Inc.
All rights reserved.

Then we can use ssh and sftp to access the virtual machine "CentOS-mgmt1".

zhiqi@tao:~$ ssh -p 2301 root@localhost
root@localhost's password:
Last login: Wed Nov  5 17:50:01 2008
[root@mgmt1 ~]#

zhiqi@tao:~$ sftp -o port=2301 root@localhost
Connecting to localhost...
root@localhost's password:
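To avoid typing the port every time, an entry in ~/.ssh/config on the host works as well; the alias "mgmt1" here is arbitrary:

```
# ~/.ssh/config on the host machine
Host mgmt1
    HostName localhost
    Port 2301
    User root
```

After that, "ssh mgmt1" and "sftp mgmt1" both pick up the port automatically.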

10. Update to the latest kernel
In order to install Sun HPC stack release 1.1, mgmt1 must run a supported kernel. Red Hat and CentOS have released a kernel update, kernel-2.6.18-92.1.17.el5, and we have been working to support that version in release 1.1.1. Until we release those updates, please update the kernel manually.

[root@mgmt1 ~]# wget http://mirror.pacific.net.au/linux/CentOS/5.2/updates/x86_64/RPMS/kernel-2.6.18-92.1.13.el5.x86_64.rpm

[root@mgmt1 ~]# rpm -ivh kernel-2.6.18-92.1.13.el5.x86_64.rpm

Once the update has finished, reboot mgmt1. The kernel should be 2.6.18-92.1.13.el5:

[root@mgmt1 ~]# uname -r

The virtual machine, "CentOS-mgmt1", is now ready to serve as the head node of the virtual cluster.

2. Install the Sun HPC software stack

1. Mount the Sun HPC 1.1 release ISO on the CD/DVD drive
Once we log in to the GNOME desktop environment, the GNOME automount feature automatically mounts the Sun HPC 1.1 release at /media/sun_hpc_linux.

[root@mgmt1 ~]# ls /media/sun_hpc_linux/
repodata  rhel5.cfg  SunHPC  sun-linux-hpc-1.1beta-Install-Guide.pdf  usb.cfg

2. Install Sun HPC 1.1 release stack

[root@mgmt1 ~]# rpm -ivh /media/sun_hpc_linux/SunHPC/sunhpc-release.rpm
Preparing...                ########################################### [100%]
   1:sunhpc-release         ########################################### [100%]

3. Install software on the head node

[root@mgmt1 ~]# sunhpc_installer

Type "c" to continue the installation.

Note: java-1.4.2-gcj-compat, and any packages that depend on it, will be removed during installation because java-1.4.2-gcj-compat conflicts with the Sun Java Runtime Environment.

The total installation time varies greatly depending on the network connection to the CentOS repository.

3. Prepare the Cobbler provisioning server

Cobbler is a Linux provisioning server that provides tools for automating software installation on large numbers of Linux systems, including PXE configurations and boots, re-installation, and virtualization. For more information about using Cobbler, see https://fedorahosted.org/cobbler .

1. Turn off iptables on the head node.

iptables is disabled so that all devices on the management Ethernet network have full access.

[root@mgmt1 ~]# /etc/init.d/iptables stop

[root@mgmt1 ~]# /etc/init.d/ip6tables stop

[root@mgmt1 ~]# chkconfig --level 2345 iptables off

[root@mgmt1 ~]# chkconfig --level 2345 ip6tables off

[root@mgmt1 ~]# chkconfig --list | grep tables
ip6tables          0:off    1:off    2:off    3:off    4:off    5:off    6:off
iptables           0:off    1:off    2:off    3:off    4:off    5:off    6:off

2. Configure Cobbler

In order to make Cobbler work, we need to configure the Cobbler DHCP server ("server"), TFTP server ("next_server"), and provisioning network interface ("ksdevice") in /etc/cobbler/settings. Since we use the head node (mgmt1) as both the DHCP server and the TFTP server, we assign both "server" and "next_server" the IP address of mgmt1. Since eth0 is the interface to the Internet and eth1 is the interface to the internal network connecting the other nodes in the cluster, we assign "ksdevice" the value "eth1".

[root@mgmt1 ~]# vi /etc/cobbler/settings

ksdevice: eth1
next_server: ''
server: ''

We also need to modify the dhcp.template file for the provisioning network, editing the subnet and netmask fields to match the internal network we chose.

[root@mgmt1 ~]# vi /etc/cobbler/dhcp.template

- snip -
subnet netmask {
     option subnet-mask;
- snip -
option root-path "$next_server:/var/lib/oneSIS/image/centos5.2,v3,tcp,hard";
- snip -
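As a concrete illustration only — the right values depend on the internal network you chose, so the addresses below are placeholders — the edited block might look like:

```
subnet 192.168.100.0 netmask 255.255.255.0 {
     option subnet-mask 255.255.255.0;
     ...
}
```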

4. Provision diskless clients

1. Create a virtual machine, "centos-dllcn001", to serve as a diskless client, and attach its network adapter 1 to the internal network.
- Created an 8G virtual disk for it.
- Tick the "VT-x/AMD-V" feature (refer to Section 1, step 4).
- Make sure centos-dllcn001 boots from the network.
- Change network adapter 1 to Internal Network and write down the MAC address of this adapter, "080027D24CA2", which will be used as "08:00:27:D2:4C:A2" later on.

2. Make sure cobblerd, dhcpd, and httpd are up and running

Cobbler uses the dhcpd service to assign IP addresses to the clients being provisioned and the httpd service to transfer system packages over the network.

[root@mgmt1 ~]# cobbler sync

[root@mgmt1 ~]# /etc/init.d/dhcpd restart

[root@mgmt1 ~]# /etc/init.d/cobblerd restart

[root@mgmt1 ~]# /etc/init.d/httpd restart

3. Create oneSIS image for diskless clients

[root@mgmt1 ~]# onesis_setup --rootfs=/var/lib/oneSIS/image/centos5.2 --config=/usr/share/oneSIS/includes/sysimage.conf.rhel5.2 --exclude=/var/www/cobbler

4. Register the oneSIS root image and create a Cobbler profile for diskless clients.

[root@mgmt1 media]# cobbler distro add --name=onesis_centos5.2 --kernel=/tftpboot/vmlinuz-2.6.18-92.1.13.el5 --initrd=/tftpboot/initrd-2.6.18-92.1.13.el5.img

[root@mgmt1 media]# cobbler profile add --name=lustre_client --distro=onesis_centos5.2 --kopts="selinux=0 root=/dev/nfs"

5. Add the dllcn001 Cobbler system configuration

[root@mgmt1 ~]# cobbler system add --name=dllcn001 --mac=08:00:27:D2:4C:A2 --ip= --hostname=dllcn001 --profile=lustre_client

[root@mgmt1 ~]# cobbler sync

[root@mgmt1 ~]# /etc/init.d/dhcpd restart

[root@mgmt1 ~]# /etc/init.d/cobblerd restart

[root@mgmt1 ~]# /etc/init.d/httpd restart

6. Start centos-dllcn001 with a network boot, and it will automatically come up as a diskless client.

5. Create a oneSIS rootfs image for Lustre server nodes

Lustre server nodes include the MDS (Metadata Server) and OSS (Object Storage Server), which make up a Lustre file system cluster. To use the Lustre file system option with oneSIS as the provisioning system, we need to create a separate oneSIS rootfs image for the Lustre server nodes.

1. Create a copy of the base rootfs and install Lustre server packages

[root@mgmt1 ~]# onesis_lustre_rootfs /var/lib/oneSIS/image/centos5.2 /var/lib/oneSIS/image/centos5.2-lustre

2. Point the root-path option at the oneSIS Lustre image

[root@mgmt1 ~]# vi /etc/cobbler/dhcp.template
- snip -
      #option root-path "$next_server:/var/lib/oneSIS/image/centos5.2,v3,tcp,hard";
        option root-path "$next_server:/var/lib/oneSIS/image/centos5.2-lustre,v3,tcp,hard";
- snip -

3. Create a virtual machine, "centos-dlmds01", as a diskless Lustre server node (similar to Section 4, step 1).
- Created an 8G virtual disk for it.
- Tick the "VT-x/AMD-V" feature.
- Change network adapter 1 to Internal Network and write down the MAC address of this adapter, "080027D89A5C", which will be used as "08:00:27:D8:9A:5C" later on.
- Make sure centos-dlmds01 boots from the network.

4. Add a profile for diskless Lustre servers

[root@mgmt1 tftpboot]# cobbler distro add --name=onesis_centos5.2-lustre --kernel=/tftpboot/vmlinuz-2.6.18-92.1.10.el5_lustre.1.6.6 --initrd=/tftpboot/initrd-2.6.18-92.1.10.el5_lustre.1.6.6.img
[root@mgmt1 tftpboot]# cobbler profile add --name=onesis_lustre_server --distro=onesis_centos5.2-lustre --kopts="selinux=0 root=/dev/nfs"

[root@mgmt1 tftpboot]# cobbler system add --name=dlmds01 --mac=08:00:27:D8:9A:5C --ip= --hostname=dlmds01 --profile=onesis_lustre_server

[root@mgmt1 tftpboot]# cobbler sync

[root@mgmt1 ~]# /etc/init.d/dhcpd restart

[root@mgmt1 ~]# /etc/init.d/cobblerd restart

[root@mgmt1 ~]# /etc/init.d/httpd restart

5. Boot centos-dlmds01

Repeat a similar process to provision more client nodes, and you will be ready to try out the Lustre file system, which powers seven of the ten largest supercomputers on the latest Top500 list, and to test your MPI programs.
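When adding several nodes, a small loop on the head node saves typing. This is just a sketch: the node names and MAC addresses below are placeholders (use each VM's real adapter MAC), and the commands are echoed rather than executed — remove the echo once the output looks right:

```shell
# Print (not run) the cobbler registration command for each extra
# diskless client. Replace the placeholder MACs with the real ones
# noted from each VM's network adapter settings.
for i in 2 3; do
    name=$(printf 'dllcn%03d' "$i")
    mac=$(printf '08:00:27:00:00:%02d' "$i")
    echo cobbler system add --name="$name" --mac="$mac" \
         --hostname="$name" --profile=lustre_client
done
```

Follow up with the usual "cobbler sync" and dhcpd/cobblerd/httpd restarts.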


  • Sun xVM VirtualBox 2.0.4:
The best desktop virtualization software; it supports most popular OS platforms, including 64-bit guest OSs on 64-bit Linux, Solaris, and Windows hosts.

  • Sun HPC Software Linux Edition
An integrated, open-source software solution for Sun HPC clusters that simplifies deployment by providing a ready-made framework of software components for turning a bare-metal system into a running HPC cluster.
  • Sun Lustre Filesystems
An object-based cluster file system that redefines scalability and provides groundbreaking I/O and metadata throughput. The majority of the world's ten largest supercomputers are powered by Lustre.
If there is anyone out there who still puzzles over how to store a single 10-terabyte file, who still struggles to reach several hundred gigabytes per second of read/write throughput, who still suffers from the hidden problems of proprietary file systems: Lustre is your answer. [Yes, I learned this sentence from Obama. :-)]

Change log:

1. It is not necessary to run "yum --exclude=kernel -y update", since whatever packages are needed by the Sun HPC stack will be pulled in via the dependency check. This can save a considerable amount of time when Internet bandwidth is limited. Thanks to Liu Lei for this observation! (Nov 23 2008)

Tuesday Sep 16, 2008

Farewell Campus Ambassador program and move to HPC

It was a very enjoyable experience for me to be part of the ambassador program. The time I spent working with Sun technologies was extremely fulfilling. I do feel that the CA community is a large family and that we look after each other.

I must say, I sincerely appreciate my program manager, Ganesh Hiregoudar, the most. Although I have not yet had the honor of meeting him in person, I learned a great deal from his guidance and wisdom.

I've now joined the Linux HPC stack team full time. If you, or any students or staff at your university, are interested in cluster computing, high-performance computing, or distributed file systems, please do recommend Sun HPC technologies, particularly the Lustre file system, which is the only real competitor to IBM's GPFS. Seven of the world's ten fastest supercomputers are powered by Lustre. And it is a Sun technology, so of course it is open source!


I have passed on the Campus Ambassador role at the University of Melbourne to Mohammed Jubaer Arif. He is very passionate about Sun technologies, and I believe his contribution to the Campus Ambassador program will be exceptional. In the short time he has been in the role, Arif has already made several interesting proposals.

To everyone in the CA program: great job! My best wishes to all of you!

