Tuesday Aug 05, 2008

The hype of the cloud

I just read a candid article in Linux Magazine about the hype of cloud computing and I find it quite funny. A nice read for those who are also caught up in the fad, like me. :)


EDIT: Dell is trying to trademark the term Cloud Computing: http://www.eweek.com/c/a/Infrastructure/Dell-Attempts-to-Trademark-Cloud-Computing/

Friday Aug 01, 2008

Sun Shared Visualization System

Recently, one of the local IT architects asked about the Shared Visualization System and whether Sun Rays can be used to view 3D graphics. The answer is YES! The general idea behind the Shared Visualization solution is that you can have a central pool of graphics resources, which can potentially be a grid of multiple, different systems with accelerated graphics capabilities. Users can then remotely access and share 3D visualization applications on a variety of client platforms.

The main software that enables this is VirtualGL, an open source program which redirects the 3D rendering commands from Unix and Linux OpenGL applications to 3D accelerator hardware in a dedicated server and displays the rendered output interactively on a thin client located elsewhere on the network. The thin client can therefore be a Sun Ray, but I believe that a plug-in needs to be installed on the Sun Ray server. Sun Grid Engine is also part of the software stack, managing access to the graphics resources.

See also the slides by my Sun colleague Torben, who presented the solution at a Grid conference.

Serializing to XML using JAXP DOM API

I was writing some Java code to process some XML data and encountered an annoying problem which took me a while to fix. I want to put this down here in my blog in case I forget.

I'm using the JAXP API to parse my XML into a DOM; I then need to extract a sub-element and serialize it back to XML as a string. It works fine when run as a standalone application, but when I used the code in a servlet deployed in Tomcat I got a "Provider org.apache.xalan.processor.TransformerFactoryImpl not found" exception. Initially I thought that maybe I was missing xalan.jar, but after googling I found a suggestion to set the system property to use another TransformerFactory implementation:
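The exact setting from the original post is not preserved here. As a rough sketch of the general mechanism only: JAXP looks up the factory class through the standard `javax.xml.transform.TransformerFactory` system property. The class name below is an assumption (the XSLTC implementation bundled with the JDK), not necessarily the one the suggestion referred to, and `TransformerFactoryOverride` and `createFactory` are illustrative names.

```java
import javax.xml.transform.TransformerFactory;

class TransformerFactoryOverride {
    // Assumption: use the XSLTC implementation bundled with the JDK.
    // The original post does not say which implementation class it used.
    static final String JDK_IMPL =
            "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl";

    static TransformerFactory createFactory() {
        // Set the property before calling newInstance() so the lookup picks it up.
        System.setProperty("javax.xml.transform.TransformerFactory", JDK_IMPL);
        return TransformerFactory.newInstance();
    }

    public static void main(String[] args) {
        // Print which implementation class was actually loaded.
        System.out.println(createFactory().getClass().getName());
    }
}
```

In a servlet container the same property could also be passed as a `-D` option on the JVM command line, which affects every web application in that JVM.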


If anyone out there is wondering how to get rid of the XML declaration header when serializing to XML, just set the OMIT_XML_DECLARATION output property:

DOMSource domSource = new DOMSource(xmldom);
StreamResult streamResult = new StreamResult(System.out); // print xml to stdout
TransformerFactory tf = TransformerFactory.newInstance();
Transformer serializer = tf.newTransformer();
serializer.setOutputProperty(javax.xml.transform.OutputKeys.OMIT_XML_DECLARATION, "yes");
serializer.transform(domSource, streamResult);
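
For the full round trip described above (parse, pick out a sub-element, serialize it to a string), a minimal self-contained sketch might look like the following. The class name, method name and sample XML are made up for illustration; only standard JAXP calls are used.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SubElementSerializer {
    /** Parses xml, finds the first element named tagName and returns it as XML text. */
    static String serializeFirst(String xml, String tagName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Node node = doc.getElementsByTagName(tagName).item(0);
        if (node == null) {
            throw new IllegalArgumentException("No <" + tagName + "> element found");
        }
        // Identity transform: copies the DOM subtree to the output unchanged.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter(); // collect the XML in memory instead of stdout
        serializer.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document, used only to exercise the method.
        String xml = "<order><item id=\"1\">book</item><item id=\"2\">pen</item></order>";
        System.out.println(serializeFirst(xml, "item"));
    }
}
```

Passing the sub-element (rather than the whole document) to `DOMSource` is what limits the output to that subtree.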

Friday Jun 27, 2008

IDC: Software is #1 roadblock

The recent IDC presentation at the ISC2008 conference stated that software is now the biggest roadblock for HPC users... I couldn't agree more. As clusters grow larger and more complex, better management tools are required. Managing and monitoring an HPC cluster typically involves bits and pieces of many different tools, so setting up and operating the cluster becomes very difficult. Sun has recognized this and is taking a very serious look at it, which is why we have the Sun HPC Software, Linux Edition. It is currently based on CentOS and still needs more work to become a complete, easy-to-use management solution. We have also started the HPC-Stack project under the OpenSolaris community for the OpenSolaris edition of the software stack.

Management software is only one piece of the puzzle. The other piece is development tools and parallel libraries. As processors get more cores and the number of cores per cluster grows, many applications will need to be redesigned, and new programming paradigms are needed to ease development and improve efficiency.


It looks like more and more applications are moving to nVidia GPUs and the CUDA SDK. SciFinance from SciComp is a code synthesis technology for building derivatives pricing and risk models. Just by changing certain keywords, it can generate CUDA-enabled code that is, according to the website, 30X-80X faster than the serial code. Also, check out the CUDA-zone website, which shows many successes in accelerating application performance using CUDA and nVidia's GPUs.

Tuesday Jun 03, 2008

Learning MPI tutorial

I came across a nice and simple tutorial for programmers who want a crash course in MPI; I think it is a good place to start. It does not go into detail about the message passing programming paradigm, but it gives step-by-step procedures for building GCC, setting up the environment variables, installing Open MPI... on a Linux machine. A warning though: some knowledge of Linux/Unix is required. Once you get everything up and running, the tutorial walks you through several examples of increasing complexity. The last example is matrix multiplication, where it teaches how to parallelize the code using MPI.
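
To give a feel for what parallelizing a matrix multiplication involves, here is a rough sketch of one common approach, splitting the rows of the result among workers. It is not MPI code and not taken from the tutorial: plain Java threads stand in for MPI ranks, and all names are illustrative. In a real MPI program each rank would hold only its own rows and the results would be gathered with message passing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RowBlockMatMul {
    // Computes rows [from, to) of c = a * b; this is the share of one worker.
    static void multiplyRows(double[][] a, double[][] b, double[][] c, int from, int to) {
        int inner = b.length;
        int cols = b[0].length;
        for (int i = from; i < to; i++) {
            for (int j = 0; j < cols; j++) {
                double sum = 0;
                for (int k = 0; k < inner; k++) {
                    sum += a[i][k] * b[k][j];
                }
                c[i][j] = sum;
            }
        }
    }

    static double[][] multiply(double[][] a, double[][] b, int workers)
            throws InterruptedException {
        int rows = a.length;
        double[][] c = new double[rows][b[0].length];
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            // Split the rows as evenly as possible; workers never share a row,
            // so no locking is needed on c.
            final int from = w * rows / workers;
            final int to = (w + 1) * rows / workers;
            Thread t = new Thread(() -> multiplyRows(a, b, c, from, to));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join(); // wait for every worker, similar in spirit to a gather step
        }
        return c;
    }

    public static void main(String[] args) throws InterruptedException {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        System.out.println(Arrays.deepToString(multiply(a, b, 2)));
    }
}
```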

See the tutorial here:

Wednesday May 28, 2008

Project Hydrazine

During the recent JavaOne, Sun announced a new project called Project Hydrazine that allows for the rapid creation and deployment of hosted services across multiple device types. This diagram suggests that the new service will be deployed on the existing compute infrastructure of Network.com. I guess that Sun will use it to offer hosting services, something akin to Network.com's Compute Utility except that it will not just be for compute.

Some people (here and here) have referred to it as Sun's new Cloud Computing platform. While the definition of Cloud Computing is subject to much debate, I'm still very excited about it. However, I'm very curious about how Project Caroline fits into all this, assuming there is any relation at all.

Read more about: "Sun's do-it-all cloud" from eWeek.

Video introduction to Project Hydrazine:

Thursday May 15, 2008

Install IPS

For those who want to test out the new OpenSolaris Image Packaging System (IPS) but do not want to install the Indiana builds, here's how you can compile and build the IPS package, SUNWipkg, and install it on a Nevada build. I'm not sure if it will work on Solaris; do let me know if you know.
$ hg clone ssh://anon@hg.opensolaris.org/hg/pkg/gate/
or download source code at: http://src.opensolaris.org/source/xref/pkg/gate/src/
# make
# make install
# cd ../SUNWipkg/../
# pkgadd -d . SUNWipkg
Also, you can find instructions on how to create your own IPS repository here: http://blogs.sun.com/migi/entry/create_your_own_opensolaris_ips2

Tuesday May 13, 2008

OpenSolaris on USB

The Distribution Constructor project includes a few very useful scripts to make the latest OpenSolaris LiveCD build bootable from a USB drive. To get these scripts, download the source by following the instructions here. Naturally, you can only use them on an OpenSolaris build.

Once you have the source, cd to the tools directory and you'll find the usbgen and usbcopy scripts. The usbgen script converts the OpenSolaris ISO image to a UFS image and modifies the GRUB entries.

$ ./usbgen os200805.iso os200805.usb /tmp

Then use the usbcopy script to copy the image to your USB stick.

$ ./usbcopy os200805.usb

This script will format your USB drive, copy the USB image content and install GRUB. Once it completes, you're good to go! Reboot your system and set your BIOS to allow booting from USB. I used a 2GB USB stick on my Acer laptop and OpenSolaris booted up in less than a minute, which I believe is much faster than booting from the CD/DVD. However, the session is not persistent, so any configuration and data will be gone once you reboot. I'm now trying to see how I can modify it to have session persistence. Someone has already come up with a script to add this; I will try it soon.


Thursday May 08, 2008

Mini Solaris on USB

Recently I've been playing around with bootable Solaris on a USB stick. We have a BioLiveCD project and booting from the CD is painfully slow, so I'm exploring having the BioLiveCD on a USB stick. I have played around with Milax, which is a very slim distro based on Nevada... just around 100MB, and it boots up really fast too. Now I'm trying to customize the distro, e.g. adding and removing some of the applications, but I have run into some problems so far. I'm using some of the nice scripts from the Distribution Constructor, which is meant for creating your own OpenSolaris distros based on Indiana. With the new OpenSolaris release and Distribution Constructor, it should be easy to create a slim distro that is bootable from USB. However, I haven't had the time to test it yet.

Here are some of the useful links that I've found so far:
* http://blogs.sun.com/jtc/entry/modifying_and_respinning_a_bootable
* http://solaristhings.blogspot.com/2006/07/how-small-can-you-make-open-solaris.html
* http://www.sun.com/bigadmin/content/submitted/boot_usb_flash.jsp

Monday May 05, 2008

Apache Hadoop

Apache Hadoop is gaining a lot of attention in the web community, especially with support from Yahoo. It has a distributed filesystem and supports data-intensive distributed applications using the MapReduce computational model. It is viewed as an important piece of the puzzle in Cloud computing, but it can also be very useful for data mining applications. I think it won't be long before it catches attention in HPC, if it hasn't yet. With its high scalability and fault-tolerant nature, I think it has a lot of uses in HPC. Given its data-intensive nature, I wonder if there is any value in using Hadoop with Lustre. If anyone has any insight into the I/O characteristics, I'll be glad to hear about it.
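
For readers new to the MapReduce model, here is a minimal word-count sketch in plain Java. It deliberately does not use the Hadoop API; the class and method names are illustrative, and a parallel stream stands in for the distributed map tasks. It only shows the shape of the computation: a map step that emits keys, then a group-and-reduce step that combines the values for each key.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class MapReduceSketch {
    // Map phase: turn one input line into the words it contains (each counts as 1).
    static List<String> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\W+"))
                .filter(w -> !w.isEmpty())
                .collect(Collectors.toList());
    }

    // Shuffle + reduce phase: group identical keys and sum their counts.
    static Map<String, Long> reduce(List<String> keys) {
        return keys.stream()
                .collect(Collectors.groupingBy(w -> w, TreeMap::new, Collectors.counting()));
    }

    static Map<String, Long> wordCount(List<String> lines) {
        // Each line can be mapped independently, which is what makes the model
        // easy to distribute; here a parallel stream plays that role.
        List<String> mapped = lines.parallelStream()
                .flatMap(line -> map(line).stream())
                .collect(Collectors.toList());
        return reduce(mapped);
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("the cloud and the grid", "the grid")));
    }
}
```

In Hadoop itself the input lines would come from files in the distributed filesystem and the grouping would happen across machines, but the two-step structure is the same.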

Friday Apr 25, 2008

China Shanghai ERC 2008

I just want to share the recent events that took place in Shanghai last week. We had an HPC track during the China ERC and also organized 2 workshops - Sun Grid Engine and Lustre - for our customers and partners.

This is me giving the presentation during the Sun Grid Engine workshop. :)

More photos and details of the events here.

Tuesday Apr 22, 2008

Sun's Cloud Computing for SaaS?

Recently there was much discussion about a new research project from Sun Labs. It's called Project Caroline, a new horizontally scalable platform for SaaS that sounds a lot like cloud computing. Just what can Project Caroline do? Quoting from the article "Platform as a Service":

A hosting platform by Project Caroline enables SaaS providers to:

1. Access a wide range of open source tools and resources through high-level abstractions (language-level Virtual Machines, networks, and network-accessible file systems and databases) to increase developer productivity while insulating code from infrastructure changes
2. Launch the service across performance-tuned, load-balanced infrastructure
3. Programmatically allocate, monitor, and control virtualized compute, storage, and networking resources
4. Automate service updates and platform usage dynamically—without human intervention
5. Draw on single-system view of a horizontally scaled pool of resources in order to meet the allocation requests of multiple applications

Project Caroline is open-sourced under the GPLv2 license and the source code can be downloaded from the project website. It is still very much a research project, but I really hope that it will eventually be developed into a real Sun product.

Monday Apr 21, 2008

What is Cloud Computing?

Cloud computing is the current hype. I attended the International Symposium of Grid Computing in Taipei and the Sun China Education and Research conference in Shanghai over the last 2 weeks, and I'm not surprised that Cloud computing was one of the popular topics. Just what is cloud computing? And how is it different from Grid computing? I had the chance to chat with Prof Jin Hai, a highly regarded researcher in the Grid community and chief scientist of ChinaGrid, and learned about his perspective. He explained the 3 key ideas of Cloud computing:

1. Use of virtualization technology to provision and manage compute resources
2. Use of web 2.0 technology to create a dynamic and rich user experience
3. Modularization into small components/packages that makes deployment of Cloud computing technology simple and easy

To me, Cloud computing is just an evolution of Grid computing - from resource-oriented to user-oriented. Traditionally, Grid research has mostly been about 'HOW' - how to manage resources, how to submit jobs, how to stage in data, etc. Now, the focus is shifting to 'WHAT' - what can it do, what is the interface.

Tuesday Apr 08, 2008

Cloud Computing

I have been invited to give a presentation in the industry track of a grid conference in Taipei this week. The theme of the presentation is resource virtualization and on-demand computing, so I prepared slides to introduce our own Sun Grid Compute Utility, better known as Network.com. I also did some brief research on the activities and technologies in the commercial world today, and I found the term cloud computing popping up almost everywhere, just like when grid computing got very hyped up about 5 years ago. The first time I heard the term was when Amazon started their Elastic Compute Cloud (EC2) service. People have explained that cloud computing is the next evolution of grid computing. The main difference? Grid computing is resource oriented, while cloud computing is user oriented, i.e. it is about the interface to these resources. Think of a cloud as a pool of servers: with cloud computing, we only need to care about how to use the services and do not have to care about what is happening in the backend. Grid computing never really took off in the commercial world, but it is said that cloud computing will find success where the grid has failed.

Melvin Koh

