Friday I heard about the Esquire cover for the first time. Wow. It is apparently the most expensive cover spread in history. It was only a matter of time. It's scary to think where this is going. Someday your cereal box will talk to you, even without the help of hallucinogens.
I'm out at the Omniture Summit '09 in Salt Lake City this week, and I'm very pleased to report that it doesn't suck. I was very worried that I was paying three grand for a two-day sales pitch, but I took a leap of faith and registered anyway. Turns out that feedback from previous years' conferences has inspired Omniture to dial back the product pitches and focus on user-oriented content. This year, there has been a wealth of useful information on Internet marketing in general and very few product sales pitches.
And as an added bonus, as I'm writing this post, I just won a Corsair vintage radio from Vintage Tub & Bath for being quick to raise my hand.
The big take-aways for me have been:
I think I finally get Twitter. I signed up for a Twitter account, just to play with it, but I hadn't quite comprehended how Twitter is useful for a company or organization. Twitter has been a big theme at this conference. Everyone is trying to figure out what to do with this untested new marketing channel. Look for more from me there in the near future.
An important point that I missed before is that to be successful in the brave new world of social networking, you have to be a full participant. It's not enough to just broadcast. The communication has to be 2-way.
Building a community is a lot like creating viral media. There's no formula. You can't just create a community. All you can do is seed the ground and hope something grows. You can, however, do a lot to encourage the right things to grow and to help things along. In the end, it really does still come down to content.
Brand has to be pervasive. Martin Lindstrom calls it "smashable brand," meaning that the brand should be recognizable by even the smallest fragment. Is your website obviously yours if you take away the logos and products? His new book, Buy-ology, looks pretty interesting. We all got free copies, so I'll let you know if it was after I've had a chance to read it.
Mobile is the next marketing frontier. Makes me glad I don't have a data plan.
One of the new features coming in Grid Engine 6.2u2 is job submission verification (JSV). The basic idea is that on both the client side and the server side, you have the ability to add scripts that can read through all the job submission options and accept, reject, or modify the job. JSV will open up a whole new world of possibilities that didn't exist before, and it will largely end the need for qsub wrapper scripts.
Because the server-side JSV scripts are executed by the qmaster for every job, there are performance considerations that must be taken into account. In order to limit the performance impact, the qmaster will manage the JSV scripts the same way load sensor scripts are handled, i.e. they are started once and kept alive as a separate process instead of starting them once per job. Nonetheless, what happens inside the scripts can still have a big impact on qmaster performance.
In a test, Roland (who still isn't blogging!) set up some DRMAA submission clients to hammer the master with job submissions. With no server-side JSV scripts, the clients were able to do 900 job submissions per second\*. With a simple server-side Perl JSV script to change the job name, the clients were only able to submit 700 jobs per second. A similar JSV script written in Tcl yielded the same results. With a similar JSV script written in Bourne shell, however, the clients were only able to submit 3 jobs per second. No, that isn't a typo. While languages like Perl and Tcl are able to process numbers and strings natively, Bourne shell has to rely on forking off other commands. Those forks are expensive and, even in a simple JSV script, yield major performance penalties. For these reasons, I actually recommend the Java™ language for writing server-side JSV scripts. Not only do you get access to all the great built-in and external libraries, but you also get access to JGDI, letting you talk to the qmaster without forking an SGE command-line tool. (Thanks to Jython, JRuby, Rhino, et al., you can get the same benefits from languages other than just the Java language.)
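To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes (my simplification, not something Roland measured) that submissions are effectively serialized, so the time spent per job is just the reciprocal of the submission rate. The function and variable names are made up for illustration.

```python
# Submission rates reported in the test above (jobs per second).
RATES = {
    "no JSV": 900.0,
    "Perl or Tcl JSV": 700.0,
    "Bourne shell JSV": 3.0,
}


def added_ms_per_job(baseline_rate, jsv_rate):
    """Extra time per submission in milliseconds.

    Assumes serialized submissions, so time per job = 1 / rate.
    """
    return (1.0 / jsv_rate - 1.0 / baseline_rate) * 1000.0


def report(rates=RATES):
    """Return (name, extra ms per job, slowdown factor) for each JSV variant."""
    baseline = rates["no JSV"]
    rows = []
    for name, rate in rates.items():
        if name == "no JSV":
            continue
        rows.append((name, added_ms_per_job(baseline, rate), baseline / rate))
    return rows


if __name__ == "__main__":
    for name, extra_ms, slowdown in report():
        print(f"{name}: +{extra_ms:.2f} ms per job, {slowdown:.1f}x slower")
```

Under that assumption, the Perl or Tcl script costs roughly a third of a millisecond per job, while the Bourne shell script costs roughly a third of a second per job, a 300x slowdown.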
Let me repeat that point to make sure it comes through loud and clear. If you use a shell script as a server-side JSV script, you will trash your cluster's job submission rate. That's not just for DRMAA jobs or for certain users. That's for the entire cluster.
On the client side, the story is a bit simpler but still similar. For every job submission, each client-side JSV will be started. (An array job counts as a single job in this regard.) That makes sense because qsub is started once for every job submission, and the JSV scripts can't outlive the qsub that launched them.
For DRMAA the implications are a little different. The JSV scripts are still started for every job submission, even though the DRMAA client remains running between submissions. (A DRMAA bulk job is an array job and hence still counts as a single job in this respect.) Roland used DRMAA clients in his test because they're very fast at job submissions. Using client-side JSV scripts affects that in much the same way as on the server side. And as with the server-side scripts, shell scripts have more of an effect than scripts written in a higher-level language. If you figure there's about 200ms overhead for every fork & exec, you could easily add several seconds to each job submission. A DRMAA client without client-side JSV scripts can easily submit over 100 jobs per second. With even a single client-side JSV script that runs no further commands, your submission rate drops to less than 5 jobs per second. Use with caution!
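The arithmetic behind those client-side numbers can be sketched in a few lines of Python. This is only a model: it takes the 200ms fork & exec figure and the 100 jobs per second baseline from above and assumes the costs simply add up. The function name and parameters are hypothetical.

```python
def submissions_per_second(base_rate, forks_per_job, fork_cost_s=0.2):
    """Estimated submission rate once JSV fork & exec overhead is added.

    base_rate     -- jobs per second without any client-side JSV
    forks_per_job -- processes started per submission (the JSV script itself
                     plus any commands it runs)
    fork_cost_s   -- assumed cost of one fork & exec, in seconds
    """
    base_time = 1.0 / base_rate
    return 1.0 / (base_time + forks_per_job * fork_cost_s)


if __name__ == "__main__":
    for forks in (0, 1, 5, 10):
        rate = submissions_per_second(100.0, forks)
        print(f"{forks:2d} forks per job -> {rate:6.2f} jobs per second")
```

With one fork per job (a script that runs no further commands) the model gives about 4.8 jobs per second, which lines up with the "less than 5" figure. A shell script that forks ten helpers would add about two seconds to every submission.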
The JSV feature in 6.2u2 is extremely powerful, but as I've explained, you have to use it with care. When used with appropriate caution, however, JSV provides a fairly easy answer to some of the traditionally thorny issues for SGE administrators.
\* Roland achieved that submission rate with a Sun x4100M2 with two dual core AMD 2.8GHz processors running Solaris 10 as the master.
Some friends and I went skiing this weekend. It was the second ski trip of the season for us. On the first trip, we were all so out of shape that we ended up skiing a difficulty level below what we normally ski. I was so embarrassed by our full-on patheticism, that I went home and started working out in preparation for the next trip.
This time, it was completely different. This weekend, I couldn't find a black run that was challenging enough. (The resort doesn't have double-blacks.) I found myself skiing black runs like I would normally ski an easy blue. I had no idea being in shape could have such a profound effect on my skiing. I always knew I was getting worn out, and that it was affecting my ability to maintain control, but I never realized that I had so much more headroom in my skill if I could just get my out-of-shapeness out of the way. Wow! What a difference.
So, the moral of this story is that fat and lazy make poor partners for athletic activities. Go figure.
This article makes me very sad. First, it makes me sad because of the complete madness of it all. I'm really getting tired of reading about stupid people shooting each other for dumb reasons, especially when it's a kid behind the trigger. I would say that it's time to move back to Europe, but the nonsense is starting to propagate over there, too.
The other reason the article makes me sad is the self-serving, ultimately destructive plea from the defense team. Back when I was in junior high and high school, it was all the rage for the media to claim that role-playing games, like Dungeons & Dragons, made children worship Satan and kill themselves and their parents. Which is, of course, complete nonsense. I spent my entire youth playing every RPG I could get a group to play, and I've never been convicted of any serious crime. Now that RPGs have largely been replaced with MMORPGs (Massively Multi-Player Online RPGs) and other online FPSes (First Person Shooters), the media has turned its attention to video games. Am I honestly to believe that a perfectly normal, sane youth can spend enough hours making his/her thumbs sore that he/she will turn into a raving psychopathic killer? The thing the article leaves out is that this kid would have tried to kill his parents even if he'd spent all those Halo 3 hours watching Barney instead. (Actually, he might have tried sooner!) Unbalanced people do unbalanced things, with or without a video game to blame.
Owen Taylor (formerly) of GigaSpaces has put together an excellent proof of concept using GigaSpaces XAP and Sun Grid Engine. Using Sun Grid Engine, the PoC is able to grow and shrink the size of the GigaSpaces cluster dynamically according to changing load conditions. The PoC monitors GigaSpaces via JMX and then uses DRMAA to submit new instances to SGE or stop existing ones. Read more about it.
The last couple of weeks before the holidays I worked on an interesting project. It involved assembling pretty much everything Sun offers for HPC into a single coherent demo and throwing in Amazon EC2 to boot. This post will explain what I did and how I did it. Let's start at the beginning.
One of the new offerings from Sun is the Sun HPC Software. Beneath the excessively generic name is a complete, integrated stack of HPC software components. Currently there are two editions: the Sun HPC Software, Linux Edition (aka Project Giraffe) and the Sun HPC Software, Solaris Developer Edition. (A Sun HPC Software, Solaris Edition and Sun HPC Software, OpenSolaris Edition will be following shortly.) The Linux edition is exactly what the name implies. It's a full stack of open source HPC tools bundled into a CentOS image, ready to push out to your cluster. The Solaris developer edition is a slightly different animal. It is targeted at developers interested in writing HPC applications for Solaris. The Solaris developer edition is a virtual machine image (available for VMware and VirtualBox) that includes Solaris 10 and a pre-installed suite of Sun's HPC products, including Sun Grid Engine, Sun HPC ClusterTools, Sun Studio, and Sun Visualization, all integrated together.
For this demo, I used the Solaris developer edition. The end goal was to produce a version of the virtual machine image that was capable of automatically borrowing resources from a local pool or from the cloud in order to test or deploy developed HPC applications. Inside the developer edition virtual machine, there are already two Zones that act as virtual execution nodes for testing applications. That's a nice start, but what about testing on real machines or a larger number of machines? That's where the resource borrowing comes in. In the end, I had a VM image that was capable of automatically borrowing and releasing resources first from a local pool and later from the cloud, on demand.
The first step was to get the developer edition running as-is. Sounded simple enough. The first wrinkle was that I was doing this demo on a Mac. The regular VMware Player is not available for Mac, so I had to download an eval copy of VMware Fusion. Once I had Fusion installed, I was able to bring up the developer edition VM without a hitch.
Step 2 was to get the VM networked. The network configuration for the developer edition beta 1 is such that the global and non-global Zones can see each other, but nobody can get into or out of the VM. Getting the networking working was probably the hardest part of the demo, and honestly, I can't tell you how I finally did it. Per the suggestion of the pop-up dialogs from VMware, I installed the VMware Tools in the VM's Solaris instance. That changed the name of the primary interface from pcn0 to vmxnet0, but didn't actually help. Solaris was still unable to plumb the interface. After twiddling the VM's network settings several times and doing several reconfiguration boots, I eventually ended up with a working vmxnet1 interface (and a dead pcn0 and vmxnet0). As usual in such adventures, I'd swear that the last thing I did before it started working should not have had any appreciable effect. Oh, well. It worked, and I wasn't interested in understanding why.
Now that I had a functional network interface, the next step was to reinstall the Sun Grid Engine product. The VM comes with a preinstalled instance, but this demo requires features not enabled in a default installation like the one the VM provides. I left the original cell (default) intact and installed a new cell (hpc) with the -jmx and -csp options. -jmx enables the Java thread in the qmaster that serves up the JGDI API over JMX. I needed JGDI so that the demo GUI that I was building could receive event updates from the qmaster about job and host changes. With Sun Grid Engine 6.2, I was unable to successfully connect to the JMX server unless I installed the qmaster with certificate-based security, hence the -csp option. After the installation was complete, I then had to do the usual CSP certificate juggling, plus a new wrinkle. In order to connect to the JMX server, I also had to create a keystore for the connecting user with $SGE_ROOT/util/sgeCA/sge_ca -ks <user>. There's a quirk to the sge_ca -ks command, though. By default, it fails, explaining that it can't find the certificates. The reason is that the path to the certificates is hard-coded in the sge_ca script to a ridiculous default value. To change it to the correct value, I had to use the -calocaltop switch. After the certificates were squared away, I installed execution daemons in both Zones. At least that part was easy.
The next thing I did was to create some more Zones. Yes, I know this demo was supposed to be using real machines from a local pool and the cloud. Because it's a demo on a laptop, the "local machines" had to be equally portable. Because of firewall issues, I also wanted to have a backup for the cloud. In an effort to be clever, I moved the file systems for the two existing Zones onto their own ZFS volumes. I wanted to create the new Zones as cloned snapshots of the old Zones. Unfortunately, it turns out that even though the man page for zfs(1M) says that it's possible, the version of Solaris installed in the VM is the last version on which it isn't possible. After chasing my tail a bit, I decided to just do it the old-fashioned way instead of trying to force the newfangled way to work.
Now that I had six non-global Zones running, the next step was to get Service Domain Manager installed. It is neither installed nor included in the developer edition VM, so I had to scp it over from my desktop. Technically, I could probably have managed to download it directly from the VM, but I had already downloaded it to my desktop before I started. For the Service Domain Manager installation, I followed Chansup's blog rather than the documentation. Chansup's blog posts detail exactly what steps to follow without the distraction of all the other possibilities that the docs explain. Following the steps in the blog, I was able to get the Service Domain Manager master and agents installed with little difficulty. The hardest part is that the sdmadm command has extremely complicated syntax, and it took a while before I could execute a command without having the docs or blog in front of me as a reference. To prove that the installation worked, I manually forced Service Domain Manager to add one of the new Zones to the existing Sun Grid Engine cluster, and much to my shock and wonderment, it worked.
The last step of VM (re)configuration was to configure the Service Domain Manager with a local spare pool and a cloud spare pool and a set of policies to govern when resources should be moved around. This step proved about as tricky as I expected. As one of the original architects and developers of the product, I had a good idea of what I wanted to do and how to make it happen, but the syntax and the details were still problematic. The syntax was the first hurdle. The docs have issues with both understandability and accuracy, and Chansup's blog was too narrowly focused for my purposes. After I poked around a bit, I figured out how to do what I wanted, but actually doing it was the next challenge. What I wanted to do was create two MaxPendingJobsSLO's...
We interrupt your regularly scheduled blog post to bring you a public service announcement. Please, for your own well being and the well being of others who might use your software, test all of your code contributions thoroughly on all supported platforms, and have them reviewed by an experienced member of the development team before committing, especially if you're working on the Firefox source base. This point in the blog post is the last time I saved my text before completing the post. Before I could save it again, Firefox segfaulted, causing me to lose a significant amount of work. What follows is a downtrodden, half-hearted attempt to complete the post again. We now return you to your regularly scheduled blog post.
What I wanted to do was create two MaxPendingJobsSLO's for the Sun Grid Engine instance. The first would post a moderate need (50) when the pending job list was more than 6 jobs long. The second would post a high need (99) when the pending job list was more than 12 jobs long. I also wanted to have a local spare pool with a low (20) PermanentRequestSLO and a low FixedUsageSLO, and a cloud spare pool with a moderate (60) PermanentRequestSLO and a moderate FixedUsageSLO. The idea was that when the Sun Grid Engine cluster was idle, all the resources would stay where they were. When the pending job list was longer than 6 jobs, resources would be taken from the local spare pool. When the pending job list was longer than 12 jobs, additional resources would be taken from the cloud spare pool. When the pending job list grew shorter, the resources would be returned to their spare pools. In theory. (The philosophy of setting up Service Domain Manager SLOs is a full topic unto itself and will have to wait for another blog post.)
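The intended policy is easier to see as a toy model. The Python below is not Service Domain Manager configuration or its API. It only encodes the thresholds and need values from the paragraph above, plus one simplifying assumption of mine: a pool gives up resources whenever the need posted by the cluster is higher than the pool's own need. All names are hypothetical.

```python
# (pending-job threshold, need posted), highest threshold first.
CLUSTER_SLOS = [(12, 99), (6, 50)]

# Need values attached to each spare pool in the plan above.
POOL_NEEDS = {"local": 20, "cloud": 60}


def cluster_need(pending_jobs):
    """Need the cluster would post for a given pending job list length."""
    for threshold, need in CLUSTER_SLOS:
        if pending_jobs > threshold:
            return need
    return 0


def pools_to_borrow_from(pending_jobs):
    """Pools whose need is lower than the cluster's current need."""
    need = cluster_need(pending_jobs)
    return sorted(name for name, pool_need in POOL_NEEDS.items()
                  if need > pool_need)


if __name__ == "__main__":
    for pending in (0, 6, 7, 12, 13):
        print(pending, cluster_need(pending), pools_to_borrow_from(pending))
```

In this model, 7 pending jobs pull resources only from the local pool, and 13 pending jobs pull from both pools, which is the behavior I was after.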
The first problem I ran into was that Service Domain Manager does not allow a spare pool to have a FixedUsageSLO. An issue has been filed for the problem, but that didn't help me set up the demo. The result was that I had no way to force Service Domain Manager to take the local spare pool resources before the cloud spare pool resources. The best I could do was set the averageSlotsPerHost value for the MaxPendingJobsSLO's to a high number so that Service Domain Manager would only take hosts one at a time, rather than one from each spare pool simultaneously.
The next problem was quite unexpected. With the SLOs in place, I submitted an array job with 100 tasks. I waited. Nothing happened. I waited some more. Still nothing happened. It turns out that the MaxPendingJobsSLO only counts whole jobs, not job tasks like DRMAA would. The work-around was easy. I just had to be sure the demo submitted enough individual jobs instead of relying on array tasks.
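The difference between counting jobs and counting tasks is small enough to show in a few lines of Python. This is just an illustration of the counting rule described above, using the 6-job threshold from my first SLO. The data layout (a list of task counts, one entry per job) is made up.

```python
THRESHOLD = 6  # pending jobs needed before the first SLO posts a need


def pending_jobs(task_counts):
    """Count whole jobs, which is what the SLO looks at."""
    return len(task_counts)


def pending_tasks(task_counts):
    """Count individual tasks, which is what I had expected."""
    return sum(task_counts)


array_job = [100]        # one array job with 100 tasks
individual = [1] * 8     # eight separate single-task jobs

if __name__ == "__main__":
    for label, jobs in (("array job", array_job), ("individual", individual)):
        triggered = pending_jobs(jobs) > THRESHOLD
        print(f"{label}: {pending_jobs(jobs)} jobs, "
              f"{pending_tasks(jobs)} tasks, SLO triggered: {triggered}")
```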
The last problem was one that I had been expecting. After a long pending job list had caused Service Domain Manager to assign all the available resources to the cluster, when the pending job list went to zero, the borrowed resources didn't always end up where they started. Service Domain Manager does not track the origin of resources. Fortunately, the issue is resolved by an easy idiom. I created a source property for every resource, and I set the value of the property to either "cloud", "spare", or "sge". I then set up the spare pools' PermanentRequestSLO's to only request resources with appropriate source settings. I also added a MinResourceSLO for the cluster that wants at least 2 resources that didn't come from a spare pool, just to be complete.
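Here is the source-property idiom as a toy Python model. It is not how Service Domain Manager stores or filters resources. It only shows why tagging each resource with where it came from is enough to get everything back home. The host names and the `Resource` class are hypothetical.

```python
from dataclasses import dataclass

# Where a resource belongs, keyed by the value of its source property.
HOME_BY_SOURCE = {
    "cloud": "cloud spare pool",
    "spare": "local spare pool",
    "sge": "cluster",
}


@dataclass(frozen=True)
class Resource:
    name: str
    source: str  # "cloud", "spare", or "sge"


def send_home(resources):
    """Group resource names by the place their source property points to."""
    placement = {home: [] for home in HOME_BY_SOURCE.values()}
    for res in resources:
        placement[HOME_BY_SOURCE[res.source]].append(res.name)
    return placement


if __name__ == "__main__":
    borrowed = [
        Resource("zone1", "sge"),
        Resource("zone3", "spare"),
        Resource("ec2-a", "cloud"),
        Resource("zone4", "spare"),
    ]
    for home, names in send_home(borrowed).items():
        print(f"{home}: {names}")
```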
With the SLOs in place, the configuration actually did what it was supposed to. When the cluster had enough pending jobs, hosts were borrowed first from the local spare pool and then from the cloud. When the pending jobs were processed, the resources went back to the appropriate spare pools. To make the configuration more demo-friendly, I changed the sloUpdateInterval for the Sun Grid Engine instance to a few seconds (from the default of a few minutes). I also changed the quantity for the spare pools' PermanentRequestSLO's to 1 so that they would only reclaim their resources one at a time, rather than all at once. With those last changes made, I was ready to move on to the UI.
The idea of the demo was to present a clear graphical representation of what was going on with Sun Grid Engine and Service Domain Manager. From past experience building a similar demo for SuperComputing, I knew that JavaFX™ Script was the best tool for the job. (OK. It's not the best tool for the job in a general sense, but I'm a long-time Java™ geek, I don't know Flash, and I didn't have any budget to buy tools. Under those constraints, it was the best I could do.) Before I could get to building the UI, though, I first needed a JGDI shim to talk to the qmaster. Richard kindly provided me with some JGDI sample code, and from there it was pretty easy. The hardest part was figuring out what the events actually meant. In the end, my shim registered for job add events (to recognize job submissions), task modified events (to recognize job tasks being scheduled), and job deleted events (to recognize job completions). It also registered for host added and deleted events to recognize when Service Domain Manager reassigned a host.
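For the curious, the bookkeeping the shim did can be modeled roughly like this. The real shim was written against JGDI in Java. The Python below is not the JGDI API. The event names are labels I made up, and it assumes each job has a single task, so a task modified event always means "a pending job started running."

```python
from dataclasses import dataclass, field


@dataclass
class DemoState:
    pending: int = 0
    running: int = 0
    hosts: set = field(default_factory=set)


def apply_event(state, kind, payload=None):
    """Update the UI state for one event (hypothetical event names)."""
    if kind == "job_added":          # a job was submitted
        state.pending += 1
    elif kind == "task_modified":    # a task was scheduled
        state.pending -= 1
        state.running += 1
    elif kind == "job_deleted":      # a job finished
        state.running -= 1
    elif kind == "host_added":       # a host was assigned to the cluster
        state.hosts.add(payload)
    elif kind == "host_deleted":     # a host was taken away again
        state.hosts.discard(payload)
    else:
        raise ValueError(f"unknown event kind: {kind}")
    return state


if __name__ == "__main__":
    state = DemoState()
    for kind, payload in [("job_added", None), ("host_added", "zone3"),
                          ("task_modified", None), ("job_deleted", None)]:
        apply_event(state, kind, payload)
    print(state)
```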
With the shim working smoothly, I turned to the actual UI. Given the complexity of the animations that I wanted to do, it was shockingly simple to achieve with JavaFX Script, especially considering that there was not yet a graphical tool equivalent to Matisse for Swing. Every bit of it was hand-coded, but it still was fast, easy, and came out looking great. In the end, the whole UI, counting the shim, was about 1500 lines of code, and about 500 lines of that was the shim. (JGDI is rather verbose, especially when establishing a connection to the qmaster.)
And with that, I ran out of time. The next step would have been to actually populate the cloud spare pool with machines provisioned from the cloud. Torsten graciously provided me a Solaris AMI that included Sun Grid Engine and Service Domain Manager. The plan was to pre-provision two hosts to populate the pool and then create a script that would provision an additional host each time the cloud pool dropped below two hosts and release a host every time it grew larger than two hosts. Now that the demo has been presented, the pressure is off, and other things are higher priority. I do plan, however, to eventually come back and put the last piece of the puzzle in place.
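Since I never got to write that script, here is a minimal Python sketch of the logic I had in mind. The `provision` and `release` callbacks are placeholders for whatever would actually start or terminate a cloud host. Nothing here talks to EC2 or to Service Domain Manager.

```python
TARGET = 2  # number of hosts the cloud spare pool should hold


def next_action(pool_size, target=TARGET):
    """Decide what to do for the current pool size, one host at a time."""
    if pool_size < target:
        return "provision"
    if pool_size > target:
        return "release"
    return "hold"


def settle(pool_size, provision, release, target=TARGET):
    """Apply one-host steps until the pool is back at the target size.

    Returns the final pool size and the list of actions taken.
    """
    actions = []
    while True:
        action = next_action(pool_size, target)
        if action == "hold":
            return pool_size, actions
        if action == "provision":
            provision()
            pool_size += 1
        else:
            release()
            pool_size -= 1
        actions.append(action)


if __name__ == "__main__":
    size, actions = settle(0, lambda: print("start a host"),
                           lambda: print("stop a host"))
    print(size, actions)
```

A real version would poll the pool size periodically instead of looping until settled, and would need to handle provisioning failures.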
Below is a video of the demo, showing how jobs can be submitted from the Sun Studio IDE, and how Sun Grid Engine and Service Domain Manager work together with the local spare pool and the cloud to handle the workload. The job that is being submitted is a short script that submits eight sleeper jobs. Because the MaxPendingJobsSLO ignores array tasks, I needed to submit a bunch of individual jobs, but I didn't want to have to click the submit button multiple times in the demo.
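The submit script itself isn't shown in the video, so here is a Python stand-in for the idea. It assumes the sleeper example script that ships with Sun Grid Engine lives at $SGE_ROOT/examples/jobs/sleeper.sh and takes a sleep time in seconds. The job names, the 60 second sleep, and the /opt/sge fallback are all made up for illustration. By default it only prints the commands.

```python
import os
import subprocess


def sleeper_commands(count=8, seconds=60, sge_root=None):
    """Build one qsub command line per individual sleeper job."""
    # /opt/sge is only a placeholder fallback, not a known install path.
    root = sge_root or os.environ.get("SGE_ROOT", "/opt/sge")
    script = os.path.join(root, "examples", "jobs", "sleeper.sh")
    return [["qsub", "-N", f"sleeper{i}", script, str(seconds)]
            for i in range(1, count + 1)]


def submit_all(commands, dry_run=True):
    """Print the commands, or run them with qsub if dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    submit_all(sleeper_commands())
```

Eight separate submissions put eight jobs on the pending list, which is enough to cross the 6-job threshold with a single click.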
Filming the video turned out to be an interesting challenge unto itself. I did the screencap using Snapz Pro on the Mac. It has no problem with JavaFX Script or with VMware VMs, but it apparently can't film JavaFX Script running inside a VMware VM. I ended up having to twiddle the UI a bit so that I could run it directly on the Mac. That's why in the demo, when I switch from Sun Studio to the UI, I swap Mac desktops instead of Solaris workspaces. The voice over and zooming effects are courtesy of Final Cut, by the way.
Wondering what to get for that special someone who has everything? How about a sneak peek at soon-to-be-released Sun Grid Engine 6.2 update 2? That's right! Nothing says 'I love you,' like the SGE 6.2u2 Beta, and it's available just in time for the holidays. It makes a great stocking stuffer, and it's fun for the whole family. Download the SGE6.2u2 Beta today!
Like millions of other people out there, my first computer was a Commodore 64. (A C64c to be specific.) Apparently like millions of other people out there, I still think it was a great computer, and I still have one stuffed in the shed that I keep promising I'll pull out one day and use again. (It's actually one I picked up on eBay a while back. My original one exploded.) Back in college, I had replaced the bootup logo for my copy of Windows 95 with the Commodore startup screen.
There was a semi-recent post to TechCrunchIT by Steve Gilmore about how software companies are transitioning from big bang product releases to a rolling thunder model. In that post, Steve includes a video interview with Jonathan Schwartz. OK, it's really a puppet of Jonathan, and it's highly entertaining. I don't want to spoil the video for you, so I'll avoid details, but in the video Jonathan tells people to go to sun.com/ponytail. Well, I did, and all I got was a 404. Pretty lame. So, I contacted the web team that manages sun.com and suggested that they put something at that address. Lo and behold, they did. It's a redirect to sun.com/software/opensource. It's really encouraging to me to see that Sun still has a sense of humor. Sometimes I wonder...
There's a new Grid Engine blog aggregator on planets.sun.com. The idea is to capture all of the relevant Grid Engine blogs in a single place for easy access. It's similar to the aggregator on the OpenSolaris HPC Community site, except that the HPC one also contains general HPC blogs and blogs on other Sun HPC products. If you have suggestions for a blog that should be included in either, let me know.