Tickless Clock for OpenSolaris
By Josh Simons on Apr 15, 2009
I've been talking a lot to people about the convergence we see happening between Enterprise and HPC IT requirements and how developments in each area can bring real benefits to the other. I should probably do an entire blog entry on specific aspects of this convergence, but for now I'd like to talk about the Tickless Clock OpenSolaris project.
Tickless kernel architectures will be familiar to HPC experts as one method for reducing application jitter on large clusters. For those not familiar with the issue, "jitter" refers to variability in the running time of application code due to underlying kernel activity, daemons, and other stray workloads. Since MPI programs typically run in alternating compute and communication phases and develop a natural synchronization as they do so, applications can be slowed down significantly when some nodes arrive late at these synchronization points. The larger the MPI job, the more likely it is that this type of noise will cause a problem, because every phase waits for its slowest node. Measurements have shown surprisingly large slowdowns associated with jitter.
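To make the scaling argument concrete, here is a toy Python model of a job that alternates compute and synchronization phases. It is only a sketch under made-up assumptions: each node independently suffers a fixed delay (noise_ms) with a small probability (p_noise) in each phase, and every phase ends when the slowest node arrives. The function names and all parameter values are hypothetical and are not taken from any real measurement.

```python
import random


def phase_hit_probability(nodes, p_noise):
    """Probability that at least one of `nodes` is delayed in a given phase."""
    return 1.0 - (1.0 - p_noise) ** nodes


def simulate_job(nodes, phases, compute_ms, p_noise, noise_ms, seed=0):
    """Total run time (ms) of a phase-synchronized job under the toy noise model."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    total = 0.0
    for _ in range(phases):
        # Every node computes for compute_ms; an unlucky node also loses noise_ms.
        slowest = max(
            compute_ms + (noise_ms if rng.random() < p_noise else 0.0)
            for _ in range(nodes)
        )
        # The synchronization point completes only when the slowest node arrives.
        total += slowest
    return total


if __name__ == "__main__":
    # Hypothetical numbers: 1 ms compute phases, 0.1% chance of a 0.5 ms delay.
    for n in (1, 16, 256, 4096):
        hit = phase_hit_probability(n, 0.001)
        run = simulate_job(n, phases=200, compute_ms=1.0, p_noise=0.001, noise_ms=0.5)
        print(f"nodes={n:5d}  P(phase delayed)={hit:.3f}  total={run:.1f} ms")
```

The point of the model is that a delay which is rare on any single node becomes close to certain somewhere in the job once the node count is large, so nearly every phase pays the full noise penalty.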
Jitter can be lessened by reducing the number of daemons running on a system, by turning off all non-essential kernel services, etc. Even with these changes, however, there are other sources of jitter. One notable source is the clock interrupt used in virtually all current operating systems. This interrupt, which typically fires 100 times per second, is used to periodically perform housekeeping chores required by the OS, and it is a known contributor to jitter. It is for this reason that IBM has implemented a tickless kernel on their Blue Gene systems.
Sun is starting a Tickless Clock project in OpenSolaris to completely remove the clock interrupt and switch to an event-based architecture for OpenSolaris. While I expect this will be very useful for HPC users of OpenSolaris, HPC is not the primary motivator of this project.
As you'll hear in the video interview with Eric Saxe, Senior Staff Engineer in Sun's Kernel Engineering group, the primary reasons he is looking at Tickless Clock are power management and virtualization. For power management, it is important that when the system is idle, it really IS idle and not waking up 100 times per second to do nothing, since this wastes power and prevents the system from entering deeper power-saving states. For virtualization, since multiple OS instances may share the same physical server resources, it is important that guest OSes that are idle really do stay idle. Again, waking up 100 times per second to do nothing will steal cycles from active guest OS instances, thereby reducing performance in a virtualized environment.
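The difference between a periodic tick and an event-based clock on an idle system can be sketched in a few lines of Python. This is an illustration of the general idea only, not the OpenSolaris design: the function names are hypothetical, the timer list is invented, and a real kernel would program a hardware timer rather than pop values from a heap.

```python
import heapq

HZ = 100  # tick rate quoted in the post: 100 clock interrupts per second


def ticked_wakeups(duration_s, hz=HZ):
    """Periodic clock: the CPU wakes on every tick, whether or not work is due."""
    return int(duration_s * hz)


def tickless_wakeups(deadlines, duration_s):
    """Event-based clock: the CPU wakes only when a pending timer expires."""
    queue = [d for d in deadlines if d <= duration_s]
    heapq.heapify(queue)
    wakeups = 0
    while queue:
        now = heapq.heappop(queue)  # sleep until the earliest deadline
        wakeups += 1
        # Timers that expire at the same instant are handled in the same wakeup.
        while queue and queue[0] == now:
            heapq.heappop(queue)
    return wakeups


if __name__ == "__main__":
    # Hypothetical idle system: three timers pending, one beyond the 10 s window.
    timers = [2.5, 7.0, 30.0]
    print("ticked:  ", ticked_wakeups(10))
    print("tickless:", tickless_wakeups(timers, 10))
```

In this sketch the ticked system wakes on every tick regardless of pending work, while the event-based one wakes once per distinct deadline, which is what lets an idle CPU or an idle guest OS actually stay idle.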
I would argue that both power management and virtualization will become increasingly important to HPC users (more of that convergence thing). Even so, it is interesting to me that these traditional enterprise issues are stimulating new projects that will benefit both enterprise and HPC customers in the future.
Interested in getting involved with implementing a tickless architecture for OpenSolaris? The project page is here.