And now for something virtually different...
By Jsavit-Oracle on Jan 27, 2007
Traditional virtual machine implementations timeslice physical CPUs among multiple virtual machines, intercepting instructions that change system state or do I/O and emulating them as needed. This reflects a historical design in which physical CPUs are relatively rare and expensive (hence must be time-multiplexed), and in which state-changing events for one virtual machine must not affect others (hence guests must run without full machine access privileges, and such functions must be trapped and emulated). As I've been outlining, this is complicated and expensive. Even simple timeslicing between virtual machines can cost hundreds of clock cycles, because cache and TLB contents have to be discarded.
The T1 chip in Sun's "Niagara"-based systems (T1000, T2000, and others to come) turns the assumption of expensive/rare CPUs upside down. This processor's Chip Multi Threading (CMT) design provides up to 32 logical CPUs ("strands") in a low-cost server that occupies 1 or 2 rack units. Now, CPU strands are plentiful and cheap. Instead of timeslicing a few CPUs between VMs, just give each virtual machine one or more dedicated logical CPUs for its own use. That is the basis of logical domains (LDoms): every domain has its own assigned CPUs (one strand out of 32 is roughly 3% of the box's CPU count, so that is the granularity), which can be dynamically added to or removed from a Solaris instance. Each domain also has its complement of disk, network, and cryptographic assets. Everything is assigned by a control domain, and virtual network and disk I/O is provided by bridged access service domains.
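To make the "dedicated strands" idea concrete, here is a minimal sketch in Python of handing out non-overlapping strands to domains and computing each domain's share of the box. The function names, domain names, and strand counts are all made up for illustration; this is not the LDoms management interface, just the arithmetic behind the 3% granularity.

```python
TOTAL_STRANDS = 32  # up to 32 logical CPUs on a T1, per the text above


def assign_strands(requests, total=TOTAL_STRANDS):
    """Give each domain its own dedicated, non-overlapping strand IDs.

    `requests` maps a (hypothetical) domain name to a strand count.
    """
    if any(count < 1 for count in requests.values()):
        raise ValueError("every domain needs at least one strand")
    if sum(requests.values()) > total:
        raise ValueError("not enough strands for the requested domains")
    assignment, next_free = {}, 0
    for domain, count in requests.items():
        assignment[domain] = list(range(next_free, next_free + count))
        next_free += count
    return assignment


def share_percent(strands, total=TOTAL_STRANDS):
    """Percentage of the box's logical CPUs held by one domain."""
    return 100.0 * len(strands) / total


# Illustrative layout only; the domain names and sizes are assumptions.
domains = assign_strands({"control": 4, "web": 8, "db": 16})
for name, strands in domains.items():
    print(f"{name}: strands {strands[0]}-{strands[-1]}, "
          f"{share_percent(strands):.3f}% of the box")
```

No strand appears in two domains, which is the whole point: nothing is timesliced, so a domain's share is simply its strand count divided by 32.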
This gives us several important benefits right away. First, since each domain has its own logical CPUs, it can change its state (such as enabling or disabling interrupts) without causing a trap and emulation. After all, it owns the CPU and its interrupt mask all by itself. That can save thousands of context switches per second. Second, since each CPU strand has its own private context in hardware, the T1000/T2000 can switch between domains in a single clock cycle, not the several hundred needed for most virtual machines.
Typically such a switch happens when a domain references memory that is not currently in cache. Fetching contents from RAM to the processor (all vendors' processors, not just this one!) can take many clock cycles, during which a logical CPU stalls on the single instruction that caused the cache miss. On most existing CPUs that is simply "dead time". By switching to another CPU strand on the same physical CPU core, the T1000/T2000 lets another logical CPU continue processing instructions during that time. This is the essence of CMT's "Throughput Computing" that makes the T1 chip so powerful.
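A toy model shows why this matters. The Python sketch below simulates one core that issues at most one instruction per cycle and rotates round-robin among whichever strands are not waiting on memory. It is a deliberately crude illustration, not a model of the real T1 pipeline: the cycle counts are invented, and every strand is assumed to run the same pattern of a few instructions followed by a cache miss.

```python
def core_utilization(num_strands, compute_cycles, miss_cycles,
                     total_cycles=10_000):
    """Fraction of cycles in which the core issues an instruction.

    Each strand runs `compute_cycles` instructions, then stalls for
    `miss_cycles` on a cache miss, and repeats. All numbers are
    illustrative assumptions.
    """
    work_left = [compute_cycles] * num_strands  # instructions until next miss
    ready_at = [0] * num_strands                # cycle when each stall ends
    busy = 0
    last = -1                                   # strand that issued most recently
    for cycle in range(total_cycles):
        # Round-robin: try the strands after `last`, take the first ready one.
        for offset in range(1, num_strands + 1):
            s = (last + offset) % num_strands
            if ready_at[s] <= cycle:
                busy += 1
                work_left[s] -= 1
                if work_left[s] == 0:           # this instruction misses cache
                    ready_at[s] = cycle + 1 + miss_cycles
                    work_left[s] = compute_cycles
                last = s
                break
        # If no strand was ready, the cycle is dead time for the whole core.
    return busy / total_cycles


one = core_utilization(num_strands=1, compute_cycles=10, miss_cycles=30)
four = core_utilization(num_strands=4, compute_cycles=10, miss_cycles=30)
print(f"1 strand per core:  {one:.0%} of cycles do useful work")
print(f"4 strands per core: {four:.0%} of cycles do useful work")
```

With a single strand the core sits idle for every cycle of every miss. With several strands, some of those cycles go to other strands' instructions, although in this simple model the strands can still end up stalled at the same time, so the core does not reach full utilization.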
Next time, some more information on how LDoms works and is used.