SPARC or INTEL ?
By Karim Berrah on Dec 17, 2009
As INTEL CPUs become more and more powerful, with more cores (real CPUs with their own resources), Simultaneous Multi-Threading (more than one virtual CPU per core, with threads sharing the core's resources), an embedded memory controller (NEHALEM), three DDR3 channels per socket (increasing the memory bandwidth per socket) and scalability up to 2 sockets (with 4 sockets and above as the next step), some people may wonder whether it still makes sense to run business applications on the SPARC architecture.
So, here are a few remarks to consider. They probably don't apply equally to everybody in the same enterprise, but ...
As most of you certainly already know, the INTEL development strategy is Tick-Tock: every two years a shrink derivative of the previous core architecture appears (for example from 45nm to 32nm), bringing among other things energy-efficiency improvements, and then a new micro-architecture is developed (like the new NEHALEM platform). So, every two years, there is a new CPU for your business application platform. Wow, that's great. Hence, most people keep the following points in mind:
- INTEL machines are less expensive than SPARC machines
- INTEL machines are faster than SPARC machines
Facts with INTEL NEHALEM:
- Two INTEL CPUs of the same architecture but with different frequencies cannot work together on the same platform (no frequency mixing within a CPU generation; check point 2 on page 26 of the Intel® Xeon® Processor 5500 Series Datasheet, Volume 1). Even the latest Intel Itanium 9300 (see point 1.3) does not support mixing various cache sizes or frequencies in the same system.
- A new INTEL CPU generation is not compatible with the previous one (no generation mixing)
- Starting with NEHALEM, the memory capacity of a system has to grow with the number of CPUs (the memory controllers are on the CPUs)
- The lifetime of an INTEL CPU is much shorter than that of a SPARC CPU (try to upgrade a server built around an INTEL XEON 5200, announced less than two years ago). So an INTEL system might be less expensive at acquisition time, but you'll have to refresh the whole system to match the business needs; you'll hear: "we will implement new software modules next year, so we will need more cores and double the memory in the next six months".
- An INTEL CPU for a 2-socket system is in fact different from an INTEL CPU for a 4-socket (or upcoming 8-socket) platform (5500 series vs 74xx/75xx series). Try to do a VMotion from a 2-socket system to a 4-socket system with VMware ... And the CPU differs not only in packaging (bus frequency/number of QPI links) but also in price (compare a 2.66 GHz quad-core socket in the 5x00 series vs the 7x00 series).
- If an INTEL NEHALEM supports various clock speeds and is energy efficient, why are there so many clock/power variants of it (check page 14 of the INTEL roadmap)? I would logically think that low-, medium- and high-frequency versions should cover all business needs ...
- Whatever consultants tell you regarding TCO, it's not the box price that matters: it's the price you pay for a given performance, plus the cost of the complexity you'll have to manage (manpower needed to manage, implement/integrate with the existing boxes, control, change, upgrade and interconnect, plus the cost of powering and cooling). And when you compare performance, use figures that represent as closely as possible the behaviour of your own application (transactional, computational, network-oriented, memory-access patterns).
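To make that point concrete, here is a back-of-the-envelope sketch in Python (with purely made-up placeholder numbers, not vendor figures) of why the box price alone is misleading:

```python
# Hypothetical TCO sketch: compare two systems on price per unit of
# application-relevant performance, not on box price alone.
# All numbers are made-up placeholders, not vendor figures.

def three_year_tco_per_perf(box_price, perf_score, yearly_ops_cost, yearly_power_cooling):
    """Total 3-year cost divided by the performance your workload actually sees."""
    total = box_price + 3 * (yearly_ops_cost + yearly_power_cooling)
    return total / perf_score

# A "cheap" box with lower workload performance...
cheap = three_year_tco_per_perf(box_price=10_000, perf_score=100,
                                yearly_ops_cost=6_000, yearly_power_cooling=1_500)
# ...versus a pricier box that performs better on the same workload.
big = three_year_tco_per_perf(box_price=25_000, perf_score=300,
                              yearly_ops_cost=6_000, yearly_power_cooling=2_000)

print(f"cost per performance unit: cheap={cheap:.0f}, big={big:.0f}")
```

With these (invented) figures the more expensive box ends up costing roughly half as much per unit of delivered performance over three years, which is exactly the kind of comparison the box price hides.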
Facts with SPARC64:
- Two similar SPARC64 CPUs with different frequencies can work together on the same platform (frequency mixing within a CPU generation is allowed). This feature is still missing in the latest Power7 (IBM) and Itanium 9300 (Intel).
- Asymmetric multicore systems (such as SPARC64 systems mixing CPU speeds) promise to use far less energy than conventional symmetric processors
- A new generation of SPARC64 is compatible with the previous one (generation mixing is allowed), and both can run together inside the same partition. This feature is still missing in the latest Power7 (IBM) and Itanium 9300 (Intel), which forces you to do a complete "refresh" of your system.
- Memory capacity is not tied to the number of sockets (SPARC64), as the memory controller is not embedded on the CPU (half the maximum CPU capacity with the full maximum memory capacity is allowed). This is not the case for other architectures like the Power7 (IBM) or the Itanium 9300 (Intel).
- The lifetime of SPARC64 CPUs is longer than that of INTEL CPUs, and SPARC64 technology has advanced RAS features not yet implemented in INTEL CPUs (read chapter 5 of this paper)
- The Scalable Processor ARChitecture (SPARC) ensures scalability up to 512 threads at the CPU (socket), core and thread levels. So yes, the CPU architecture matters, but the OS matters too. Check the AIX 6.1 release notes to find out (surprisingly) the maximum number of processor cores and logical processors (page 23).
- The Solaris thread library is somewhat more mature and uses third-generation function calls and structures. The Solaris constructs are lean and mean, well tuned to the underlying SPARC hardware, whereas the .NET mechanisms are more suited to the power provided by an Intel processor (as stated by Intel).
- Small but fully configured: for small systems (2/4 CPUs), INTEL platforms are ideal, even with Solaris, but they should be bought in full configurations to avoid upgrade or vertical-scaling problems, since the CPU/RAM may not remain available for more than two years. Implementing horizontal scaling for an application is certainly less expensive than vertical scaling at the hardware level; but when you add the cost of the software licences needed to "implement HA" on many small boxes, plus the manpower needed to maintain those small systems (provisioning, patching, monitoring, licensing, plus the indirect cost of keeping hardware configurations identical), you end up after a few years with many systems that are complex to keep in a stable state (unless you regularly refresh your whole installed base, hardware and software).
- Scalability at the CPU level: for bigger systems, where horizontal scaling is not a solution (it leads to many systems to manage/monitor/patch/upgrade) and where more than 8 sockets are needed (whatever the number of cores), SPARC (Scalable Processor ARChitecture) is a perfect platform, not only at the hardware level but also at the kernel level (Solaris), which knows specifically how to handle the threads whether they belong to a SPARC64 VI or VII CPU on the same system.
- Virtualization: INTEL platforms do not fit well in the virtualization world (no frequency/generation mixing), where hypervisors are expected to sit at the firmware level, hardware partitioning should be possible, and enough threads are expected to handle a large number of interrupts (network or I/O accesses) when many virtual machines run concurrently. Having hardware from one vendor and a hypervisor from a second vendor necessarily increases complexity and brings potential problems that are not so easy to solve (hardware or software problem?)
- Fast vs concurrent: having fast CPUs can help, but as long as memory accesses remain slow, the only way to improve the efficiency of a CPU (and keep it from sitting idle while waiting for memory) is to virtualize the CPU and create concurrent virtual processors (aka hardware threads) executed by a single core. SPARC CMT is an example: in 2005, 4 threads could already be executed by a single core (SPARC T1), giving a single 8-core CPU enough resources to handle a large number of I/O operations. As you may have noticed, even INTEL is now going in this direction, with a NEHALEM core able to run 2 SMT threads.
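The principle is easy to demonstrate in software. The sketch below uses plain Python threads with a sleep() standing in for a slow memory or I/O access; it is only an analogy for what CMT does in hardware, not SPARC code:

```python
# When a task spends most of its time waiting (sleep() here stands in for
# memory or I/O latency), running several tasks concurrently keeps the
# "core" busy instead of idle. Plain Python threading, used as an analogy
# for hardware multithreading.
import threading
import time

def io_bound_task():
    time.sleep(0.2)  # stand-in for a slow memory/I/O access

# Serial: 4 tasks, one after the other (~0.8s of pure waiting).
start = time.perf_counter()
for _ in range(4):
    io_bound_task()
serial = time.perf_counter() - start

# Concurrent: the 4 tasks overlap their waiting time (~0.2s total).
start = time.perf_counter()
threads = [threading.Thread(target=io_bound_task) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
concurrent = time.perf_counter() - start

print(f"serial={serial:.2f}s concurrent={concurrent:.2f}s")
```

The waiting time overlaps instead of accumulating, which is exactly why a core that can switch between several hardware threads wastes fewer cycles on memory stalls.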
- Compromises: in 2010, the Power7 (IBM) is finally able to run 4 threads per core. And as the number of threads per core increases, you will see the frequency decrease, along with some "compromises" on the internal cache sizes. Example: the X5272 @3.4 GHz (2 cores, 6MB of L2, no L3, 80W) becomes an X5492 @3.4 GHz (4 cores, 12MB of L2, but 150W). Now with NEHALEM, a new architecture, an L3 cache shared by the cores has been introduced and the L2 is reduced to 256KB per core: a W5580 @3.2 GHz has 4 cores, 8 threads, 1MB of L2, 8MB of L3, 130W. So adding 2 threads per core on NEHALEM led to adding L3, reducing L2, and reducing the frequency from 3.40 GHz to 3.20 GHz. Let's see what happens when INTEL integrates 4 threads per core. The same compromise applies to the Power7 when compared to the Power6.
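You can line up the figures quoted above in a rough Python sketch (the "aggregate GHz" metric below is naive and deliberately ignores the fact that SMT threads share a core's pipelines and caches):

```python
# Back-of-the-envelope view of the trade-off: more threads per core tends
# to come with lower clocks and reshuffled caches, so compare aggregate
# thread capacity, not just GHz. Specs are the ones quoted in the text.
parts = {
    "X5272": {"cores": 2, "threads_per_core": 1, "ghz": 3.4, "watts": 80},
    "X5492": {"cores": 4, "threads_per_core": 1, "ghz": 3.4, "watts": 150},
    "W5580": {"cores": 4, "threads_per_core": 2, "ghz": 3.2, "watts": 130},
}

for name, p in parts.items():
    hw_threads = p["cores"] * p["threads_per_core"]
    # Naive "aggregate GHz": hardware threads x clock. SMT threads share
    # a core, so this overstates real throughput, but shows the trend.
    agg = hw_threads * p["ghz"]
    print(f"{name}: {hw_threads} hw threads, {agg:.1f} aggregate GHz, {p['watts']}W")
```

Even with a 0.2 GHz lower clock, the W5580 exposes four times the hardware threads of the X5272 at only 50W more, which is the whole point of the compromise.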
- Confidence: I was listening to a webcast called "Secure Your Future Now: Run Your Business Critical Applications On Industry Standard Platforms" (HP+Microsoft+Intel), and at the end I asked myself the following: should I feel safe running an SAP ERP on the Windows OS, on HP systems powered by INTEL CPUs, or should I run my ERP on an optimized, integrated, single-shop-supported solution, especially when it comes to integration, evolution and support ...
Keep in mind:
So, before choosing between INTEL or SPARC, think about:
- Business needs (what do I really need today and tomorrow)
- Budget (acquisition, integration, and operational costs)
- Evolution (when will I have to evolve, and how)
- Three-year strategy: do more with less (virtualization), be more flexible (use your actual assets). Choose the technology that will allow you to benefit, in the future, from the next generation of hardware components without compromising your current assets.
- Systems architecture: there is no universal CPU architecture (RISC or CISC) that fits data-oriented workloads (single-thread performance and huge internal caches), fits network- or I/O-oriented workloads (multithreaded), scales to many CPUs in a linear way (efficient system interconnect), supports virtualization (like CMT), ensures binary compatibility between generations (like SPARC) and is cheap ... You have to choose the right CPU architecture per application profile.
Well, common sense as usual:
- Virtualize where you can (avoid system proliferation) to stay independent of the hardware architecture; this will help you be more flexible. Hardware Partitions (on SPARC64) and LDoms (SPARC CMT) keep you independent of the CPU version/frequency and allow you to run many "systems" on the same hardware.
- Use INTEL when you have no choice due to single-thread performance issues, but with no vertical-scalability goals; Solaris guarantees compatibility across both (binary within an architecture, source across SPARC and x86). If vertical scaling is a goal, go for SPARC64
- Keep the number of kernel versions as low as possible (for example with Solaris Containers/LDoms technologies); this will help you upgrade faster and keep a consistent data center
- Keep the number of software layers and vendors in your applications as low as possible (increasing them also increases the complexity when you have to upgrade layer Li to version V+1 without impacting layer Li+1 ...); this will help you evolve smartly, simplify troubleshooting, and avoid dealing with many software support centers at the same time
- Prefer a low-level hypervisor (firmware level) if you need virtualization: you'll reduce the overhead, the complexity and the virtualization licence costs
- Always think about having a High-Availability-aware solution. If not, maybe virtualization can help you achieve Reasonable Availability
Just some references used for this blog:
- The SPARC architecture Manual V9 and the SPARC64 VII extensions
- The OpenSPARC (T1/T2) Internals
- The Intel Tick-Tock strategy and the latest Intel road-map (2010/Q1)
- The Intel XEON 5500 (CISC) and the Intel Itanium 9300 Datasheets
- Maximizing Power Efficiency with Asymmetric Multicore Systems
- The Solaris OS and the Intel Nehalem
- Multithreading under Solaris and Microsoft .Net
- AIX 6.1 Release notes