The Hare and the Tortoise [X6250 vs T6320] or [INTEL Xeon E5410 vs SUN UltraSPARC-T2]



"To win a race the swiftness of a dart ... Availeth not without a timely start"

[Image: Le Lièvre et la Tortue]
 

The tree on yonder hill we spy [Sun Blade 6000 Modular Systems]
The Sun Blade 6000 chassis supports up to ten blades in ten rack units and is extremely popular due to its versatility. In fact, you can test your application today on four different chips within the same chassis: UltraSPARC-T1 [T6300], UltraSPARC-T2 [T6320], AMD Opteron dual-core [X6220], and INTEL Xeon dual-core and quad-core [X6250]. While the performance characteristics of the Opteron and T1 blades are well defined by now, I was really curious to see how the new T2 blade would perform compared to the quad-core Xeon.

A grain or two of hellebore [Chips & Systems]
In terms of chip details, the T2 and the Xeon diverge. The three key differences are the total number of strands per chip [16 times more for the T2], the CPU frequency [1.66 times higher for the Xeon] and the L2 cache size [3 times larger for the Xeon].

This simple table illustrates their key characteristics:

Feature           INTEL Xeon E5410      SUN UltraSPARC-T2
Process           45 nm                 65 nm
Transistors       820 million           500 million
Cores             4                     8
Strands/core      1                     8
Total #strands    4                     64
Frequency         2.33 GHz              1.4 GHz
L1 cache          16 KB I + 16 KB D     16 KB I + 8 KB D
L2 cache          12 MB                 4 MB
Nominal power     80 W                  95 W

This table makes it clear that predicting the response time or throughput delta between these two chips is a risky endeavor!

[Pictures: X6250 blade and T6320 blade]

Following these two pictures [X6250 and T6320], here is our hardware list:

Role          Model     System clock    Sockets@freq    RAM
T2 blade      T6320     N/A             1@1.4 GHz       32 GB
Xeon blade    X6250     1333 MHz        2@2.33 GHz      32 GB
Console       X4200     1000 MHz        2@2.4 GHz       8 GB
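Combining the chip table with the hardware list gives the number of hardware threads each blade offers, which is worth keeping in mind when reading the results. Here is a minimal sketch of that arithmetic; the class and method names are mine, and only the socket, core and strand counts come from the two tables above.

```java
public class BladeThreads {
    /** Hardware threads per blade = sockets x cores per socket x strands per core. */
    static int hardwareThreads(int sockets, int coresPerSocket, int strandsPerCore) {
        return sockets * coresPerSocket * strandsPerCore;
    }

    public static void main(String[] args) {
        int x6250 = hardwareThreads(2, 4, 1); // two Xeon E5410 sockets
        int t6320 = hardwareThreads(1, 8, 8); // one UltraSPARC-T2 socket
        System.out.println("X6250 hardware threads: " + x6250); // prints 8
        System.out.println("T6320 hardware threads: " + t6320); // prints 64
        System.out.println("T6320 / X6250: " + (t6320 / x6250)); // prints 8
    }
}
```

At the blade level the strand ratio is therefore 8 rather than the 16 quoted per chip, because the X6250 carries two sockets.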


I dare you to the wager still [Benchmarks]
I ran several benchmarks (including Oracle workloads) on all types of blades, but for the purpose of this article I will present only two simple micro-benchmarks: iGenCPU and iGenRAM.

The iGenCPU benchmark is a Java-based CPU micro-benchmark used to compare the CPU performance of different systems. Based on a customized Java complex number library, the code computes Benoit Mandelbrot's highly dense fractal structure using integer and floating-point calculations (50%/50%). The simplicity of the code and the absence of recursion allow very scalable behavior, using less than 128 KB of memory per thread. The exact throughput in fractals per second and the average response times are reported and coalesced for each scalability level.
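To make the shape of such a workload concrete, here is a rough sketch of a multi-threaded Mandelbrot throughput test. It is not the iGenCPU source: the class `MandelbrotSketch`, its tiny `Complex` type, the grid size and the iteration cap are all my own assumptions, and the 50%/50% integer/floating-point mix of the real benchmark is not reproduced. It uses lambdas, so it needs Java 8 or later rather than the Java 6 of the original tests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MandelbrotSketch {
    /** Minimal immutable complex number; a stand-in for the article's custom library. */
    static final class Complex {
        final double re, im;
        Complex(double re, double im) { this.re = re; this.im = im; }
        Complex plus(Complex o) { return new Complex(re + o.re, im + o.im); }
        Complex times(Complex o) { return new Complex(re * o.re - im * o.im, re * o.im + im * o.re); }
        double normSquared() { return re * re + im * im; }
    }

    /** Iterations of z = z*z + c performed before |z| exceeds 2, capped at maxIter. No recursion. */
    static int escapeCount(Complex c, int maxIter) {
        Complex z = new Complex(0, 0);
        int i = 0;
        while (i < maxIter && z.normSquared() <= 4.0) {
            z = z.times(z).plus(c);
            i++;
        }
        return i;
    }

    /** Computes one size x size fractal over [-2,1] x [-1.5,1.5]; returns the number of points in the set. */
    static int computeFractal(int size, int maxIter) {
        int inside = 0;
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                Complex c = new Complex(-2.0 + 3.0 * x / size, -1.5 + 3.0 * y / size);
                if (escapeCount(c, maxIter) == maxIter) inside++;
            }
        }
        return inside;
    }

    /** Runs `threads` workers, each computing `perThread` fractals; returns fractals per second. */
    static double throughput(int threads, int perThread, int size, int maxIter) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Integer>> results = new ArrayList<>();
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            results.add(pool.submit(() -> {
                int last = 0;
                for (int n = 0; n < perThread; n++) last = computeFractal(size, maxIter);
                return last;
            }));
        }
        for (Future<Integer> f : results) f.get(); // wait for every worker to finish
        double seconds = (System.nanoTime() - start) / 1e9;
        pool.shutdown();
        return threads * perThread / seconds;
    }

    public static void main(String[] args) throws Exception {
        // One line per "scalability level" (thread count).
        for (int threads : new int[] {1, 2, 4, 8}) {
            System.out.printf("%d threads: %.1f fractals/s%n", threads, throughput(threads, 20, 100, 200));
        }
    }
}
```

Each worker only touches a handful of short-lived `Complex` objects, so the per-thread memory footprint stays small and the run is dominated by CPU work, which is the property the benchmark relies on.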

The iGenRAM benchmark is based on the California lotto requirements. The main purpose of this workload is to measure multi-threaded memory allocation and multi-threaded memory searches in Java. In the first step, each thread allocates 512 megabytes of memory in a 3-dimensional integer array. In the second step, each thread searches through this memory to determine the winning tickets. The exact throughput in lotto tickets per millisecond and the average allocation and search times are reported and coalesced for each scalability level.
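The two steps can be sketched as follows. This is only an illustration under my own assumptions, not the iGenRAM code: the class `LottoSketch`, the `[blocks][tickets][picks]` layout, the six picks in the range 1..47, the random fill and the winning draw are all hypothetical, and the arrays are far smaller than 512 MB so the example runs with a default heap.

```java
import java.util.Arrays;
import java.util.Random;

public class LottoSketch {
    /** Step 1: allocate a [blocks][tickets][picks] array and fill it with picks in 1..maxNumber. */
    static int[][][] allocate(int blocks, int tickets, int picks, int maxNumber, long seed) {
        Random rnd = new Random(seed);
        int[][][] data = new int[blocks][tickets][picks];
        for (int[][] block : data)
            for (int[] ticket : block)
                for (int i = 0; i < picks; i++) ticket[i] = 1 + rnd.nextInt(maxNumber);
        return data;
    }

    /** Step 2: count the tickets whose picks match the winning draw, position by position. */
    static long countWinners(int[][][] data, int[] winning) {
        long winners = 0;
        for (int[][] block : data)
            for (int[] ticket : block)
                if (Arrays.equals(ticket, winning)) winners++;
        return winners;
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = 4;                        // one "scalability level"
        int[] winning = {3, 14, 15, 9, 26, 5};  // hypothetical draw
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final long seed = t;
            workers[t] = new Thread(() -> {
                long t0 = System.nanoTime();
                int[][][] data = allocate(16, 50_000, 6, 47, seed);
                long t1 = System.nanoTime();
                long winners = countWinners(data, winning);
                long t2 = System.nanoTime();
                System.out.printf("worker %d: allocate+fill %d ms, search %d ms, winners %d%n",
                        seed, (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, winners);
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();
    }
}
```

Because every thread allocates and scans its own array, the workload stresses the allocator and the memory subsystem far more than the cores, which is why it complements the CPU-bound test.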

 For this test, we used Solaris 10 Update 4 and Java version 1.6.1.

And list which way the zephyr blows [Results]

Here are the iGenCPU throughput and response time results:

[Chart: iGenCPU throughput and response time, X6250 vs T6320]

Notes:

1-The Hare [X6250] starts very fast but gets tired at 8 threads and really slows down at 12 threads.
2-The Tortoise [T6320] reaches more than twice the throughput of the Hare at 60 threads.
3-The single-threaded average transaction response time is two times better on the Hare.

Now let's look at the iGenRAM results:

[Chart: iGenRAM throughput, X6250 vs T6320]


Notes:

1-The Hare [X6250] shows phenomenal memory throughput at low thread counts. At peak, however, the Tortoise [T6320] achieves 11% more throughput.
2-When the Hare gives up (~7 threads), the Tortoise is just warming up, reaching its peak throughput at about 40 threads.
3-Single-threaded, it takes 9 ms to allocate 512 MB on the Hare and 33 ms to do the same thing on the Tortoise.
4-Single-threaded, it takes 5 ms to search through 512 MB on the Hare and 34 ms to do the same thing on the Tortoise.


Conclusion

The race is by the tortoise won.
Cries she, "My senses do I lack ?
What boots your boasted swiftness now ?
You're beat ! and yet you must allow,
I bore my house upon my back."

See you next time in the wonderful world of benchmarking...
Special thanks to Mr Jean de La Fontaine [1621-1695]

