Monday Nov 14, 2011

Latency Matters

A lot of interest in low latency has been expressed within the financial services segment, especially in stock trading applications, where every millisecond directly influences a trader's profitability. These days, much of the trading is executed by software applications designed to respond to each other almost instantaneously. In fact, you could say that we are in an arms race, where traders use any and all options to cut down the delay in executing transactions, even moving physically closer to the trading venue.

The Solaris OS network stack has traditionally been engineered for high throughput, at the expense of higher latencies. Knowing which tuning parameters redress this imbalance is critical for latency-sensitive applications. In this blog, we present how to further configure a default Oracle Solaris 10 installation to reduce network latency.

[Read More]

Tuesday Sep 27, 2011

Talend's new data processing engine on Sun Blade X6270

Having the chance to test the brand new Sun Blade X6270 server, based on the Intel Xeon 5500 series processors, I asked one of our ISV partners, Talend, an open-source ETL (Extract, Transform & Load) solution provider, whether they were willing to do some benchmarking with me.

The timing was perfect, since Talend had just rewritten parts of their ETL engine, to be included in the upcoming version, in order to make better use of modern CPUs' multi-threading capabilities.

During development they had benchmarked their application on a two-socket Xeon 5320 system, and were very interested in seeing how the new Intel Xeon 5500 would perform.

Test descriptions

We used DBGEN v2.8.0, a database population program that generates files to be loaded into database tables. In our case we generate moderately to very large files and process them directly as simple flat files (no database system involved). We will only be using the file called “lineitem.tbl”, which represents a list of order item lines with the following structure:

DBGen Structure

For each benchmark run we perform three tests, each applying a different type of processing on the file:

  • Sort:
    We sort the entire file by date, on the 11th column (L_SHIPDATE: shown above in red).

  • Count:
    We count the number of order lines by shipment mode (L_SHIPMODE: the blue column above) and by year of the shipment date (L_SHIPDATE: shown above in bold red).

  • Average:
    We compute the average discount (L_DISCOUNT) for each item (L_PARTKEY).
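The three tests above can be sketched as straightforward flat-file processing. This is an illustrative sketch, not Talend's engine: it assumes the standard '|'-delimited TPC-H lineitem layout (L_PARTKEY is the 2nd field, L_DISCOUNT the 7th, L_SHIPDATE the 11th, L_SHIPMODE the 15th), and the function names are ours.

```python
from collections import defaultdict

# 0-based column indices in the '|'-delimited lineitem.tbl
# (per the standard TPC-H lineitem layout).
L_PARTKEY, L_DISCOUNT, L_SHIPDATE, L_SHIPMODE = 1, 6, 10, 14

def sort_by_shipdate(path, out_path):
    """Sort test: order the whole file by L_SHIPDATE (in-memory sketch)."""
    with open(path) as f:
        rows = [line.rstrip("\n").split("|") for line in f]
    rows.sort(key=lambda r: r[L_SHIPDATE])
    with open(out_path, "w") as out:
        for r in rows:
            out.write("|".join(r) + "\n")

def count_by_mode_and_year(path):
    """Count test: order lines per (L_SHIPMODE, year of L_SHIPDATE)."""
    counts = defaultdict(int)
    with open(path) as f:
        for line in f:
            r = line.split("|")
            counts[(r[L_SHIPMODE], r[L_SHIPDATE][:4])] += 1
    return dict(counts)

def average_discount_per_part(path):
    """Average test: mean L_DISCOUNT for each L_PARTKEY."""
    total, n = defaultdict(float), defaultdict(int)
    with open(path) as f:
        for line in f:
            r = line.split("|")
            total[r[L_PARTKEY]] += float(r[L_DISCOUNT])
            n[r[L_PARTKEY]] += 1
    return {k: total[k] / n[k] for k in total}
```

Note that the in-memory sort only works for the smaller files; at the sizes tested here the real engine must spill to disk.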

DBGEN uses a scaling factor representing the total size of all the tables generated. For this test we only use the file named “lineitem.tbl”. The table below shows the size and number of lines of the “lineitem.tbl” file for each scaling factor.

As you can see, we start quite small, processing a file with (only!) 6 million lines, and go all the way up to processing 3.3 billion lines in a single file.


Number of entries    File size
6 Million            740 MB
60 Million           7.4 GB
600 Million          74 GB
1.8 Billion          225 GB
3.3 Billion          415 GB

Hardware Configurations

The following table shows the hardware configurations used for the tests: the Sun Blade X6270 (referred to as X6270), and the vanilla Xeon-based box used by Talend (referred to as Bi-Xeon).





X6270:

  • CPU: 2 x Intel Xeon 5520 quad-core, with HyperThreading & Turbo mode on (2.26 GHz)

  • Internal storage: 1 x 136 GB disk at 15K rpm

  • External storage: 3 volumes of 4 disks using RAID 0 (striping), 544 GB each, with a ZFS pool on each volume

  • Operating system: Solaris 10 update 6 (aka 10/08)

Bi-Xeon:

  • CPU: 2 x Intel Xeon 5320 quad-core (1.86 GHz)

  • Memory: 4 GB DDR2

  • Internal storage: 3 x 250 GB and 2 x 320 GB Seagate disks at 7200 rpm (all on ext3): 1 x 250 GB for system and temporary files, 1 x 320 GB for input files, 1 x 320 GB for output files

  • External storage: none

  • Operating system: Debian GNU/Linux Etch with Linux 2.6.18 (i686)

With respect to the CPU, the X6270 configuration is obviously much more powerful, especially given the amount of RAM and the external storage. However, the tests proved to be more CPU- and IO-bound than memory-bound. Even if the amount of memory obviously does make a difference, the tests give us some indication of the extra performance brought by the Xeon 5500.

In order to get closer to the Bi-Xeon configuration, we also ran two sets of tests on the X6270: with the external storage (referred to as X6270-Ext) and without it (referred to as X6270-Int).

In the second case, we are even in a less favorable position than the Bi-Xeon, which uses three data disks vs. a single disk for the X6270.


The table below presents the final results of the tests on the three configurations. It's interesting to note a couple of things:

  • When processing a file, at least three times its size in free disk space is needed. For this reason, we could only process the 7.4 GB file on the X6270-Int configuration (a single internal 136 GB disk in the server).

  • Given the much higher processing time needed on the Bi-Xeon, we didn't even try going beyond 74 GB.

  • We pushed the X6270-Ext up to processing the 415 GB file, and could reasonably have gone all the way to 1 TB had we not been limited by disk space.
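The 3x disk-space requirement follows from how a file larger than RAM is sorted: during an external merge sort, the input file, the temporary sorted runs, and the merged output all coexist on disk. A minimal sketch of the technique (ours, not Talend's engine; the column index assumes the '|'-delimited lineitem format):

```python
import heapq, os, tempfile

def external_sort(path, out_path, key_index=10, chunk_lines=100_000):
    """Sort a '|'-delimited flat file by one column, spilling to disk.

    Phase 1: read the input in chunks, sort each chunk in memory,
    and write it out as a temporary sorted run.
    Phase 2: k-way merge all runs into the final output.
    While this runs, input + runs + output occupy roughly 3x the
    input size on disk.
    """
    runs = []
    with open(path) as f:
        while True:
            chunk = [f.readline() for _ in range(chunk_lines)]
            chunk = [l for l in chunk if l]  # drop EOF blanks
            if not chunk:
                break
            chunk.sort(key=lambda l: l.split("|")[key_index])
            fd, run_path = tempfile.mkstemp(suffix=".run")
            with os.fdopen(fd, "w") as run:
                run.writelines(chunk)
            runs.append(run_path)
    files = [open(r) for r in runs]
    with open(out_path, "w") as out:
        # heapq.merge streams the runs, keeping only one line per run in memory
        out.writelines(heapq.merge(*files, key=lambda l: l.split("|")[key_index]))
    for f in files:
        f.close()
    for r in runs:
        os.remove(r)
```

With hundreds of sorted runs, the merge phase becomes almost purely IO-bound, which is consistent with the large gains the external storage shows below.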

Result Table


On the CPU-bound test (the Average test), we can clearly see a 32% to 60% performance boost on the new Intel Xeon 5500 compared to the older generation, depending on the size of the file.

Of course the processor matters, and we saw that it has a great impact on the more CPU-bound processing. But what we can also see, and that's not new, is that data-hungry processors need to be fed with data, good and fast. In that respect, the speed of the IO subsystem is very important. Working with files over 400 GB obviously puts a lot of pressure on the IO, and plugging in a professional external storage device makes a huge difference (in our case, anyway).

As you can see on the Sort test (scale 10), we get a 290% boost with the Intel Xeon 5500. Once we use the external storage, that performance skyrockets to 1075% (more than 10x the performance)!

We could of course spend a long time analyzing all the figures across the different file sizes, but even without pushing the analysis very far, the performance gain from this new processor alone is plain to see, not to mention when we also take care of the IO subsystem.

Intel Xeon 5500-based Sun servers, such as the Sun Blade X6270 we just tested, enhanced with an external storage device such as the Sun StorageTek 2540, seem to be a killer combination for large data processing.

Wednesday Mar 23, 2011

Traffix scales on Solaris Sparc

Traffix Systems is leading the control plane market, with a range of next-generation network Diameter products and solutions --Diameter is an authentication, authorization and accounting protocol for telco networks, and a successor to RADIUS.

The amount of Diameter signaling in LTE and 4G networks is unlike anything telecom operators have seen or been confronted with in the past. It is estimated that there will be up to 25x more signaling per subscriber compared to legacy and IN networks. As a result, network operators moving to LTE are finding it progressively more difficult to manage their core network architecture and Diameter signaling, as it becomes increasingly complex to maintain, manage and scale.

With these challenges in mind, and as part of the on-going engineering collaboration between Traffix Systems and Oracle's ISV Engineering, we investigated which Oracle technologies could help decrease and manage the complexity. The first thing we looked at was the SPARC Enterprise T-Series systems…

[Read More]

Tuesday Mar 01, 2011

Talend Integration Suite optimized on Solaris

Continuing in the spirit of the Tunathon program --an innovative engineering program to study and tune application performance on Solaris, run at Sun Microsystems in the early 2000s--, we at ISV Engineering still run "Tunathon" projects with our partners today, i.e. tuning their applications on Solaris --we have about 5 in flight right now. Tunathon efforts are in fact more and more relevant as computers become more complex, scalable and heterogeneous --think e.g. of a 4-socket quad-core dual-thread system with extra GPU engines. Developers have the impossible job of releasing new business logic in their code, faster and faster, while keeping it fully optimized and scalable on systems they never get their hands on to test scalability in the first place. And the programming frameworks, good for developer productivity and code quality, come as additional layers that can make debugging and optimization a real nightmare.

Recently, Talend, a fast growing ISV positioned by Gartner in the “Visionaries” quadrant of the “Magic Quadrant for Data Integration Tools”, contacted us to report a serious performance issue at one of their customers, a large bank, using the Talend Integration Suite application on a large 32-way quad-core SPARC M-Series server. Although fully multi-threaded, the software simply did not scale on such a large system. We got on it right away, set up a 128-thread Sun T5140 system in our Lab to reproduce the problem, and took a closer look at the Java code…

[Read More]

Monday Feb 14, 2011

Performance monitoring using hardware counters: Releasing HAR 2.1

Thanks to the contribution of Claude Teissedre here at Oracle ISV Engineering, we are happy to announce the 2.1 release of the Hardware Activity Reporter (HAR) performance monitoring tool, featuring support for the SPARC T3 processor. Both SPARC and x86 binaries of HAR 2.1 are free for download at

The HAR 2.x source code continues to be available under the CDDL 1.0 license on Project Kenai.




