Niagara, Bioinformatics and a 32-way BLAST trace
By sprack on Mar 13, 2006
As I touched on briefly in my last entry, we have looked at the scaling of a variety of different bioinformatics applications on Niagara (T1). We have seen great scaling, whether you consider scaling from a single 1-thread job to a single 32-thread job or to 32 1-thread jobs in parallel. I have some plots that show this nicely, and I will try to post them here in the next couple of days.
Following from this, I just wanted to quickly mention the availability of a 32-way BLAST trace. The trace leverages Sun's RST format (more details here) and can be downloaded via:
The trace is for a protein query (1,887 letters), using the blastp program, against a Non-Redundant Protein database (2,244,936 sequences; 757,978,433 total letters). It is a pretty substantial trace, and it should be useful for investigating the various pressures a key bioinformatics application places on a CMT processor (how do the various threads interact in the shared L2 cache, what are the off-chip bandwidth requirements, how effectively do the threads co-exist on a VT core, etc.). If time permits, I will try to post some of these details in the coming days.
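To put those database numbers in perspective, here is a quick back-of-the-envelope sketch in Python. The sequence and letter counts are the ones quoted above; everything else is an assumption for illustration only: the one-byte-per-letter storage cost, the even 32-way split of the database, and the helper names themselves. It is not a description of how BLAST actually partitions its work.

```python
# Figures quoted in the post for the query and the NR protein database.
QUERY_LETTERS = 1_887
DB_SEQUENCES = 2_244_936
DB_LETTERS = 757_978_433
THREADS = 32  # the 32 threads of the trace

# Assumption (not from the post): each residue occupies one byte in memory.
BYTES_PER_LETTER = 1


def average_sequence_length(letters, sequences):
    """Mean number of letters per database sequence."""
    return letters / sequences


def letters_per_thread(letters, threads):
    """Letters each thread would scan if the database were split evenly
    (hypothetical split; rounded up so every letter is covered)."""
    return -(-letters // threads)  # ceiling division


def database_mib(letters, bytes_per_letter=BYTES_PER_LETTER):
    """Approximate raw database size in MiB under the storage assumption."""
    return letters * bytes_per_letter / 2**20


if __name__ == "__main__":
    avg = average_sequence_length(DB_LETTERS, DB_SEQUENCES)
    print(f"query length:            {QUERY_LETTERS} letters")
    print(f"average sequence length: {avg:.1f} letters")
    print(f"letters per thread:      {letters_per_thread(DB_LETTERS, THREADS):,}")
    print(f"raw database size:       {database_mib(DB_LETTERS):.1f} MiB")
```

Under these assumptions the raw sequence data alone runs to several hundred MiB, far more than any on-chip cache can hold, which is why the off-chip bandwidth question above is an interesting one to ask of the trace.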
While BLAST scaling has been examined before, this has typically been in the context of MP systems, where each thread has its own L1 and L2 caches and inter-thread communication can be costly. With the introduction of CMT processors, it may be that we could achieve significant improvements in how BLAST is coded by taking into account that multiple threads now share the same L2 cache (and, in the case of Niagara, even the L1 caches are shared between 4 threads). Just a thought.