Oracle Solaris Studio Secret Sauce: Ferociously Tuned and Parallel Scientific Libraries
By vtatkar on Jun 28, 2010
One of the lesser-known "secret sauces" of Oracle Solaris Studio is perhaps one of its easiest-to-use and highest-performance components: the Sun Performance Library (what we commonly call Perflib). Sun Performance Library is a set of optimized, high-speed mathematical subroutines for solving linear algebra and other numerically intensive problems. It is based on a collection of public domain applications available from Netlib. Sun has enhanced these public domain applications and bundled them as the Sun Performance Library. Sun tunes each routine for the underlying hardware and parallelizes the routines to take advantage of multiple cores.
If words like BLAS, LAPACK, FFTPACK, SuperLU, ScaLAPACK, SparseBLAS and SPSOLVE get you excited or at least curious, read on. For the rest of you, there are only a few headlines I'd like you to remember:
- Sun Performance Library comes optimized for every Sun HW platform. This means there are optimized versions for the V8, sparcvis, sparcvis2, and sparcfmaf architectures on the SPARC side, and there are also optimized versions for x86/x64 architectures: AMD/Opteron, AMD/Barcelona and Intel/Xeon.
- Sun Performance Library works on Solaris SPARC, Solaris x86/x64, Oracle Enterprise Linux (OEL), Red Hat and SuSE.
- These highly optimized versions are hand-tuned. That means linking against these routines gives you the tuned code path for each supported HW platform you could be running, with no tuning effort on your part.
- The routines are parallelized, so code that calls them can scale across multiple cores on newer machines without you having to parallelize it by hand (a very tedious task, in most cases).
For the die-hards who want to know more, here is a classification of the kind of Linear Algebra and Numerical solvers that are part of Perflib:
- Elementary vector and matrix operations - Vector and matrix products; plane rotations; 1-, 2-, and infinity-norms; rank-1, rank-2, rank-k, and rank-2k updates
- Linear systems - Solve full-rank systems, compute error bounds, solve Sylvester equations, refine a computed solution, equilibrate a coefficient matrix
- Least squares - Full-rank, generalized linear regression, rank-deficient, linear equality constrained
- Eigenproblems - Eigenvalues, generalized eigenvalues, eigenvectors, generalized eigenvectors, Schur vectors, generalized Schur vectors
- Matrix factorizations or decompositions - SVD, generalized SVD, QL and LQ, QR and RQ, Cholesky, LU, Schur
- Support operations - Condition number, in-place or out-of-place transpose, inverse, determinant, inertia
- Sparse matrices - Solve symmetric, structurally symmetric, and unsymmetric coefficient matrices using direct methods and a choice of fill-reducing ordering algorithms, and user-specified orderings
- Convolution and correlation in one and two dimensions
- Fast Fourier transforms, Fourier synthesis, cosine and quarter-wave cosine transforms, sine and quarter-wave sine transforms
- Complex vector FFTs and FFTs in two and three dimensions
- Interval BLAS routines
- Sorting operations
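To make a few of the terms in the first bullet concrete, here is a minimal pure-Python sketch of what a matrix 1-norm, an infinity-norm and a rank-1 update compute. The function names are made up for illustration and are not Perflib entry points; the tuned library routines do the same arithmetic, just far faster and in parallel.

```python
# Reference semantics only: the names below are hypothetical,
# not routines exported by Sun Performance Library.

def norm_1(a):
    """Matrix 1-norm: the largest absolute column sum."""
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))

def norm_inf(a):
    """Matrix infinity-norm: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in a)

def rank1_update(alpha, x, y, a):
    """Return A + alpha * x * y^T (the operation BLAS calls GER)."""
    return [[a[i][j] + alpha * x[i] * y[j] for j in range(len(y))]
            for i in range(len(x))]

A = [[1.0, -2.0],
     [3.0,  4.0]]

print(norm_1(A))    # absolute column sums are 4 and 6
print(norm_inf(A))  # absolute row sums are 3 and 7
print(rank1_update(2.0, [1.0, 0.0], [1.0, 1.0], A))
```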
Taking full advantage of the increased accuracy, performance and parallelism of these routines often requires code changes. However, in many cases, such changes result in more readable code as well (here is a good example of that). The performance improvements are often dramatic, and well worth the time it takes to rewrite code around these routines.
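As a rough illustration of the readability point, the sketch below contrasts an open-coded triple loop with a single call that states the operation by name. The `gemm` helper is a hypothetical pure-Python stand-in modeled on the semantics of the BLAS DGEMM operation (C := alpha*A*B + beta*C, ignoring the transpose options); it is not the Perflib interface, and in real code the call would go to the library's tuned routine from Fortran or C.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (A times B) + beta * C for matrices stored as lists of rows.

    Hypothetical stand-in with DGEMM-like semantics, for illustration only.
    """
    m, k, n = len(a), len(b), len(b[0])
    return [[alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
             for j in range(n)]
            for i in range(m)]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[1.0, 1.0], [1.0, 1.0]]

# Before: the intent is buried in three nested loops.
before = [row[:] for row in C]
for i in range(2):
    for j in range(2):
        acc = 0.0
        for p in range(2):
            acc += A[i][p] * B[p][j]
        before[i][j] = 2.0 * acc + 0.5 * before[i][j]

# After: one call that names the operation.
after = gemm(2.0, A, B, 0.5, C)

print(before == after)
```

The loop version and the call compute the same result; the difference is that the second form hands the whole operation to one routine, which is exactly the spot where a tuned, parallel implementation can be substituted.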
Want to know more? There are several places you can look:
- A recently published paper on the developer.sun.com site looks at Perflib in great depth for uniprocessor, multicore and distributed systems and includes code examples illustrating best practices. This paper is a MUST-READ for developers interested in Linear Algebra and Numerical Solvers.
- You can read all the details about Performance Library in the User Guide documentation.
- Paul Hinker's Weblog talks about many interesting aspects of Sun Perflib.
- You can get a lot more information about BLAS, LAPACK and Open Source Numerical solvers from these Netlib locations (common entry point for all).