CMT and software scaling -- Ease of Scalability
By sprack on Aug 16, 2007
Most key commercial applications (including databases, webservers and application servers) have been carefully optimized over the years to scale on traditional MP systems, but that optimization can be a time-consuming and costly process. Poor scalability is frequently caused by software design issues, with hot locks and data sharing (both real and false) being common culprits. Traditionally, dealing with these problems has required detailed knowledge of both the application and the target system.
On CMP systems such as UltraSPARC T1 (Niagara 1), threads share a common L2 cache, so these problems have a much smaller impact on scalability. For instance, consider the code fragment in Fig. (i). In this example, each thread processes a separate array, accumulates a local total and then updates the global accumulation total. To ensure that multiple threads cannot update the global total at the same time, the update is protected by a mutual exclusion lock.
Figure (ii) shows aggregate throughput as the number of threads is increased, for two systems: an 8-core UltraSPARC T1 CMP system and a traditional 8-processor UltraSPARC SMP system. As expected, performance on the traditional SMP system scales poorly as threads are added, due to the overhead of continually migrating the lock between processors. This sharing of the hot lock across multiple processors is clearly problematic. In contrast, on the UltraSPARC T1 CMP system, throughput scales almost linearly as the number of worker threads is increased. This is to be expected, as the lock is retained in the T1's shared L2 cache for the duration of processing. While the example code is very simple (and the array size small), it is apparent that these problems can still have a noticeable impact on application scaling for more complex codes.
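The pattern described for Fig. (i) might look roughly like the following sketch in C with POSIX threads. This is not the original code from the paper: the names (worker, run_workers, total_lock), the array length, the repeat count and the fill value are all assumptions chosen for illustration. Each thread repeatedly sums its own array into a local total and then adds that total to a shared global total under a mutex, which is what makes the lock hot.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64     /* assumed upper bound, for illustration only */
#define ARRAY_LEN   1024   /* assumed "small" array size; the post gives no value */
#define REPEATS     1000   /* assumed number of passes per thread */

static long global_total;
static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;
static int data[MAX_THREADS][ARRAY_LEN];   /* one private array per thread */

/* Worker: sum a private array, then fold the result into the shared total. */
static void *worker(void *arg)
{
    const int *array = arg;

    for (int r = 0; r < REPEATS; r++) {
        long local_total = 0;

        for (int i = 0; i < ARRAY_LEN; i++)
            local_total += array[i];

        /* The hot lock: every thread contends for it once per pass. */
        pthread_mutex_lock(&total_lock);
        global_total += local_total;
        pthread_mutex_unlock(&total_lock);
    }
    return NULL;
}

/* Hypothetical driver: runs nthreads workers, returns the total (or -1). */
long run_workers(int nthreads)
{
    pthread_t tids[MAX_THREADS];
    int created = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    global_total = 0;

    /* Fill every array with 1s so the expected total is easy to check. */
    for (int t = 0; t < nthreads; t++)
        for (int i = 0; i < ARRAY_LEN; i++)
            data[t][i] = 1;

    for (; created < nthreads; created++)
        if (pthread_create(&tids[created], NULL, worker, data[created]) != 0)
            break;

    /* Join whatever was started, even if a create call failed. */
    for (int t = 0; t < created; t++)
        pthread_join(tids[t], NULL);

    return created == nthreads ? global_total : -1;
}
```

Calling run_workers with increasing thread counts and timing each call would give a rough throughput curve of the kind the figure describes, although the absolute numbers depend entirely on the machine and on the assumed sizes above.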
The impact of common scalability problems can therefore be much less pronounced on CMP systems, improving application scalability and significantly simplifying MT application development.
[Abstracted from the IJPP paper]