Dataspace Profiling - Look Inside the Machine!
By nk on Sep 06, 2005
Computer systems originally contained a central processing unit that spanned many boards (and sometimes cabinets), and random-access memory that responded in the same cycle time as the processor. This central processing unit (the CPU, as we know it today) was very costly.
Initially, bulbs attached to wires within the CPU helped programmers deduce program behavior, in order to save precious processor time. These were the earliest profiling tools.
Computer languages, such as FORTRAN and COBOL, improved programmer productivity. Profiling libraries followed, to break down the cost of the most precious resource on the system: the processor. Profiling associated processor costs with processor instructions and with the source representation of those instructions: functions and line numbers. Programmer productivity climbed as critical central processing unit bottlenecks were uncovered and resolved in program source.
Computers continued to evolve: the central processing unit shrank to a single board, the mainframe gave way to the minicomputer, and multiple-processor systems appeared. Around this time a disruptive technology arrived: the microprocessor. These cheap microprocessors were mass-produced with large-scale integration (LSI) and later very-large-scale integration (VLSI).
Initially, microprocessors were underpowered compared to central processing units, but mass production sounded the death knell for the discrete-logic central processing unit.
The "killer micro" debate raged during this era. Large numbers of cheap commodity microprocessors were grouped to solve large problems previously possible only on mainframes. Sun offered a wide array of microprocessor-based systems, some more powerful than the largest mainframes of the day.
We are now in the mid-1990s. The acquisition cost of the microprocessors comprised only a small fraction of overall system cost. The bulk of system cost lay in the memory subsystem (the cabinet, the interconnect, the controllers, the DRAM chips) and the peripherals.
In software, Solaris engineers were solving complex operating system scaling issues through deduction. In one case, an engineer inferred that a single hot cache line was bottlenecking every microprocessor in the system. That feat of deduction was not lost on me: I realized we needed a tool to identify the new critical resource, the Memory Subsystem. This inflection point in technology drove me to invent Dataspace Profiling.
UltraSPARC-III was the first Sun processor to add support for monitoring Memory Subsystem behavior. I have worked with every processor team since then to include adequate support for Dataspace Profiling: the now-defunct Millennium processor; the Niagara processors; the UltraSPARC-III, IIIi, and IV-based processors; and the ROCK processors.
Computers evolved further: chip-multithreaded (CMT) processors have many Cores driving even more virtual processor Strands of instruction execution. These CMT processors offer fewer Memory Subsystem components than Strands of instruction execution. The performance-critical component in these systems is often the Memory Subsystem and not the Strands of execution.
Traditional profiling tools fail to detect these bottlenecks: they persist in monitoring the Processor Core when the bottleneck is in the Memory Subsystem.
Dataspace Profiling monitors both the Processor Core and the Memory Subsystem to identify machine bottlenecks, and relates the findings back to Program Source and Program Address Space. All machine components are profiled with low intrusion, on the fly, and related back visually to any program source and any program memory object.
Today, thanks to Performance Teams, Processor Teams, Compiler Teams, Solaris Teams, and the Sun Studio Analyzer Team, we are ready. Sun has incorporated Dataspace Profiling in Sun Studio.
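A session might look like the hypothetical sketch below, using the Sun Studio `collect` and `er_print` tools. The counter name `DC_miss` and the program name `myapp` are placeholders of mine: hardware-counter names vary by processor, and `collect` lists the counters available on the machine at hand.

```
# Record a dataspace-profiling experiment.  The leading '+' on the
# counter name asks collect to backtrack from the counter event to the
# triggering instruction and capture the data address it referenced.
collect -h +DC_miss,on ./myapp

# Print the memory cost attributed to each program data object.
er_print -data_objects test.1.er
```

The same experiment opens in the Analyzer GUI, where the costs appear against source lines, data objects, and machine components alike.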
Welcome aboard the experience of Dataspace Profiling: Look Inside the Machine!