Friday Jan 18, 2008

Averaging performance data

When you are optimizing benchmarks, the typical process involves running the same benchmark N times and picking one of these N runs, more or less arbitrarily, as the representative run. Another option is to average the N runs, creating a new run N', and use that as the representative run. In Fenxi, we have discussed automatically averaging a set of runs. Performance data can be of two types:
  • Numerical Data (Throughput, Response time, etc.)
  • Textual Data (OS Patch level, syslog messages, etc.)
Averaging numerical data is very easy. Averaging textual data is neither possible nor desirable. However, since we are creating a new run N', we need to select textual data to be part of this new run. Which run do we pick it from? We are trying to solve this in the Fenxi project. If you have any thoughts or suggestions regarding this, please feel free to contact us.
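To make the question concrete, here is a minimal sketch in Python of building N' from a set of runs. It is not how Fenxi does it. The run layout (a dict with "numeric" and "text" parts), the function name average_runs, and the sample values are all made up for illustration. The policy of copying textual data from the run closest to the average is only one possible answer to the open question above.

```python
from statistics import mean

def average_runs(runs, text_from_metric="throughput"):
    """Build a synthetic run N' from a list of runs.

    Each run is a dict: {"numeric": {name: float}, "text": {name: str}}.
    This layout is hypothetical, not Fenxi's data model.
    """
    if not runs:
        raise ValueError("need at least one run")

    # Numerical data: plain arithmetic mean of each metric across runs.
    metrics = runs[0]["numeric"].keys()
    averaged = {m: mean(r["numeric"][m] for r in runs) for m in metrics}

    # Textual data cannot be averaged. Assumed policy: copy it from the
    # run whose chosen metric is closest to the average.
    target = averaged[text_from_metric]
    closest = min(runs, key=lambda r: abs(r["numeric"][text_from_metric] - target))

    return {"numeric": averaged, "text": dict(closest["text"])}

# Illustrative values only.
runs = [
    {"numeric": {"throughput": 100.0, "response_time": 2.0}, "text": {"os_patch": "patch-A"}},
    {"numeric": {"throughput": 110.0, "response_time": 1.5}, "text": {"os_patch": "patch-B"}},
    {"numeric": {"throughput": 126.0, "response_time": 1.0}, "text": {"os_patch": "patch-C"}},
]
n_prime = average_runs(runs)
print(n_prime)
```

Here the averaged throughput is 112.0, so the textual data comes from the second run. Other policies (first run, last run, median run) are equally easy to plug in, which is exactly why the choice needs discussion.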

Monday Jan 07, 2008

Fenxi - Performance analysis made easy

We just open-sourced a nice performance analysis tool called Fenxi. Fenxi is a pluggable, Java-based post-processing and performance analysis tool that parses data from a variety of tools, loads it into a database, and then lets you query and compare different sets of performance data. Fenxi can also be used to graph data from performance tools. Fenxi (Mandarin for "analyze") is the successor to the Sun-internal tool called Xanadu. It is integrated with the Faban benchmark harness.

If you have ever worked with performance data, you will soon realize the following:
Performance Data can get huge.
Consider a 30-minute benchmark running on a 64-core system with hundreds of disks attached and multiple network interfaces. If you collect mpstat output at 10-second intervals for the whole run, you end up with more than 11,000 lines of data! (That is 400 Ctrl-F's if you are using vi in a regular-sized terminal.) If you collect data from more tools like vmstat, iostat, trapstat, busstat, cpustat, etc., you will end up with much more! Going through all of it line by line is not a scalable approach.
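The mpstat estimate can be checked with a few lines of Python. This sketch assumes mpstat prints one line per core per sample and ignores the repeated header lines. The function name mpstat_lines is made up for illustration.

```python
def mpstat_lines(cpus, run_minutes, interval_seconds):
    """Estimate the data lines mpstat emits over a run.

    Assumes one line per CPU per sample; header lines are not counted.
    """
    samples = (run_minutes * 60) // interval_seconds
    return samples * cpus

lines = mpstat_lines(cpus=64, run_minutes=30, interval_seconds=10)
print(lines)  # 11520
```

That is 180 samples of 64 lines each, which matches the "more than 11,000 lines" figure.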
Performance Data is interrelated.
The tool outputs are just different views of the system's behavior. We want to look at the system as a whole, rather than at its individual views. If your incoming network packets peak, the interrupts reported by mpstat most likely peak too. We may also want to see whether throughput was impacted by a burst of writes to our disks, and so on.
Some performance data makes sense visually.
For large data sets, a visual view gives a quick summary of the data. As Tim Cook puts it, "the human brain is a powerful pattern-recognition machine - graphs allow you to spot things you would never see in numbers (like waves of CPU migrations moving across different cores)". Look at the bottom of the blog for more details.
Performance Data should be queryable
We want to be able to query the performance data, that is, to ask it questions. For example, you might want to know "What are my hot disks?". Traditionally, people have answered such questions by writing custom scripts using sed/awk/perl. This gets tedious very fast. We need a better way of asking questions. In Fenxi, we store the data in a database, and questions are formulated in SQL.
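As a rough illustration of what a "hot disks" question looks like in SQL, here is a small sketch using Python's built-in sqlite3 module. The table name iostat, its columns, the sample rows, and the 60% busy threshold are all assumptions made for this example. They are not Fenxi's actual schema or queries.

```python
import sqlite3

# In-memory database with a hypothetical iostat table (not Fenxi's schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE iostat (sample INTEGER, device TEXT, busy_pct REAL)")
conn.executemany(
    "INSERT INTO iostat VALUES (?, ?, ?)",
    [
        (1, "sd0", 95.0), (2, "sd0", 85.0),
        (1, "sd1", 10.0), (2, "sd1", 20.0),
        (1, "sd2", 70.0), (2, "sd2", 80.0),
    ],
)

# "What are my hot disks?" expressed as: devices whose average
# busy percentage over the run is at or above a chosen threshold.
hot_disks = conn.execute(
    """
    SELECT device, AVG(busy_pct) AS avg_busy
    FROM iostat
    GROUP BY device
    HAVING AVG(busy_pct) >= ?
    ORDER BY avg_busy DESC
    """,
    (60.0,),
).fetchall()

print(hot_disks)
conn.close()
```

With the sample rows above, sd0 and sd2 come back as hot and sd1 does not. The same question asked with sed/awk/perl would need a custom script per tool output format.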
Performance Data should be comparable, averageable, etc.
I work in the performance group at Sun, where we run a lot of benchmarks. Since the goal of [most] benchmarks is to maximize the performance of a system, we are constantly trying out new changes to the system. Typically, we change a parameter, repeat the benchmark, and compare the new run against the earlier ones to see whether performance has improved.
Performance Data should be sharable.
We rarely work in isolation. We should be able to share data with our peers and collaborate on finding performance fixes.

Fenxi tries to solve all of the above problems.

Sample Graph

Sample Text

Fenxi text view

You can see a sample database run processed by Fenxi. I urge you to check it out!


