### DTrace and moving/rolling averages

A client asked me the other day if DTrace could be used to maintain a moving average... well, yes, it can...

DTrace has very powerful built-in support for various kinds of aggregations, including one that maintains an average value, which makes writing this kind of code child's play. Here's a simple code snippet that displays the average free memory since the DTrace script was started:

```
#!/usr/sbin/dtrace -s

#pragma D option quiet

profile:::tick-1sec
{
    @result["Overall average"] = avg(`freemem);

    /* print current average */
    printa(@result);
}
```

This averaging function gives you an average since the DTrace script was started, which is great for scripts with a finite lifetime. But if you are monitoring a system over a long period, the averaging function isn't so useful: recent changes get swamped by all the older data.

For this kind of measure you typically use something like a rolling or moving average, which will give you the average over the last N samples. When a new sample comes along, you must throw away the oldest sample, add the new sample, and recalculate the average.
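To put numbers on the swamping effect, here is a small sketch in Python (used here only to illustrate the arithmetic outside DTrace). The function names and the sample values are made up for the example.

```python
def overall_average(samples):
    """Average over every sample seen so far (what an avg() aggregation reports)."""
    return sum(samples) / len(samples)


def moving_average(samples, n):
    """Average over only the most recent n samples."""
    window = samples[-n:]
    return sum(window) / len(window)


# Hypothetical data: 1000 samples at 100, then a sudden drop to 0 for 10 samples.
samples = [100] * 1000 + [0] * 10

print(overall_average(samples))     # barely moves away from 100
print(moving_average(samples, 10))  # fully reflects the drop
```

After ten samples at the new level, the overall average has hardly changed, while the 10-sample moving average has caught up completely.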

DTrace aggregations won't directly help you to maintain this working set; you need to maintain it yourself. You could store the working set in an aggregation or an associative array, but since the size of the set you're averaging over is predefined, you might as well use a scalar array.

And instead of recalculating the average over the whole array at each sample, the smarter approach is to maintain a running total of all samples in the set and only apply the differences: when you replace an entry in the array you subtract the old value from the total, and when you store the new sample you add it to the total. The average is then just the total divided by the number of samples, which is much faster.
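Written out, with $x_t$ as the newest sample and $S_t$ as the running total over the last $N$ samples (notation introduced here for illustration only):

$$
S_t = S_{t-1} - x_{t-N} + x_t, \qquad \bar{x}_t = \frac{S_t}{N}
$$

Each update costs one subtraction, one addition and one division regardless of $N$. Until $N$ samples have arrived there is nothing to subtract yet, and the divisor is the number of samples seen so far.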

Here's some sample code that maintains a moving average of free memory on the box:

```
#!/usr/sbin/dtrace -s

#pragma D option quiet

inline int DEBUG = 0;   /* set to 1 for debug printfs */
inline int N = 10;      /* #elems in rolling average */

int buf[N];             /* buffer to store last N elements */
long total;             /* running total of elements in buffer */
int i;                  /* index into buffer */
int elems;              /* #elems so far in buffer */

profile:::tick-1sec
{
    total -= buf[i];
    buf[i] = `freemem;
    total += buf[i];

    /* increment #elems stored up to max N */
    elems = elems + (elems < N);

    /* increment circular index into elems array */
    i = (i + 1) % N;

    /* conditional debug printf */
    DEBUG ? printf("AVG %d FREEMEM %d TOTAL %d elems %d i %d\n",
        total / elems, `freemem, total, elems, i) : 1;

    /* print current rolling average */
    printf("Moving average (samples : %d) : %d\n", elems, total / elems);
}
```

And here's the output. Notice how the number of samples tops out at N (N=10). If you plotted this on a graph using something like the DTrace Chime visualization tool, you'd expect to see more instability initially, while the sample count is less than N, and then see the moving average settle down as averaging over the full working set increases the damping.

```
root$ ~/freemem.d
Moving average (samples : 1) : 22892
Moving average (samples : 2) : 22891
Moving average (samples : 3) : 22891
Moving average (samples : 4) : 22891
Moving average (samples : 5) : 22891
Moving average (samples : 6) : 22891
Moving average (samples : 7) : 22895
Moving average (samples : 8) : 22899
Moving average (samples : 9) : 22900
Moving average (samples : 10) : 22899
Moving average (samples : 10) : 22899
Moving average (samples : 10) : 22899
Moving average (samples : 10) : 22899
```
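If you want to convince yourself that the ring buffer and running total really do track the plain average of the last N samples, the same bookkeeping can be modelled outside DTrace. The following is a minimal Python sketch, not part of the D script: the class name, the window size of 3 and the sample values are all hypothetical, and it assumes non-negative samples so that Python's `//` behaves like D's integer division.

```python
class MovingAverage:
    """Host-side model of the ring buffer and running total in the D script."""

    def __init__(self, n):
        self.n = n
        self.buf = [0] * n  # last n samples, zero-filled like a D global array
        self.total = 0      # running total of the samples in buf
        self.i = 0          # circular index of the slot to overwrite next
        self.elems = 0      # number of valid samples so far, capped at n

    def add(self, sample):
        self.total -= self.buf[self.i]  # drop the oldest sample (0 while filling)
        self.buf[self.i] = sample
        self.total += sample
        self.elems = min(self.elems + 1, self.n)
        self.i = (self.i + 1) % self.n
        return self.total // self.elems  # integer division, as in D


# Hypothetical sample stream, not real freemem readings.
samples = [22892, 22890, 22891, 22895, 22920, 22930, 22900, 22899]

ma = MovingAverage(3)
results = [ma.add(s) for s in samples]

# Cross-check: recompute the average over the last 3 samples from scratch.
windows = [samples[max(0, k - 2):k + 1] for k in range(len(samples))]
naive = [sum(w) // len(w) for w in windows]

print(results == naive)
```

The incremental version and the from-scratch recomputation agree at every step, including the warm-up period before the buffer is full.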

To avoid having to keep track of the last N samples, it's also possible to use slightly different averaging techniques that average over all the historical samples but give more weight to the more recent data, such as a weighted moving average or an exponential moving average.

Below is simple DTrace sample code for an exponential moving average. The script gives each new sample a weight of 1/14 and the previous average a weight of 13/14; you can adjust these weightings according to the sample period you wish to remain significant. Note that the average must be initialized to a real sample (done here in the BEGIN clause); otherwise the averaging will include a bogus "zero" sample.

```
#!/usr/sbin/dtrace -s

#pragma D option quiet

long ema;

dtrace:::BEGIN
{
    ema = `freemem;
}

profile:::tick-1sec
{
    ema = (`freemem + (13 * ema)) / 14;

    /* print current exponential moving average */
    printf("Exponential moving average : %d\n", ema);
}
```
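The script's update is the usual exponential moving average recurrence with a smoothing factor of 1/14. To explore how it behaves, here is a rough Python model of the same integer arithmetic. It is a sketch under stated assumptions: the function names and sample values are hypothetical, and values are assumed non-negative so that Python's `//` matches D's truncating division.

```python
def ema_step(ema, sample, weight=14):
    """One update of the script's recurrence using integer arithmetic.

    The new sample gets weight 1/weight, the old average (weight - 1)/weight.
    """
    return (sample + (weight - 1) * ema) // weight


def run(initial, samples, weight=14):
    """Feed samples through the recurrence, starting from `initial`."""
    ema = initial
    for s in samples:
        ema = ema_step(ema, s, weight)
    return ema


steady = [22892] * 20        # hypothetical constant readings

seeded = run(22892, steady)  # initialised from a real reading, as in BEGIN
unseeded = run(0, steady)    # starts from a bogus zero

# A small rise is lost to truncation: (22900 + 13 * 22892) // 14 is still 22892.
stuck = run(22892, [22900] * 100)

print(seeded, unseeded, stuck)
```

In this model the seeded average sits at the steady value from the start, while the unseeded one is still well below it after 20 samples, which is the bogus "zero" sample at work. It also suggests a caveat of integer division: a rise of fewer than 14 units above the current average is truncated away on every update, so the average never moves. If that matters for your data, one option is to keep the average scaled up by a constant factor and divide only when printing.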
