By arnaud on Dec 02, 2009
With Directory Proxy Server, regardless of the version, investigating traffic can get:
- a) real tricky
- b) time consuming
- c) confusing
- d) all of the above
and the answer is ... drum roll ... d)!
So here is a script that you can feed your DPS access log to. It will output a CSV file that you can then load in your favorite spreadsheet software or just graph with tools like gnuplot and the like... it just will make your life easy...er.
Bird's Eye View
Disclaimer: It's nowhere near as clever as logconv.pl for Directory Server; it only munches the data so that YOU can more easily spot issues or identify patterns. So what does this script produce in the end?
It will take your DPS 6.x/7.0 access log in and output three CSV files: one with the transaction volumes (suffixed "tx"), one with the average response times (suffixed "avg") and finally one with the maximum response time over a minute (suffixed "max"). Why not all in one file? I did that initially, but in a single CSV it turned out to be really impractical. So at least when you open up one of these files you know what you're looking at.
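To make the three outputs concrete, here is a minimal sketch of the per-minute aggregation described above. It is not the script's actual code: the function name `crunch_per_minute` and the input shape (already-parsed `("hh:mm:ss", response_time)` pairs) are assumptions made for illustration, and parsing the real DPS access log is deliberately left out.

```python
from collections import defaultdict

def crunch_per_minute(records):
    """Aggregate response times into per-minute tx / avg / max tables.

    `records` is a hypothetical, already-parsed input: an iterable of
    ("hh:mm:ss", response_time) tuples. Real DPS log parsing is not shown.
    """
    buckets = defaultdict(list)
    for timestamp, etime in records:
        minute = timestamp[:5]  # "18:00:42" -> "18:00"
        buckets[minute].append(etime)

    tx, avg, peak = {}, {}, {}
    for minute in sorted(buckets):
        times = buckets[minute]
        tx[minute] = len(times)                # transaction volume
        avg[minute] = sum(times) / len(times)  # average response time
        peak[minute] = max(times)              # maximum response time
    return tx, avg, peak
```

Each of the three dictionaries maps a minute to one value, which corresponds to one row in the matching "tx", "avg" or "max" file.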
Since I really started this initially to simply be able to "grep" a file on a Windows system, I really had no plan and no idea it would end up in a tool like this. All that to say that I wrote it in Python instead of our customary Java tools. At least it has the merit of existing so you don't have to start from scratch. So you'll need Python, at least 2.4. If you're on Solaris or Linux, you're covered. If on Windows, simply download your favorite Python; I have installed the 2.6.4 Windows version from here.
How Does It Work
0 To 60 In No Time
python dps-log-cruncher.py access
The Rest Of The Way
-c : break up statistics per client
-s : break up statistics per back-end server
-f hh:mm : start parsing at a given point in time
-t hh:mm : stop parsing after a given point in time
-h : print this help message
-v : print tool version
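For readers who want to build something similar, the option list above could be declared roughly as follows. This is a sketch under assumptions, not the script's own parser: it uses `argparse`, which is not available on Python 2.4, and the destination names (`client`, `server`, `start`, `stop`, `logfile`) and the version string are made up for illustration. `-c` and `-s` are given a value because the examples pass one (`\*` or an IP address).

```python
import argparse

def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="dps-log-cruncher.py",
        description="Crunch a DPS access log into per-minute CSV files.",
    )
    # -h is provided automatically by argparse.
    parser.add_argument("-c", dest="client", metavar="CLIENT",
                        help="break up statistics per client ('*' for all)")
    parser.add_argument("-s", dest="server", metavar="SERVER",
                        help="break up statistics per back-end server ('*' for all)")
    parser.add_argument("-f", dest="start", metavar="hh:mm",
                        help="start parsing at a given point in time")
    parser.add_argument("-t", dest="stop", metavar="hh:mm",
                        help="stop parsing after a given point in time")
    # Placeholder version string; the real tool's version is not known here.
    parser.add_argument("-v", action="version", version="%(prog)s (sketch)")
    parser.add_argument("logfile", help="path to the DPS access log")
    return parser
```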
split the output per client for all clients:
python dps-log-cruncher.py -c \* access
split the output per back-end server for client 192.168.0.17:
python dps-log-cruncher.py -c 192.168.0.17 -s \* access
split the output for all clients, all servers:
python dps-log-cruncher.py -c \* -s \* access
only output results from 6pm (18:00) to 6:10pm (18:10):
python dps-log-cruncher.py -f 18:00 -t 18:10 access
output results between 6:00pm (18:00) and 6:10pm (18:10) and split results for all clients and back-end servers:
python dps-log-cruncher.py -f 18:00 -t 18:10 -c \* -s \* access
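The -f / -t window used in the last two examples boils down to a simple time comparison. The sketch below shows one way to do it; the helper names `parse_hhmm` and `in_window` are hypothetical, the entry timestamp is assumed to be available as "hh:mm:ss", and treating the -t bound as exclusive is an assumption, since the tool's exact boundary behaviour is not documented here. It does not handle a window that crosses midnight.

```python
from datetime import datetime

def parse_hhmm(text):
    """Turn an "hh:mm" option value into a datetime.time."""
    return datetime.strptime(text, "%H:%M").time()

def in_window(timestamp, start=None, stop=None):
    """Return True if an "hh:mm:ss" timestamp falls in [start, stop).

    `start` and `stop` are "hh:mm" strings or None (no bound).
    Assumption: the stop bound is exclusive.
    """
    t = datetime.strptime(timestamp, "%H:%M:%S").time()
    if start is not None and t < parse_hhmm(start):
        return False
    if stop is not None and t >= parse_hhmm(stop):
        return False
    return True
```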
This is a list to manage expectations as much as it is one for me to remember to implement:
- Selectable time granularity. Currently, all data is aggregated per minute. In some cases, it would be useful to be able to see what happens per second.
- Improve error handling for parameters on the CLI.
- Add a built-in graphing capability to avoid having to resort to using a spreadsheet. Spreadsheets do however give a lot of flexibility
- Add the ability to filter / split results per bind DN
- Output the response time distribution
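As an idea of what the last item could look like, here is a small sketch of a response time distribution. None of this exists in the tool: the function name, the bucket boundaries and the assumption that response times arrive as plain numbers are all illustrative.

```python
from bisect import bisect_left

def response_time_distribution(etimes, bounds=(1, 10, 100, 1000)):
    """Count response times per bucket.

    Bucket i holds values <= bounds[i] (and above the previous bound);
    the final extra bucket holds everything above the last bound.
    The default bounds are arbitrary, chosen only for illustration.
    """
    counts = [0] * (len(bounds) + 1)
    for etime in etimes:
        counts[bisect_left(bounds, etime)] += 1
    return counts
```

With the default bounds, the result has five counters: up to 1, up to 10, up to 100, up to 1000, and above 1000.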
Best effort is how I will label support for now; you can send your questions and support requests to arnaud -- at -- sun -- dot -- com.