tpry, procfs-based thread monitoring, SE Toolkit style
By User12610965-Oracle on Oct 08, 2009
tpry, Version 4.0
First there was the SE Toolkit (a C-like interpreter); thanks, Rich Pettit.
Then there was pea.se (Process Event Analyzer); thanks, Adrian Cockcroft.
Then the SE process_class.se was extended to LWPs, or threads; that would be me, Rick Weisner.
tpry.se was written to exploit the new capabilities of process_class.se.
Now tpry is available in C. tpry calculates prstat-like metrics for selected PIDs in the system, displaying the data at the thread level.
Why not use prstat -mL?
Sometimes I need more decimal places than prstat provides; tpry gives me more.
I like having rates per second, not counts as you sometimes get with prstat.
I like being able to monitor by execname.
I sometimes want the heap size.
I sometimes want the number of page faults, not just the time spent on them.
I want the resources consumed by defunct threads.
I like 100% to equal 100% of a single hardware thread.
Sometimes I need a subsecond sampling rate.
And tpry was developed before prstat.
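The two points above about rates per second and about 100% meaning one hardware thread can be illustrated with a small sketch. This is not tpry's source: the thread_sample_t structure and the function names are hypothetical, and the sketch assumes cumulative counters read at two points in time with nanosecond timestamps.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of one thread's cumulative counters.
 * The field names are illustrative, not tpry's actual structures. */
typedef struct {
    uint64_t wall_ns;  /* time the snapshot was taken, in nanoseconds */
    uint64_t usr_ns;   /* cumulative user-mode CPU time, in nanoseconds */
    uint64_t sysc;     /* cumulative number of system calls */
} thread_sample_t;

/* Percentage of the interval spent in user mode.
 * 100.0 means one hardware thread was fully busy for the whole interval. */
double usr_pct(const thread_sample_t *prev, const thread_sample_t *cur)
{
    assert(cur->wall_ns > prev->wall_ns);  /* samples must be in time order */
    double wall_ns = (double)(cur->wall_ns - prev->wall_ns);
    return 100.0 * (double)(cur->usr_ns - prev->usr_ns) / wall_ns;
}

/* System calls per second over the interval, rather than a raw count. */
double sysc_per_sec(const thread_sample_t *prev, const thread_sample_t *cur)
{
    assert(cur->wall_ns > prev->wall_ns);  /* samples must be in time order */
    double wall_s = (double)(cur->wall_ns - prev->wall_ns) / 1e9;
    return (double)(cur->sysc - prev->sysc) / wall_s;
}
```

With two samples taken 0.5 seconds apart, 125 ms of additional user time comes out as 25% and 27 additional system calls come out as 54 per second, independent of the sampling interval.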
tpry is invoked:
tpry pid interval [count] [execnames]
tpry -1 interval [count] [execnames]
tpry 0 interval [count] [pid list]
If pid is -1, all processes whose execname matches one of the listed execnames are monitored.
If pid is 0, all processes whose process ID appears in the pid list are monitored.
If you specify execnames or a pid list, the count must also be specified.
If count is 0, tpry monitors until ^C.
The interval is in seconds or fractions of a second.
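The invocation rules above can be sketched as a small argument parser. This is an illustration only, not tpry's actual code: tpry_args_t, select_mode_t and parse_tpry_args are hypothetical names, and treating an omitted count as 0 (run until interrupted) is an assumption the rules above do not confirm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* How the first argument selects the processes to monitor. */
typedef enum { BY_PID, BY_EXECNAME, BY_PID_LIST } select_mode_t;

/* Hypothetical parsed form of: tpry pid interval [count] [list] */
typedef struct {
    select_mode_t mode;
    long          pid;         /* first argument as given: a pid, -1 or 0 */
    double        interval_s;  /* seconds, may be fractional */
    long          count;       /* 0 means monitor until interrupted */
    int           nlist;       /* number of trailing execnames or pids */
    char        **list;        /* points into argv, NULL if no list */
} tpry_args_t;

/* Parse a whole string as a decimal long; 0 on success, -1 on error. */
static int parse_long(const char *s, long *out)
{
    char *end;
    *out = strtol(s, &end, 10);
    return (end != s && *end == '\0') ? 0 : -1;
}

/* Returns 0 on success, -1 on a usage error. */
int parse_tpry_args(int argc, char **argv, tpry_args_t *out)
{
    char *end;

    assert(out != NULL);
    if (argc < 3)
        return -1;                        /* pid and interval are required */
    if (parse_long(argv[1], &out->pid) != 0 || out->pid < -1)
        return -1;

    out->interval_s = strtod(argv[2], &end);
    if (end == argv[2] || *end != '\0' || !(out->interval_s > 0.0))
        return -1;

    out->count = 0;                       /* ASSUMPTION: omitted count means 0 */
    if (argc > 3 && (parse_long(argv[3], &out->count) != 0 || out->count < 0))
        return -1;

    /* Anything after the count is the execname list or the pid list. */
    out->nlist = (argc > 4) ? argc - 4 : 0;
    out->list  = (argc > 4) ? &argv[4] : NULL;

    if (out->pid == -1)
        out->mode = BY_EXECNAME;
    else if (out->pid == 0)
        out->mode = BY_PID_LIST;
    else
        out->mode = BY_PID;
    return 0;
}
```

Because the list always starts at the fifth argument in this sketch, a list can only be given when a count is present, which mirrors the rule that the count must be specified with execnames or a pid list.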
Here is some sample output.
You will need a wide screen.
arwen: ./tpry -1 1 5 perfbar
15:38:26 _name ____nlp ___pid __ppid __uid _usr%% _sys%% waitpf wlocks wcpu%% SLP%%% chld%% sz__KB heap_KB mjpf mnpf _inblk outblk chario __sysc __vctx __ictx __msps _%TRAP _priority/State
perfbar 2 1062 1 9775 0.01 0.02 0.00 0.00 0.01 99.96 0.00 4956 888 0.0 0.0 0.00 0.00 1892 54 8.92 0.00 0.04 0.00 59/S perfbar
0 0.00 0.00 0.00 0.00 0.00 0.00 0.0 0.0 0.00 0.00 0 0 0.00 0.00 0.00 0/?
1 0.01 0.02 0.00 0.00 0.01 99.96 0.0 0.0 0.00 0.00 1892 54 8.92 0.00 0.00 59/S
15:38:27 _name ____nlp ___pid __ppid __uid _usr%% _sys%% waitpf wlocks wcpu%% SLP%%% chld%% sz__KB heap_KB mjpf mnpf _inblk outblk chario __sysc __vctx __ictx __msps _%TRAP _priority/State
perfbar 2 1062 1 9775 0.01 0.02 0.00 0.00 0.01 99.96 0.00 4956 888 0.0 0.0 0.00 0.00 1889 54 8.91 1.98 0.03 0.00 59/S perfbar
0 0.00 0.00 0.00 0.00 0.00 0.00 0.0 0.0 0.00 0.00 0 0 0.00 0.00 0.00 0/?
1 0.01 0.02 0.00 0.00 0.01 99.96 0.0 0.0 0.00 0.00 1889 54 8.91 1.98 0.00 59/S
name execution short name, i.e. the last component of the pathname
nlp number of lwps in the process
pid Process ID
ppid Parent process id
uid User id
usr%% %CPU spent in user mode
sys%% %CPU spent in the kernel (includes traps and kernel page faults)
waitpf %CPU spent waiting on pagefaults
wlocks %CPU spent waiting on locks
wcpu%% %CPU waiting to run when runnable
SLP%%% %CPU waiting on a sleep queue
chld%% %CPU consumed by children
sz__KB virtual size of the process, in KB
heap_KB size of the heap, in KB
mjpf number of major pagefaults per sec
mnpf number of minor pagefaults per sec
inblk number of input blocks received per sec
outblk number of output blocks sent per sec
chario amount of character io in characters per sec
sysc number of system calls per second
vctx number of voluntary context switches per second
ictx number of involuntary context switches per second
msps milliseconds per switch
%TRAP %CPU in traps into kernel (included in sys%% above).
priority/State runtime priority and state
When monitoring at intervals of 1 second or less, I occasionally see an implausibly large spike. I have tried several algorithms to handle unsigned long wraparound, but the issue persists. This may be a bug in procfs, but I have not identified it.
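For readers unfamiliar with counter wraparound, here is a minimal sketch of the usual technique. It is not one of the algorithms tried in tpry, and it does not claim to fix the spike described above. It assumes a 32-bit counter that wraps at most once between samples; delta_is_plausible is a hypothetical extra guard based on the fact that a single thread cannot accrue more CPU time than the wall-clock time that elapsed.

```c
#include <assert.h>
#include <stdint.h>

/* Delta of a 32-bit counter that may have wrapped at most once.
 * Unsigned subtraction in C is modulo 2^32, so the wrapped case
 * needs no special branch. */
uint32_t wrap_delta32(uint32_t prev, uint32_t cur)
{
    return (uint32_t)(cur - prev);
}

/* Hypothetical sanity check: a per-thread CPU time delta larger than
 * the elapsed wall time (plus some slack for timestamp skew) is
 * suspect and could be discarded instead of displayed. */
int delta_is_plausible(uint64_t busy_delta_ns, uint64_t wall_delta_ns,
                       uint64_t slack_ns)
{
    return busy_delta_ns <= wall_delta_ns + slack_ns;
}

/* Documents the expected behavior of wrap_delta32. */
void wrap_delta_selfcheck(void)
{
    assert(wrap_delta32(100u, 150u) == 50u);                  /* no wrap */
    assert(wrap_delta32(0xFFFFFFF0u, 0x00000010u) == 0x20u);  /* one wrap */
}
```

The subtraction is only correct when the counter wraps no more than once per interval; a counter that goes backwards for any other reason still produces a huge delta, which is the case the plausibility check is meant to catch.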
Send all bugs and RFEs to firstname.lastname@example.org