Performance Co-Pilot (PCP) is a framework for system performance analysis. It continuously collects system performance metrics, archives them through its logging infrastructure, and provides a suite of utilities for viewing that data, live or from an archive, in an easily digestible format. Oracle has introduced a set of new utilities that help system administrators view system performance metrics in a format that mirrors existing Linux utilities. This article introduces these tools: pcp-ps, pcp-buddyinfo, pcp-zoneinfo, pcp-slabinfo, pcp-meminfo, and pcp-netstat.
On Linux, /proc is a pseudo file system that exposes information about the running kernel and the processes on the system. The metrics PCP provides are primarily derived from /proc. Traditionally, Linux administrators and users who analyze a performance issue after it occurs refer to the pseudo files under /proc, for example /proc/meminfo (memory usage), /proc/buddyinfo (memory fragmentation), /proc/zoneinfo (memory zones), and /proc/slabinfo (slab memory). PCP collects this information as many separate metrics, so viewing them together is not trivial: it takes time and effort to collate the relevant PCP metrics into a comparable view. The tools introduced in this blog present the PCP metrics in a format similar to those /proc pseudo files. Likewise, the netstat utility prints information about the Linux networking subsystem, and pcp-netstat presents the relevant network metrics in a format similar to netstat. These tools aim to greatly shorten the learning curve for administrators and support teams who examine system performance information using PCP.
The pcp-ps tool reports on the behaviour and performance of processes. Its default output includes the process ID (PID), associated terminal (TTY), accumulated CPU time, and the command name of the task. Users can narrow the analysis with options such as -e to display all processes, or filter by command name (-c [command name]) or username (-U [username]). The -o option defines a custom output format, so users can select columns such as CPU utilization, memory usage, and process state. pcp-ps can read live data from the local host, and combined with PCP’s archive replay capabilities it can also analyze historical performance data. The -Z option sets a custom timezone so that timestamps are reported in the user's preferred timezone.
Snapshot of the current process metrics using pcp-ps on a live system:
$ pcp ps | head -10
Linux 5.4.17-2136.323.6.el8uek.x86_64 (localhost.localdomain) 09/27/23 x86_64 (4 CPU)
Timestamp  PID  TIME      CMD
16:48:33   1    00:00:08  systemd
16:48:33   2    00:00:00  kthreadd
16:48:33   3    00:00:00  rcu_gp
16:48:33   4    00:00:00  rcu_par_gp
16:48:33   6    00:00:00  kworker/0:0H-events_
16:48:33   8    00:00:03  kworker/0:1H-events_
16:48:33   9    00:00:00  mm_percpu_wq
16:48:33   10   00:00:00  ksoftirqd/0
pcp-ps supports column selection with -o, similar to the ps command's -o (user-defined format) option, which lets the user define the output format.
The example below selects the process ID (pid), parent process ID (ppid), memory usage (%mem), user name (USER), and the kernel function in which the process is sleeping (WCHAN).
$ pcp ps -o pid,ppid,%mem,uname,wchan | head -10
Linux 5.4.17-2136.323.6.el8uek.x86_64 (localhost.localdomain) 09/27/23 x86_64 (4 CPU)
Timestamp  PID  PPID  %MEM  USER  WCHAN
16:50:27   1    0     0.07  root  ep_poll
16:50:27   2    0     0.0   root  kthreadd
16:50:27   3    2     0.0   root  rescuer_thread
16:50:27   4    2     0.0   root  rescuer_thread
16:50:27   6    2     0.0   root  worker_thread
16:50:27   8    2     0.0   root  worker_thread
16:50:27   9    2     0.0   root  rescuer_thread
16:50:27   10   2     0.0   root  smpboot_thread_fn
The -u option selects a predefined user-oriented format that shows the most important per-process metrics:
$ pcp ps -u | head -10
Linux 5.4.17-2136.323.6.el8uek.x86_64 (localhost.localdomain) 09/27/23 x86_64 (4 CPU)
Timestamp  USERNAME  PID  %CPU  %MEM  VSZ     RSS    TTY  STAT  TIME      START     COMMAND
16:52:43   root      1    0.0   0.0   175960  14700  ?    S     00:00:08  13:12:06  systemd
16:52:43   root      2    0.0   0.0   0       0      ?    S     00:00:00  13:12:06  kthreadd
16:52:43   root      3    0.0   0.0   0       0      ?    I     00:00:00  13:12:06  rcu_gp
16:52:43   root      4    0.0   0.0   0       0      ?    I     00:00:00  13:12:06  rcu_par_gp
16:52:43   root      6    0.0   0.0   0       0      ?    I     00:00:00  13:12:06  kworker/0:0H-events_
16:52:43   root      8    0.0   0.0   0       0      ?    I     00:00:03  13:12:06  kworker/0:1H-events_
16:52:43   root      9    0.0   0.0   0       0      ?    I     00:00:00  13:12:06  mm_percpu_wq
16:52:43   root      10   0.0   0.0   0       0      ?    S     00:00:00  13:12:06  ksoftirqd/0
# Select by process ID
pcp ps -p pid_of_process
# Select by parent process ID
pcp ps -P ppid_of_process
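Because the pcp ps -u output is a regular whitespace-separated table, it is easy to post-process. The following is a minimal sketch, not part of PCP: the function name parse_ps_user_format is hypothetical, the sample text is copied from the output above, and it assumes that only the last column (COMMAND) may contain spaces.

```python
def parse_ps_user_format(text):
    """Parse the table printed by `pcp ps -u` into a list of dicts."""
    rows, header = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Timestamp":
            header = fields  # column names
            continue
        if header is None:
            continue  # banner line printed before the column header
        # Assumption: only COMMAND (the last column) may contain spaces,
        # so cap the number of splits at the number of columns.
        values = line.split(None, len(header) - 1)
        if len(values) == len(header):
            rows.append(dict(zip(header, values)))
    return rows


SAMPLE = """\
Linux 5.4.17-2136.323.6.el8uek.x86_64 (localhost.localdomain) 09/27/23 x86_64 (4 CPU)
Timestamp USERNAME PID %CPU %MEM VSZ RSS TTY STAT TIME START COMMAND
16:52:43 root 1 0.0 0.0 175960 14700 ? S 00:00:08 13:12:06 systemd
16:52:43 root 2 0.0 0.0 0 0 ? S 00:00:00 13:12:06 kthreadd
16:52:43 root 10 0.0 0.0 0 0 ? S 00:00:00 13:12:06 ksoftirqd/0
"""

processes = parse_ps_user_format(SAMPLE)
# Pick the process with the largest resident set size (RSS column).
largest = max(processes, key=lambda p: int(p["RSS"]))
print(largest["PID"], largest["COMMAND"], largest["RSS"])
```

The same parser should work on text captured from an archive replay, since the column layout is the same in both cases.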
The pcp-buddyinfo tool presents a breakdown of free pages for each order, from 0 to 10. An order-n entry counts free blocks of 2^n contiguous pages, so the columns show memory availability at different granularities. pcp-buddyinfo presents this data in a structured format, which lets users spot patterns and trends in memory availability without manually interpreting raw text data.
Buddyinfo related data using pcp-buddyinfo on a live system:
$ pcp buddyinfo
Linux 5.4.17-2136.323.6.el8uek.x86_64 (localhost.localdomain) 09/27/23 x86_64 (4 CPU)
TimeStamp  Normal  Nodes  Order0  Order1  Order2  Order3  Order4  Order5  Order6  Order7  Order8  Order9  Order10
16:47:44   DMA     node0  0       0       0       0       0       0       0       0       1       1       3
16:47:44   DMA32   node0  2       7       5       4       7       7       9       4       6       5       787
16:47:44   Normal  node0  56      45      31      17      3       316     226     110     54      1       2914
Analyzing pcp-buddyinfo.0.xz archive with pcp-buddyinfo:
$ pcp -a pcp-buddyinfo.0.xz buddyinfo | head -10
Linux 5.4.17-2136.317.5.3.el8uek.x86_64 (localhost.localdomain) 08/02/23 x86_64 (4 CPU)
TimeStamp  Normal  Nodes  Order0  Order1  Order2  Order3  Order4  Order5  Order6  Order7  Order8  Order9  Order10
15:53:21   DMA     node0  0       0       0       0       0       0       0       0       1       1       3
15:53:21   DMA32   node0  3       1       1       1       2       3       2       3       3       2       803
15:53:21   Normal  node0  1720    4118    1862    671     220     120     66      29      21      26      3040
Linux 5.4.17-2136.317.5.3.el8uek.x86_64 (localhost.localdomain) 08/02/23 x86_64 (4 CPU)
TimeStamp  Normal  Nodes  Order0  Order1  Order2  Order3  Order4  Order5  Order6  Order7  Order8  Order9  Order10
15:53:24   DMA     node0  0       0       0       0       0       0       0       0       1       1       3
15:53:24   DMA32   node0  3       1       1       1       2       3       2       3       3       2       803
15:53:24   Normal  node0  1720    4118    1862    671     220     120     66      29      21      26      3040
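To make the order columns concrete: each count in column OrderN stands for free blocks of 2^N contiguous pages, so the total free memory in a zone is the sum of count × 2^N over all orders. The sketch below works through that arithmetic for the live-system rows shown earlier. The helper names are hypothetical, and the 4 KiB page size is an assumption (typical for x86_64, but not stated in the output).

```python
PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages, typical on x86_64


def free_pages(order_counts):
    """Total free pages, given the free-block count for each order."""
    # A block of order n spans 2**n pages, i.e. count << n pages in total.
    return sum(count << order for order, count in enumerate(order_counts))


def parse_buddyinfo_row(line):
    """Split one data row into (zone, node, [count per order])."""
    fields = line.split()
    zone, node = fields[1], fields[2]
    return zone, node, [int(value) for value in fields[3:]]


rows = [
    "16:47:44 DMA node0 0 0 0 0 0 0 0 0 1 1 3",
    "16:47:44 DMA32 node0 2 7 5 4 7 7 9 4 6 5 787",
    "16:47:44 Normal node0 56 45 31 17 3 316 226 110 54 1 2914",
]

for row in rows:
    zone, node, counts = parse_buddyinfo_row(row)
    pages = free_pages(counts)
    print(f"{zone:7} {node}: {pages} free pages "
          f"({pages * PAGE_SIZE_KIB} KiB), order-10 blocks: {counts[10]}")
```

A zone whose free pages sit mostly in low orders, with few or no high-order blocks, is a common sign of fragmentation, which is why comparing the order columns across samples is useful.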
The pcp-zoneinfo tool offers a detailed view of NUMA (Non-Uniform Memory Access) nodes and their associated statistics, as found in the /proc/zoneinfo file. It lets users analyze memory zone availability across nodes, which matters when tuning performance on modern NUMA servers, and gives insight into memory allocation patterns. The tool works on live systems and supports archive replay, including filtering of samples from the archive.
Live system metrics for zoneinfo using pcp-zoneinfo:
$ pcp zoneinfo | head -10
Linux 5.4.17-2136.323.6.el8uek.x86_64 (localhost.localdomain) 09/27/23 x86_64 (4 CPU)
TimeStamp = 16:37:22
NODE 0, per-node status
nr_inactive_anon 5614
nr_active_anon 469809
nr_inactive_file 279923
nr_active_file 246316
nr_unevictable 3097
nr_slab_reclaimable 107132
nr_slab_unreclaimable 43908
Analyzing pcp-zoneinfo.0.xz archive with pcp-zoneinfo:
$ pcp -a pcp-zoneinfo.0.xz zoneinfo
Linux 5.4.17-2136.320.7.1.el7uek.x86_64 (sagar-vminstance-1) 10/03/23 x86_64 (4 CPU)
TimeStamp = 11:49:31
Node 0, zone DMA
per-node status
nr_inactive_anon 33918
nr_active_anon 224024
nr_inactive_file 786762
nr_active_file 404375
nr_unevictable 5262
nr_slab_reclaimable 190078
nr_slab_unreclaimable 64022
nr_isolated_anon 0
nr_isolated_file 0
nr_anon_pages 166619
nr_mapped 49208
nr_file_pages 1280072
nr_dirty 18
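The per-node counters are simple name/value pairs, so they can be loaded into a dictionary and combined. The sketch below is illustrative only: parse_zoneinfo_counters is a hypothetical helper, the sample is the live output shown earlier, and it assumes the nr_* values are page counts with 4 KiB pages.

```python
PAGE_SIZE_KIB = 4  # assumption: counters are in pages of 4 KiB


def parse_zoneinfo_counters(text):
    """Collect 'nr_<name> <value>' lines into a dict of integers."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0].startswith("nr_") and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


SAMPLE = """\
TimeStamp = 16:37:22
NODE 0, per-node status
nr_inactive_anon 5614
nr_active_anon 469809
nr_inactive_file 279923
nr_active_file 246316
nr_unevictable 3097
nr_slab_reclaimable 107132
nr_slab_unreclaimable 43908
"""

counters = parse_zoneinfo_counters(SAMPLE)
# Combine the active and inactive lists for anonymous and file-backed pages.
anon_pages = counters["nr_inactive_anon"] + counters["nr_active_anon"]
file_pages = counters["nr_inactive_file"] + counters["nr_active_file"]
print(f"anonymous LRU pages:   {anon_pages} ({anon_pages * PAGE_SIZE_KIB} KiB)")
print(f"file-backed LRU pages: {file_pages} ({file_pages * PAGE_SIZE_KIB} KiB)")
```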
The pcp-slabinfo tool offers an in-depth view of the kernel slab allocator’s statistics. It collates existing PCP metrics related to slab memory. This information is presented in a format reminiscent of the proc filesystem. Users can efficiently analyze memory object allocation in the kernel, providing a real-time perspective in live systems or recorded archive data. The tool displays the current count of active objects, allowing users to understand allocation status. Additionally, it provides the total count of allocated objects, whether in use or not.
Live system metrics for slabinfo using pcp-slabinfo tool:
$ pcp slabinfo | head -10
Linux 5.4.17-2136.323.6.el8uek.x86_64 (localhost.localdomain) 09/27/23 x86_64 (4 CPU)
TimeStamp  Name                  active_objs  num_objs  objsize byte  objperslab  pagesperslab  active_slabs  num_slabs
16:55:13   Acpi-Operand          4368         4368      72            56          1             78            78
16:55:13   Acpi-Parse            329400       329522    56            73          1             4514          4514
16:55:13   Acpi-State            765          765       80            51          1             15            15
16:55:13   anon_vma              11434        12714     104           39          1             326           326
16:55:13   anon_vma_chain        17182        20864     64            64          1             326           326
16:55:13   avc_xperms_data       19328        19328     32            128         1             151           151
16:55:13   avtab_extended_perms  290190       290190    40            102         1             2845          2845
16:55:13   avtab_node            401880       401880    24            170         1             2364          2364
Analyzing pcp-slabinfo.0.xz archive with pcp-slabinfo:
$ pcp -a pcp-slabinfo.0.xz slabinfo | head -10
Linux 5.4.17-2136.317.5.3.el8uek.x86_64 (localhost.localdomain) 08/02/23 x86_64 (4 CPU)
TimeStamp  Name                  active_objs  num_objs  objsize byte  objperslab  pagesperslab  active_slabs  num_slabs
10:25:20   Acpi-Operand          4592         4592      72            56          1             82            82
10:25:20   Acpi-Parse            314423       315652    56            73          1             4324          4324
10:25:20   Acpi-State            765          765       80            51          1             15            15
10:25:20   anon_vma              11481        11895     104           39          1             305           305
10:25:20   anon_vma_chain        18269        19264     64            64          1             301           301
10:25:20   avc_xperms_data       8192         8192      32            128         1             64            64
10:25:20   avtab_extended_perms  276216       276216    40            102         1             2708          2708
10:25:20   avtab_node            401880       401880    24            170         1             2364          2364
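The active_objs and num_objs columns together show how well each cache is utilized. As a rough sketch (slab_usage is a hypothetical helper, and the estimate ignores per-slab overhead and alignment), the ratio and the memory held by allocated-but-unused objects can be derived from a single row of the live output above:

```python
def slab_usage(row):
    """Derive utilization figures from one pcp slabinfo data row."""
    fields = row.split()
    name = fields[1]
    # Columns 2-4: active_objs, num_objs, object size in bytes.
    active_objs, num_objs, objsize = (int(value) for value in fields[2:5])
    return {
        "name": name,
        "utilization": active_objs / num_objs if num_objs else 0.0,
        # Rough estimate: ignores per-slab overhead and alignment.
        "unused_bytes": (num_objs - active_objs) * objsize,
    }


rows = [
    "16:55:13 Acpi-Operand 4368 4368 72 56 1 78 78",
    "16:55:13 anon_vma 11434 12714 104 39 1 326 326",
    "16:55:13 anon_vma_chain 17182 20864 64 64 1 326 326",
]

for row in rows:
    usage = slab_usage(row)
    print(f"{usage['name']:15} {usage['utilization']:.1%} in use, "
          f"{usage['unused_bytes']} bytes in unused objects")
```

Caches with a low utilization ratio and a large object count are usually the first place to look when slab memory appears higher than expected.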
The pcp-meminfo tool offers a comprehensive report on memory usage within the system, utilizing data from the /proc/meminfo file in the /proc pseudo-file system. It provides valuable insights into various memory statistics like used and available memory, swap space, cache, and buffers. This tool aids in reviewing memory usage information. While not indispensable, it offers an additional method for examining memory data, whether on a live machine or from recorded archive data, contributing to effective troubleshooting.
To examine the memory usage statistics on a live machine, execute the subsequent command:
$ pcp meminfo | head -10
Linux 5.4.17-2136.300.7.el8uek.x86_64 (Mohit-OL8u5-vm1) 09/19/23 x86_64 (2 CPU)
05:50:16
MemTotal : 1734892 kB
MemFree : 261832 kB
MemAvailable : 908972 kB
Buffers : 3164 kB
Cached : 737992 kB
SwapCached : 28 kB
Active : 572576 kB
Inactive : 551840 kB
To examine the memory usage statistics from archive data, execute the subsequent command:
$ pcp meminfo -a pcp-meminfo.0.xz -s 2
Linux 5.4.17-2136.300.7.el8uek.x86_64 (Mohit-OL8u5-vm1) 09/08/23 x86_64 (2 CPU)
07:06:55
MemTotal : 1734892 kB
MemFree : 244256 kB
MemAvailable : 860364 kB
Buffers : 172 kB
Cached : 728404 kB
SwapCached : 12876 kB
Active : 402000 kB
Inactive : 608084 kB
Active(anon) : 143436 kB
Inactive(anon) : 161768 kB
Active(file) : 258564 kB
Inactive(file) : 446316 kB
Unevictable : 0 kB
Mlocked : 0 kB
SwapTotal : 1572860 kB
SwapFree : 1403752 kB
Dirty : 3604 kB
Writeback : 0 kB
AnonPages : 275972 kB
Mapped : 71720 kB
<<cropped>>
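Since the output keeps the familiar "Name : value kB" layout, a few lines of scripting are enough to derive summary figures from it. The sketch below is an illustration only: parse_meminfo is a hypothetical helper, and the sample is the live output shown above with each field on its own line.

```python
def parse_meminfo(text):
    """Map each 'Name : <value> kB' line to an integer number of kB."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Banner and timestamp lines have no integer right after the colon.
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


SAMPLE = """\
Linux 5.4.17-2136.300.7.el8uek.x86_64 (Mohit-OL8u5-vm1) 09/19/23 x86_64 (2 CPU)
05:50:16
MemTotal : 1734892 kB
MemFree : 261832 kB
MemAvailable : 908972 kB
Buffers : 3164 kB
Cached : 737992 kB
"""

meminfo = parse_meminfo(SAMPLE)
available_pct = 100 * meminfo["MemAvailable"] / meminfo["MemTotal"]
print(f"MemAvailable is {available_pct:.1f}% of MemTotal")
```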
In general, netstat is a Linux tool that provides statistics about all active connections on a computer, including incoming and outgoing connections, routing tables, and network protocol statistics. For instance, you can use netstat to display all active TCP connections to the computer, display all active UDP connections to the computer, display the routing table of the computer, and display statistics for each protocol. For more info please refer to the netstat man page.
pcp-netstat is a tool developed along the same lines to view statistics about network protocols and network interfaces. In particular, it reproduces the output of netstat -s and netstat -i -a -n. It is useful for checking the status of network interfaces and network connections, and for troubleshooting network issues. The tool can also be used to analyze network statistics for all available protocols, including TCP, UDP, ICMP, and IP.
By default, when no flags are given, the tool displays both the network protocol statistics and the network interface statistics.
Execute the subsequent command to examine the network statistics on the live machine.
$ pcp netstat -s 1
Linux 5.4.17-2136.300.7.el8uek.x86_64 (Mohit-OL8u5-vm1) 09/19/23 x86_64 (2 CPU)
05:59:25
Ip:
    Forwarding: 2
    161401 total packets received
    3 with invalid addresses
    0 forwarded
    0 incoming packets discarded
    125253 incoming packets delivered
    122240 requests sent out
    12 dropped because of missing route
Icmp:
    61 ICMP messages received
    0 Input ICMP message failed
    ICMP input histogram:
        destination unreachable: 61
    0 ICMP messages sent
    0 ICMP messages failed
    ICMP input histogram:
        Output destination unreachable: 0
IcmpMsg:
    InType3: 61
    OutType0: NA
Tcp:
    66 active connections openings
    22 passive connection openings
<<cropped>>
Kernel Interface table
Iface  MTU    RX-OK   RX-ERR  RX-DRP  TX-OK  TX-ERR  TX-DRP
ens2   1500   199173  0       0       44991  0       0
ens3   1500   23      0       0       5813   0       0
ens4   1500   5835    0       0       2      0       0
lo     65536  78641   0       0       78641  0       0
To examine the network statistics from recorded archive data, execute the subsequent command:
$ pcp netstat -a pcp-netstat.0.xz
Linux 5.4.17-2136.300.7.el8uek.x86_64 (Mohit-OL8u5-vm1) 09/08/23 x86_64 (2 CPU)
06:35:53
Ip:
    Forwarding: 2
    8765322 total packets received
    8 with invalid addresses
    0 forwarded
    0 incoming packets discarded
    5386076 incoming packets delivered
    5368638 requests sent out
    12 dropped because of missing route
Icmp:
    3817 ICMP messages received
    37 Input ICMP message failed
    ICMP input histogram:
        destination unreachable: 3817
    17 ICMP messages sent
    0 ICMP messages failed
    ICMP input histogram:
        Output destination unreachable: 17
IcmpMsg:
    InType3: 3817
    OutType0: 17
Tcp:
    2316 active connections openings
    1017 passive connection openings
    153 failed connection attempts
<<cropped>>
Kernel Interface table
Iface  MTU    RX-OK     RX-ERR  RX-DRP  TX-OK    TX-ERR  TX-DRP
ens2   1500   14340172  0       165     565382   0       0
ens3   1500   4415935   0       0       5941623  0       0
ens4   1500   1669793   0       0       1        0       0
lo     65536  1034960   0       0       1034960  0       0
For protocol-specific statistics, use the -p option followed by the protocol name [TCP|IP|UDP|ICMP]:
$ pcp netstat -s 1 -p IP
Linux 5.4.17-2136.300.7.el8uek.x86_64 (Mohit-OL8u5-vm1) 09/27/23 x86_64 (2 CPU)
05:36:53
Ip:
    Forwarding: 2
    1435031 total packets received
    2 with invalid addresses
    0 forwarded
    0 incoming packets discarded
    390677 incoming packets delivered
    386767 requests sent out
    12 dropped because of missing route
IpExt:
    InMcastPkts: 6916
    InBcastPkts: 24936
    InOctets: 889985068
    OutOctets: 83609066
    InMcastOctets: 221312
    InBcastOctets: 5704410
    InNoECTpkts: 1451089
To obtain statistics for network interfaces, use the -i option:
$ pcp netstat -s 1 -i
Linux 5.4.17-2136.300.7.el8uek.x86_64 (Mohit-OL8u5-vm1) 09/27/23 x86_64 (2 CPU)
05:39:06
Kernel Interface table
Iface  MTU    RX-OK    RX-ERR  RX-DRP  TX-OK   TX-ERR  TX-DRP
ens2   1500   4715661  0       0       197130  0       0
ens3   1500   253452   0       0       244360  0       0
ens4   1500   497809   0       0       1       0       0
lo     65536  431528   0       0       431528  0       0
# Show output similar to netstat -s, which displays network statistics for each protocol
pcp netstat -s 1 --statistics
# Protocol-specific statistics
pcp netstat -s 1 -p TCP
pcp netstat -s 1 -p UDP
pcp netstat -s 1 -p ICMP
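The Kernel Interface table is convenient for quick health checks, for example spotting interfaces that report errors or drops. The sketch below is not part of pcp-netstat: parse_interface_table is a hypothetical helper, and the sample is the interface table from the archive output shown earlier.

```python
def parse_interface_table(text):
    """Parse the 'Kernel Interface table' section into a list of dicts."""
    header, table = None, []
    for line in text.splitlines():
        fields = line.split()
        if fields[:1] == ["Iface"]:
            header = fields  # column names: Iface, MTU, RX-OK, ...
        elif header and len(fields) == len(header):
            row = {"Iface": fields[0]}
            row.update({key: int(value)
                        for key, value in zip(header[1:], fields[1:])})
            table.append(row)
    return table


SAMPLE = """\
Kernel Interface table
Iface MTU RX-OK RX-ERR RX-DRP TX-OK TX-ERR TX-DRP
ens2 1500 14340172 0 165 565382 0 0
ens3 1500 4415935 0 0 5941623 0 0
ens4 1500 1669793 0 0 1 0 0
lo 65536 1034960 0 0 1034960 0 0
"""

interfaces = parse_interface_table(SAMPLE)
for row in interfaces:
    # Keep only the non-zero error and drop counters for this interface.
    problems = {key: value for key, value in row.items()
                if key.endswith(("-ERR", "-DRP")) and value}
    if problems:
        print(row["Iface"], problems)
```

On this sample only ens2 is reported, because of its non-zero RX-DRP counter.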
This blog introduced the new PCP tools pcp-ps, pcp-buddyinfo, pcp-zoneinfo, pcp-slabinfo, pcp-meminfo, and pcp-netstat, which present PCP metrics in an already familiar format. For more details and background on PCP, you can refer to these blogs: