Tuesday Mar 01, 2011

Virtual Network - Part 4

Resource Controls

This is the fourth part of a series of blog entries about Solaris network virtualization. Part 1 introduced network virtualization, Part 2 discussed network resource management capabilities available in Solaris 11 Express, and Part 3 demonstrated the use of virtual NICs and virtual switches.

This entry shows the use of a bandwidth cap on Virtual Network Elements (VNEs). This form of network resource control can effectively limit the amount of bandwidth consumed by a particular stream of packets. In our context, we will restrict the amount of bandwidth that a zone can use.

As a reminder, we have the following network topology, with three zones and three VNICs, one VNIC per zone.

All three VNICs were created on one ethernet interface in Part 3 of this series.

Capping VNIC Bandwidth

Using a T2000 server in a lab environment, we can measure network throughput with the new dlstat(1) command. This command reports various statistics about data links, including the quantity of packets, bytes, interrupts, polls, drops, blocks, and other data. Because I am trying to illustrate the use of commands, not optimize performance, the network workload will be a simple file transfer using ftp(1). This method of measuring network bandwidth is reasonable for this purpose, but says nothing about the performance of this platform. For example, this method reads data from a disk. Some of that data may be cached, but disk performance may impact the network bandwidth measured here. However, we can still achieve the basic goal: demonstrating the effectiveness of a bandwidth cap.

With that background out of the way, first let's check the current status of our links.

GZ# dladm show-link
LINK        CLASS     MTU    STATE    BRIDGE     OVER
e1000g0     phys      1500   up       --         --
e1000g2     phys      1500   unknown  --         --
e1000g1     phys      1500   down     --         --
e1000g3     phys      1500   unknown  --         --
emp_web1    vnic      1500   up       --         e1000g0
emp_app1    vnic      1500   up       --         e1000g0
emp_db1     vnic      1500   up       --         e1000g0
GZ# dladm show-linkprop emp_app1
LINK         PROPERTY        PERM VALUE          DEFAULT        POSSIBLE
emp_app1     autopush        rw   --             --             --
emp_app1     zone            rw   emp-app        --             --
emp_app1     state           r-   unknown        up             up,down
emp_app1     mtu             rw   1500           1500           1500
emp_app1     maxbw           rw   --             --             --
emp_app1     cpus            rw   --             --             --
emp_app1     cpus-effective  r-   1-9            --             --
emp_app1     pool            rw   SUNWtmp_emp-app --             --
emp_app1     pool-effective  r-   SUNWtmp_emp-app --             --
emp_app1     priority        rw   high           high           low,medium,high
emp_app1     tagmode         rw   vlanonly       vlanonly       normal,vlanonly
emp_app1     protection      rw   --             --             mac-nospoof,
                                                                restricted,
                                                                ip-nospoof,
                                                                dhcp-nospoof
<some lines deleted>
Before setting any bandwidth caps, let's determine the transfer rates between a zone on this system and a remote system.

It's easy to use dlstat to determine the data rate to my home system while transferring a file from a zone:

GZ# dlstat -i 10 e1000g0 
           LINK    IPKTS   RBYTES    OPKTS   OBYTES
       emp_app1   27.99M    2.11G   54.18M   77.34G
       emp_app1       83    6.72K        0        0
       emp_app1      339   23.73K    1.36K    1.68M
       emp_app1    1.79K  120.09K    6.78K    8.38M
       emp_app1    2.27K  153.60K    8.49K   10.50M
       emp_app1    2.35K  156.27K    8.88K   10.98M
       emp_app1    2.65K  182.81K    5.09K    6.30M
       emp_app1      600   44.10K      935    1.15M
       emp_app1      112    8.43K        0        0
The OBYTES column is simply the number of bytes transferred during that data sample. I'll ignore the 1.68MB and 1.15MB data points because the file transfer began and ended during those samples. The average of the other values leads to a bandwidth of 7.6 Mbps (megabits per second), which is typical for my broadband connection.

Let's pretend that we want to constrain the bandwidth consumed by that workload to 2 Mbps. Perhaps we want to leave all of the rest for a higher-priority workload. Perhaps we're an ISP and charge for different levels of available bandwidth. Regardless of the situation, capping bandwidth is easy:

GZ# dladm set-linkprop -p maxbw=2000k emp_app1
GZ# dladm show-linkprop -p maxbw emp__app1
LINK         PROPERTY        PERM VALUE          DEFAULT        POSSIBLE
emp_app1     maxbw           rw       2          --             --
GZ# dlstat -i 20 emp_app1 
           LINK    IPKTS   RBYTES    OPKTS   OBYTES
       emp_app1   18.21M    1.43G   10.22M   14.56G
       emp_app1      186   13.98K        0        0
       emp_app1      613   51.98K    1.09K    1.34M
       emp_app1    1.51K  107.85K    3.94K    4.87M
       emp_app1    1.88K  131.19K    3.12K    3.86M
       emp_app1    2.07K  143.17K    3.65K    4.51M
       emp_app1    1.84K  136.03K    3.03K    3.75M
       emp_app1    2.10K  145.69K    3.70K    4.57M
       emp_app1    2.24K  154.95K    3.89K    4.81M
       emp_app1    2.43K  166.01K    4.33K    5.35M
       emp_app1    2.48K  168.63K    4.29K    5.30M
       emp_app1    2.36K  164.55K    4.32K    5.34M
       emp_app1      519   42.91K      643  793.01K
       emp_app1      200   18.59K        0        0
Note that for dladm, the default unit for maxbw is Mbps. The average of the full samples is 1.97 Mbps.

Between zones, the uncapped data rate is higher:

GZ# dladm reset-linkprop -p maxbw emp_app1
GZ# dladm show-linkprop  -p maxbw emp_app1
LINK         PROPERTY        PERM VALUE          DEFAULT        POSSIBLE
emp_app1     maxbw           rw   --             --             --
GZ# dlstat -i 20 emp_app1
           LINK    IPKTS   RBYTES    OPKTS   OBYTES
       emp_app1   20.80M    1.62G   23.36M   33.25G
       emp_app1      208   16.59K        0        0
       emp_app1   24.48K    1.63M  193.94K  277.50M
       emp_app1  265.68K   17.54M    2.05M    2.93G
       emp_app1  266.87K   17.62M    2.06M    2.94G
       emp_app1  255.78K   16.88M    1.98M    2.83G
       emp_app1  206.20K   13.62M    1.34M    1.92G
       emp_app1   18.87K    1.25M   79.81K  114.23M
       emp_app1      246   17.08K        0        0
This five year old T2000 can move at least 1.2 Gbps of data, internally, but that took five simultaneous ftp sessions. (A better measurement method, one that doesn't include the limits of disk drives, would yield better results, and newer systems, either x86 or SPARC, have higher internal bandwidth characteristics.) In any case, the maximum data rate is not interesting for our purpose, which is demonstration of the ability to cap that rate.

You can often resolve a network bottleneck while maintaining workload isolation, by moving two separate workloads onto the same system, within separate zones. However, you might choose to limit their bandwidth consumption. Fortunately, the NV tools in Solaris 11 Express enable you to accomplish that:

GZ# dladm set-linkprop -t -p maxbw=25m emp_app1
GZ# dladm show-linkprop -p maxbw emp_app1
LINK         PROPERTY        PERM VALUE          DEFAULT        POSSIBLE
emp_app1     maxbw           rw      25          --             --
Note that the change to the bandwidth cap was made while the zone was running, potentially while network traffic was flowing. Also, changes made by dladm are persistent across reboots of Solaris unless you specify a "-t" on command line.

Data moves much more slowly now:

GZ# # dlstat  -i 20 emp_app1
           LINK    IPKTS   RBYTES    OPKTS   OBYTES
       emp_app1   23.84M    1.82G   46.44M   66.28G
       emp_app1      192   16.10K        0        0
       emp_app1    1.15K   79.21K    5.77K    8.24M
       emp_app1   18.16K    1.20M   40.24K   57.60M
       emp_app1   17.99K    1.20M   39.46K   56.48M
       emp_app1   17.85K    1.19M   39.11K   55.97M
       emp_app1   17.39K    1.15M   38.16K   54.62M
       emp_app1   18.02K    1.19M   39.53K   56.58M
       emp_app1   18.66K    1.24M   39.60K   56.68M
       emp_app1   18.56K    1.23M   39.24K   56.17M
<many lines deleted>
The data show an aggregate bandwidth of 24 Mbps.

Conclusion

The network virtualization tools in Solaris 11 Express include various resource controls. The simplest of these is the bandwidth cap, which you can use to effectively limit the amount of bandwidth that a workload can consume. Both physical NICs and virtual NICs may be capped by using this simple method. This also applies to workloads that are in Solaris Zones - both default zones and Solaris 10 Zones which mimic Solaris 10 systems.

Next time we'll explore some other virtual network architectures.

Thursday Jan 27, 2011

Virtual Networks - Part 2

This is the second in a series of blog entries that discuss the network virtualization features in Solaris 11 Express. The first entry discussed the basic concepts and the virtual network elements, including virtual NICs, VLANs, virtual switches, and InfiniBand datalinks.

This entry adds to that list the resource controls and security features that are necessary for a well-managed virtual network.

Virtual Networks, Real Resource Controls

In Oracle Solaris 11 Express, there are four main datalink resource controls:
  1. a bandwidth cap, which limits the amount of traffic passing through a datalink in a small amount of elapsed time
  2. assignment of packet processing tasks to a subset of the system's CPUs
  3. flows, which were introduced in the previous blog post
  4. rings, which are hardware or software resources that can be dedicated to a single purpose.
Let's take them one at a time. By default, datalinks such as VNICs can consume as much of the physical NIC's bandwidth as they want. That might be the desired behavior, but if it isn't you can apply the property "maxbw" to a datalink. The maximum permitted bandwidth can be specified in Kbps, Mbps or Gbps. This value can be changed dynamically, so if you set this value too low, you can change without affecting the traffic flowing over that link. Solaris will not allow traffic to flow over that datalink at a rate faster than you specify.

You can "over-subscribe" this bandwidth cap: the sum of the bandwidth caps on the VNICs assigned to a NIC can exceed the rated bandwidth of the NIC. If that happens, the bandwidth caps become less effective.

In addition the bandwidth cap, packet processing computation can be constrained to the CPUs associated with a workload.

First some background. When Solaris boots, it assigns interrupt handler threads to the CPUs in the system. (See Solaris CPUs for an explanation of the meaning of "CPU".) Solaris attempts to spread the interrupt handlers out evenly so that one CPU does not become a bottleneck for interrupt handling.

If you create non-default CPU pools, the interrupt handlers will retain their CPU assignments. One unintended side effect of this is a situation where the CPUs intended for one workload will be handling interrupts caused by another workload. This can occur even with simple configurations of Solaris Zones. In extreme cases, network packet processing for one zone can severely impact the performance of another zone.

To prevent this behavior, Solaris 11 Express offers the ability to assign a datalink's interrupt handler to a set of CPUs or a pool of CPUs. To simplify this further, the obvious choice is made for you, by default, for a zone which is assigned its own resource pool. When such a zone boots, a resource pool is created for the zone, a sufficient quantity of CPUs is moved from the default pool to the zone's pool, and interrupt handlers for that zone's datalink(s) are automatically reassigned to that resource pool. Network flows enable you to create multiple lanes of traffic. This allows the parallelization of network traffic. You can assign a bandwidth cap to a flow. Flows were introduced in the previous post and will be discussed further in future posts.

Finally, the newest high speed NICs support hardware rings: memory resources that can be dedicated to a particular set of network traffic. For inbound packets, this is the first resource control that separates network traffic based on packet information such as destination MAC address. By assigning one or more rings to a stream of traffic, you can commit sufficient hardware resources to it and ensure a greater relative priority for those packets, even if another stream of traffic on the same NIC would otherwise cause congestion and impact packet latency of all streams.

If you are using a NIC that does not support hardware rings, Solaris 11 Express support software rings which cause a similar effect.

Virtual Networks, Real Security

In addition to rescource controls, Solaris 11 Express offers datalink protection controls. These controls are intended to prevent a user from creating improper packets that would cause mischief on the network. The mac-nospoof property requires that outgoing packets have a MAC address which matches the link's MAC address. The ip-nospoof property implements a similar restriction, but for IP addresses. The dhcp-nospoof property prevents improper DHCP assignment.

Summary (so far)

The network virtualization features in Solaris 11 Express enable the creation of virtual network devices, leading to the implementation of an entire network inside one Solaris system. Associated resource control features give you the ability to manage network bandwidth as a resource and reduce the potential for one workload to cause network performance problems for another workload. Finally, security features help you minimize the impact of an intruder.

With all of the introduction out of the way, next time I'll show some actual uses of these concepts.

Wednesday Jan 05, 2011

Virtual Networks

Network virtualization is one of the industry's hot topics. The potential to reduce cost while increasing network flexibility easily justifies the investment in time to understand the possibilities. This blog entry describes network virtualization and some concepts. Future entries will show the steps to create a virtual network.

Introduction to Network Virtualization

Network virtualization can be described as the process of creating a computer network which does not match the physical topology of a physical network. Usually this is achieved by using software tools of general-purpose computers or by using features of network hardware. A defining characteristic of a virtual network is the ability to re-configure the topology without manipulating any physical objects: devices or cables.

Such a virtual network mimics a physical network. Some types of virtual networks, for example virtual LANs (VLANs), can be implemented using features of network switches and computers. However, some other implementations do not require traditional network hardware such as routers and switches. All of the functionality of network hardware has been re-implemented in software, perhaps in the operating system.

Benefits of network virtualization (NV) include increased architectural flexibility, better bandwidth and latency characteristics, the ability to prioritize network traffic to meet desired performance goals, and lower cost from fewer devices, reduced total power consumption, etc.

The remainder of this blog entry will focus on a software-only implementation of NV.

A few years ago, networking engineers at Sun began working on a software project named "Crossbow." The goal was to create a comprehensive set of NV features within Solaris. Just like Solaris Zones, Crossbow would provide integrated features for creation and monitoring of general purpose virtual network elements that could be deployed in limitless configurations. Because these features are integrated into the operating system, they automatically take advantage of - and smoothly interoperate with - existing features. This is most noticeable in the integration of Solaris NV features and Solaris Zones. Also, because these NV features are a part of Solaris, future Solaris enhancements will be integrated with Solaris NV where appropriate.

The core NV features were first released in OpenSolaris 2009.06. Since then, those core features have matured and more details have been added. The result is the ability to re-implement entire networks as virtual networks using Solaris 11 Express. Here is an example of a virtual network architecture:

As you can guess from that example, you can create virtually :-) any network topology as a virtual network...

Oracle Solaris NV does more than is described here. This content focuses on the key features which might be used to consolidate workloads or entire networks into a Solaris system, using zones and NV features.

Virtual Network Elements

Solaris 11 Express implements the following virtual network elements.
  • NIC: OK, this isn't a virtual element, it's just on the list as a starting point.
    For a very long time, Solaris has managed Network Interface Connectors (NICs). Solaris offers tools to manage NICs, including bringing them up and down, and assigning various characteristics to them, such as IP addresses, assignment to IP Multipathing (IPMP) groups, etc. Note that up through Solaris 10, most of those configuration tasks were accomplished with the ifconfig(1M) command, but in Solaris 11 Express the dladm(1M) and ipadm(1M) commands perform those tasks, and a few more. You can monitor the use of NICs with dlstat(1M). The term "datalink" is now used consistently to refer to NICs and things like NICs, such as...

  • A VNIC is a pseudo interface created on a datalink (a NIC or an etherstub, described next). Each VNIC has its own MAC address, which can be generated automatically, but can be specified manually. For almost all purposes, a VNIC can be can be managed like a NIC. The dladm command creates, lists, deletes, and modifies VNICs. The dlstat command displays statistics about VNICs. The ipadm(1M) command configures IP interfaces on VNICs.
    Like NICs, VNICs have a number of properties that can be modified with dladm. These include the ability to force network processing of a VNIC to a certain set of CPUs, setting a cap (maximum) on permitted bandwidth for a VNIC, the relative priority of this VNIC versus other VNICs on the same NIC, and other properties.

  • Etherstubs are pseudo NICs, making internal networks possible. For a general understanding, think of them as virtual switches. The command dladm manages etherstubs.

  • A flow is a stream of packets that share particular attributes such as source IP address or TCP port number. Once defined, a flow can be managed as an entity, including capping bandwidth usage, setting relative priorities, etc. The new flowadm(1M) command enables you to create and manage flows. Even if you don't set resource controls, flows will benefit from dedicated kernel resources and more predictable, consistent performance. Further, you can directly observe detailed statistics on each flow, improving your ability to understand these streams of packets and set proper resource controls. Flows are managed with flowadm(1M) and monitored with flowstat(1M).

  • VLANs (Virtual LANs) have been around for a long time. For consistency, the commands dladm, dlstat and ipadm now manage VLANs.

  • InfiniBand partitions are virtual networks that use an InfiniBand fabric. They are managed with the same commands as VNICs and VLANs: dladm, dlstat, ipadm and others.

Summary

Solaris 11 Express provides a complete set of virtual network components which can be used to deploy virtual networks within a Solaris instance. The next blog entry will describe network resource management and security. Future entries will provide some examples.

Thursday Dec 09, 2010

All New Zonestat - Part 2

Part 2

Recently I introduced zonestat(1), a new command offered in Solaris 11 Express that replaces the zonestat Perl script that I had open-sourced a couple of years ago. Today I will complete the description of the new zonestat.

Fair Share Scheduler

One of the many useful resource controls offered by Solaris is the Fair Share Scheduler (FSS(7)). You can use FSS to tell Solaris "make sure that zoneXYZ can use a specific portion of the compute capacity of a set of 'Solaris CPUs' at a minimum. (For more information on FSS, see its man page and the Solaris 10 or Solaris 11 Express documentation.) FSS only enforces those minima if there is CPU contention.

That terse explanation used the phrase "of a set of CPUs" because the minimum portions of compute capacity enforced by Solaris can be calculated across all of the CPUs in the system, or across a set of CPUs that you have configured into a processor set. For my purpose here, the point is that you can create a resource pool - including a processor set, assign multiple zones to that pool, and tell Solaris to use the FSS scheduling algorithm for the processes to be scheduled on those CPUs. (See the libpool(3LIB) man page for more information.)

Zonestat will show information relating to FSS, including data that answers the question "is there CPU contention in any of my psets, and if there is, which zone(s) are using more than I expected?"

The next example uses two new zones, and assumes that most of the zones used earlier have been turned off for now. A dynamic resource pool, sharedDB, has been created. It will be the set of CPUs used by two zones which run database software. (This method is called "capped Containers" in Oracle licensing documents and is considered hard partitioning. It can be used to limit software license costs.) One of those zones, zoneDB-2, is more important than the other, zoneDB-1. To meet its SLA, zoneDB-2 must always be able to use at least 4 of the 6 CPUs in that pset.

# zonestat -r psets 10 1
Collecting data for first interval...
Interval: 1, Duration: 0:00:10
PROCESSOR_SET                   TYPE  ONLINE/CPUS     MIN/MAX
pset_default            default-pset        22/22         1/-
                                ZONE  USED   PCT   CAP  %CAP   SHRS  %SHR %SHRU
                             [total]  0.12 0.56%     -     -      -     -     -
                            [system]  0.02 0.12%     -     -      -     -     -
                              global  0.09 0.44%     -     -      -     -     -

PROCESSOR_SET                   TYPE  ONLINE/CPUS     MIN/MAX
sharedDB                   pool-pset          6/6         6/6
                                ZONE  USED   PCT   CAP  %CAP   SHRS  %SHR %SHRU
                             [total]  4.01 66.8%     -     -    300     -     -
                            [system]  0.00 0.00%     -     -      -     -     -
                            zoneDB-2  3.00 50.1%     -     -    200 66.6% 75.1%
                            zoneDB-1  1.00 16.7%     -     -    100 33.3% 50.2%

PROCESSOR_SET                   TYPE  ONLINE/CPUS     MIN/MAX
zoneA                  dedicated-cpu          4/4         4/4
                                ZONE  USED   PCT   CAP  %CAP   SHRS  %SHR %SHRU
                             [total]  0.00 0.09%     -     -      -     -     -
                            [system]  0.00 0.00%     -     -      -     -     -
                               zoneA  0.00 0.09%     -     -      -     -     -
In the output above, zoneDB-1 is using the equivalent processing capacity of 1 Solaris CPU, and zoneDB-2 is using the equivalent of 3 Solaris CPUs. The PCT column indicates that zoneDB-2 is using 50% (3 out of 6) of the CPU capacity of the entire pset.

In addition, the SHRS column shows the number of FSS shares assigned to each of those zones, and %SHR is that zone's proportion of the total number of shares. In other words, zoneDB-2 is using 50% of the pset, but hasn't even used its enforced minimum of 66.6%.

This FSS configuration ensures that 200/300ths of 6 CPUs (i.e. 4 CPUs) are available to zoneDB-2. The %SHRU value of 75% tells us that the zone is using 75% of those 4 CPUs. Again, each of those two zones is allowed to use more than its share, as long as each zone can use its specified minimum.

Sorting the Output

You may have noticed in earlier examples that the zones were not listed in alphabetical order. By default, they are sorted by the amount of the resource that zonestat was reporting. If a resource is not specified, the output is sorted on CPU%. The two rows showing total and system data are always listed first.

You can change the sort order with the -S option, for example:

GZ$ zonestat -r processes -S used 2 1
Collecting data for first interval...
Interval: 1, Duration: 0:00:02
PROCESSES                     SYSTEM LIMIT
system-limit                          292K
                                ZONE  USED   PCT   CAP  %CAP
                             [total]   110 0.36%     -     -
                            [system]     0 0.00%     -     -
                              global    61 0.20%     -     -
                               zoneD    26 0.08%     -     -
                               zoneA    23 0.07%     -     -

As the man page for zonestat(1) shows, many other resources can be monitored. I will not review each of them here.

Aggregated Data

But wait! There's more! ;-)

In addition to understanding resource usage over a short interval (e.g. 10 or 60 seconds) it is often to necessary to understand the peak usage or average usage over a longer period of time. For that purpose, zonestat provides Summary Reports.

A simple summary report is one that is appended to the per-sample data shown earlier. Here is the output for two samples and one summary report:

GZ$ zonestat -r processes -R high -S used 10 2
Collecting data for first interval...
Interval: 1, Duration: 0:00:10
PROCESSES                     SYSTEM LIMIT
system-limit                          292K
                                ZONE  USED   PCT   CAP  %CAP
                             [total]   109 0.36%     -     -
                            [system]     0 0.00%     -     -
                              global    61 0.20%     -     -
                               zoneD    25 0.08%     -     -
                               zoneA    23 0.07%     -     -

Interval: 2, Duration: 0:00:20
PROCESSES                     SYSTEM LIMIT
system-limit                          292K
                                ZONE  USED   PCT   CAP  %CAP
                             [total]   109 0.36%     -     -
                            [system]     0 0.00%     -     -
                              global    61 0.20%     -     -
                               zoneD    25 0.08%     -     -
                               zoneA    23 0.07%     -     -

Report: High Usage
    Start: Thu Dec  2 21:58:54 EST 2010
      End: Thu Dec  2 21:59:14 EST 2010
    Intervals: 2, Duration: 0:00:20
PROCESSES                     SYSTEM LIMIT
system-limit                          292K
                                ZONE  USED   PCT   CAP  %CAP
                             [total]   109 0.36%     -     -
                            [system]     0 0.00%     -     -
                              global    61 0.20%     -     -
                               zoneD    25 0.08%     -     -
                               zoneA    23 0.07%     -     -

If you only need the summary report, -q will be useful: it suppresses the individual samples of data.
GZ$ zonestat -q -r physical-memory -R high -S used 10 2
Report: High Usage
    Start: Thu Dec  2 22:03:54 EST 2010
      End: Thu Dec  2 22:04:14 EST 2010
    Intervals: 2, Duration: 0:00:20
PHYSICAL-MEMORY              SYSTEM MEMORY
mem_default                          31.8G
                                ZONE  USED   PCT   CAP  %CAP
                             [total] 5205M 15.9%     -     -
                            [system] 2790M 8.54%     -     -
                               zoneD 2229M 6.83%     -     -
                              global  141M 0.43%     -     -
                               zoneA 44.5M 0.13%     -     -
Let's assume that, from the data above, we determine that zoneD will never need more than 3 GB of RAM when it is operating correctly. We can add a RAM cap so that Solaris enforces that limit in case something goes awry:
GZ$ pfexec rcapadm -z zoneD -m 3g
GZ$ zonestat -q -r physical-memory -R high -S used 10 2
Report: High Usage
    Start: Thu Dec  2 22:37:15 EST 2010
      End: Thu Dec  2 22:37:35 EST 2010
    Intervals: 2, Duration: 0:00:20
PHYSICAL-MEMORY              SYSTEM MEMORY
mem_default                          31.8G
                                ZONE  USED   PCT   CAP  %CAP
                             [total] 5204M 15.9%     -     -
                            [system] 2789M 8.54%     -     -
                               zoneD 2229M 6.83% 3072M 72.5%
                              global  141M 0.43%     -     -
                               zoneA 44.5M 0.13%     -     -
In addition to a summary report at the end of the output, zonestat is able to generate periodic summary reports based on the collection of individual samples. For example, you might want to know what the peak memory usage is for each zone, per hour.

(A quick side note: zonestat does not perform continuous data collection. It collects data at an interval you specify. Therefore, the peak values reported by zonestat are the peak values of the values which were collected. In other words, zonestat reports the peak observed values.)

The example below collects data every 10 seconds for 24 hours. It reports the peak observed values every hour.

GZ$ zonestat -q -r physical-memory -R high 10 24h 60m
Report: High Usage
    Start: Mon Dec  5 16:42:01 EST 2010
      End: Mon Dec  5 17:42:01 EST 2010
    Intervals: 360, Duration: 1:00:00
PHYSICAL-MEMORY              SYSTEM MEMORY
mem_default                          31.8G
                                ZONE  USED   PCT   CAP  %CAP
                             [total] 3015M 9.23%     -     -
                            [system] 2791M 8.55%     -     -
                              global  136M 0.41%     -     -
                               zoneA 44.5M 0.13%     -     -
                               zoneD 45.7M 0.13% 3072M 1.48%

Report: High Usage
    Start: Mon Dec  5 17:42:01 EST 2010
      End: Mon Dec  5 18:42:01 EST 2010
    Intervals: 10, Duration: 1:00:00
PHYSICAL-MEMORY              SYSTEM MEMORY
mem_default                          31.8G
                                ZONE  USED   PCT   CAP  %CAP
                             [total] 3015M 9.23%     -     -
                            [system] 2791M 8.55%     -     -
                              global  136M 0.41%     -     -
                               zoneA 44.5M 0.13%     -     -
                               zoneD 64.3M 0.19% 3072M 2.09%

Report: High Usage
    Start: Mon Dec  5 18:42:01 EST 2010
      End: Mon Dec  5 19:42:01 EST 2010
    Intervals: 15, Duration: 1:00:00
PHYSICAL-MEMORY              SYSTEM MEMORY
mem_default                          31.8G
                                ZONE  USED   PCT   CAP  %CAP
                             [total] 3015M 9.23%     -     -
                            [system] 2791M 8.55%     -     -
                              global  136M 0.41%     -     -
                               zoneA 44.5M 0.13%     -     -
                               zoneD 65.1M 0.19% 3072M 2.11%
[further output deleted]

Parseable Data

At this point (especially after reading the sub-title of this section...) you might think that zonestat should have an option to generate output which is easy to parse. And you won't be disappointed: CSV (colon-separated-value) output is the result of the lower-case -p option:
GZ$ zonestat -p -q -r physical-memory -R high -S used 10 2
report-high:header:20101203T034655Z:20101203T034715Z:10:20
report-high:physical-memory:mem_default:[resource]:33423360K
report-high:physical-memory:mem_default:[total]:5329836K:15.94%:-:-
report-high:physical-memory:mem_default:[system]:2856556K:8.54%:-:-
report-high:physical-memory:mem_default:zoneD:2282968K:6.83%:3145728K:72.57%
report-high:physical-memory:mem_default:global:144608K:0.43%:-:-
report-high:physical-memory:mem_default:zoneA:45576K:0.13%:-:-
report-high:footer:20101203T034715Z10:20
Even a small number of options can generate a great deal of output:
GZ$ zonestat -p -r psets -R high 10 2
interval:header:since-last-interval:20101203T035201Z:1:10
interval:processor-set:default-pset:pset_default:[resource]:24:24:1:-:0-04-03.01
interval:processor-set:default-pset:pset_default:[total]:0.09:0.38%:-:-:-:-:-:0-00-00.92
interval:processor-set:default-pset:pset_default:[system]:0.01:0.06%:-:-:-:-:-:0-00-00.15
interval:processor-set:default-pset:pset_default:global:0.07:0.31%:-:-:-:-:-:0-00-00.77
interval:processor-set:dedicated-cpu:zoneD:[resource]:4:4:4:4:0-00-40.50
interval:processor-set:dedicated-cpu:zoneD:[total]:1.00:25.01%:-:-:-:-:-:0-00-10.13
interval:processor-set:dedicated-cpu:zoneD:[system]:0.00:0.05%:-:-:-:-:-:0-00-00.02
interval:processor-set:dedicated-cpu:zoneD:zoneD:0.99:24.95%:-:-:-:-:-:0-00-10.10
interval:processor-set:dedicated-cpu:zoneA:[resource]:4:4:4:4:0-00-40.50
interval:processor-set:dedicated-cpu:zoneA:[total]:0.00:0.01%:-:-:-:-:-:0-00-00.00
interval:processor-set:dedicated-cpu:zoneA:[system]:0.00:0.00%:-:-:-:-:-:0-00-00.00
interval:processor-set:dedicated-cpu:zoneA:zoneA:0.00:0.01%:-:-:-:-:-:0-00-00.00
interval:footer:20101203T035201Z10:10
interval:header:since-last-interval:20101203T035211Z:2:20
interval:processor-set:default-pset:pset_default:[resource]:24:24:1:-:0-04-03.08
interval:processor-set:default-pset:pset_default:[total]:0.07:0.32%:-:-:-:-:-:0-00-00.79
interval:processor-set:default-pset:pset_default:[system]:0.01:0.06%:-:-:-:-:-:0-00-00.15
interval:processor-set:default-pset:pset_default:global:0.06:0.26%:-:-:-:-:-:0-00-00.63
interval:processor-set:dedicated-cpu:zoneD:[resource]:4:4:4:4:0-00-40.51
interval:processor-set:dedicated-cpu:zoneD:[total]:1.00:25.01%:-:-:-:-:-:0-00-10.13
interval:processor-set:dedicated-cpu:zoneD:[system]:0.00:0.12%:-:-:-:-:-:0-00-00.04
interval:processor-set:dedicated-cpu:zoneD:zoneD:0.99:24.89%:-:-:-:-:-:0-00-10.08
interval:processor-set:dedicated-cpu:zoneA:[resource]:4:4:4:4:0-00-40.51
interval:processor-set:dedicated-cpu:zoneA:[total]:0.00:0.01%:-:-:-:-:-:0-00-00.00
interval:processor-set:dedicated-cpu:zoneA:[system]:0.00:0.00%:-:-:-:-:-:0-00-00.00
interval:processor-set:dedicated-cpu:zoneA:zoneA:0.00:0.01%:-:-:-:-:-:0-00-00.00
interval:footer:20101203T035211Z10:20
report-high:header:20101203T035151Z:20101203T035211Z:10:20
report-high:processor-set:default-pset:pset_default:[resource]:24:24:1:-:0-04-03.08
report-high:processor-set:default-pset:pset_default:[total]:0.09:0.38%:-:-:-:-:-:0-00-00.92
report-high:processor-set:default-pset:pset_default:[system]:0.01:0.06%:-:-:-:-:-:0-00-00.15
report-high:processor-set:default-pset:pset_default:global:0.07:0.31%:-:-:-:-:-:0-00-00.77
report-high:processor-set:dedicated-cpu:zoneD:[resource]:4:4:4:4:0-00-40.51
report-high:processor-set:dedicated-cpu:zoneD:[total]:1.00:25.07%:-:-:-:-:-:0-00-10.15
report-high:processor-set:dedicated-cpu:zoneD:[system]:0.00:0.12%:-:-:-:-:-:0-00-00.04
report-high:processor-set:dedicated-cpu:zoneD:zoneD:0.99:24.95%:-:-:-:-:-:0-00-10.10
report-high:processor-set:dedicated-cpu:zoneA:[resource]:4:4:4:4:0-00-40.51
report-high:processor-set:dedicated-cpu:zoneA:[total]:0.00:0.01%:-:-:-:-:-:0-00-00.00
report-high:processor-set:dedicated-cpu:zoneA:[system]:0.00:0.00%:-:-:-:-:-:0-00-00.00
report-high:processor-set:dedicated-cpu:zoneA:zoneA:0.00:0.01%:-:-:-:-:-:0-00-00.00
report-high:footer:20101203T035211Z10:20
With parseable output, you can easily write scripts that consume the output. Those scripts can further analyze the data, draw colorful graphs, and perform other data manipulation.

Miscellaneous Comments

A few parting notes:
  1. You can run zonestat in any zone. It will only receive data that should be visible to that zone. For example, when run in a non-global zone, zonestat will only display data about the processor set on which that zone's processes run.
  2. Zonestat gets all of the data from the zonestatd daemon, which is part of the service svc:/system/zones-monitoring:default. That service is enabled by default. Because that service is managed by SMF, if for any reason zonestatd stops, SMF will restart it.
  3. You can specify the interval at which zonestatd gathers data by setting the zones-monitoring:default service property config/sample_interval.

Summary

The new zonestat will provide hours of entertainment. It also provides answers to countless questions regarding your zones' use of system resources. Armed with that information, you can improve your understanding of the resource consumption of those zones, and improve the use of resource controls to ensure predictable performance of your workloads.

I hope that you have enjoyed learning about zonestat as much as I did. Check back during the first week of January for information on other new observability tools in Solaris 11 Express!

<script type="text/javascript"> var sc_project=2359564; var sc_invisible=1; var sc_security="22b325fd"; var sc_https=1; var sc_remove_link=1; var scJsHost = (("https:" == document.location.protocol) ? "https://secure." : "http://www."); document.write("");</script>

counter for tumblr

Wednesday Dec 01, 2010

Solaris 11 Express Podcast

A brief podcast discusses some of the major enhancements in Solaris 11 Express.

About

Jeff Victor writes this blog to help you understand Oracle's Solaris and virtualization technologies.

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today