Dedicated CPUs in zones - a small RM exercise

Today's blog is about an exercise in resource management using Logical Domains and Solaris Containers. Nothing earth-shattering, or even novel, but an illustration of how these technologies interact, and of how resource management looks when dedicated CPUs are used with Containers.

The problem statement

I needed to demonstrate the interaction of Solaris Containers and dedicated CPUs for a customer. They wanted zones to be set up with dedicated CPUs so they could see what visibility zones had to CPU resources.

Lab environment

Fortunately I have access to a small T1000 server running Logical Domains, so I set up a domain with multiple CPUs. (In these examples, "primary" in screen-scraped text indicates that the terminal session is in the Control Domain, and "global" indicates that the session is in the guest domain's global zone.)

primary # ldm set-mem 2g ldom1
primary # ldm set-vcpu 8 ldom1
primary # ldm bind ldom1
primary # ldm list
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active     -n-cv-  SP      4     3G       0.5%  2d 11h 12m
ldom1            bound      ------  5000    8     2G             
ldom2            inactive   ------          4     1G             
ldom3            inactive   ------          4     1G             
primary # ldm start ldom1
LDom ldom1 started

Boot up the lab system

After firing up the domain, I connected to its console (as shown above, its virtual console is on port 5000), logged in, and displayed some virtual configuration data. Note that I have this domain set up to require manual boot. That's useful in a lab or training scenario, but normally you would let the domain boot Solaris as soon as it is started with ldm start. In that case I could have just waited a few seconds for boot to complete (booting a logical domain is very fast since physical devices don't have to be probed) and then used ssh to connect directly to the domain. Here, I get the OpenBoot "ok" prompt and then boot Solaris.
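
In the ldm world, that manual-boot behavior is just the domain's OpenBoot auto-boot? variable, which can be set from the control domain. A minimal sketch of the two settings (the backslash only escapes the '?' from the shell):

primary # ldm set-variable auto-boot\?=false ldom1
primary # ldm set-variable auto-boot\?=true ldom1

With auto-boot?=true the domain boots Solaris as soon as ldm start brings it up; with false it stops at the "ok" prompt, as shown below.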

As expected, the domain sees the 8 virtual CPUs defined for it (look for the psrinfo output below). It also has 2 virtual NICs bound to different virtual switches connected to different physical networks. The network configuration isn't germane to today's exercise, but it's worth mentioning because it illustrates how you can pass separate physical network connections to nested virtual environments.
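
For completeness, virtual networking like this is plumbed from the control domain by creating virtual switches backed by physical NICs and then giving the guest a virtual network device on each switch. A sketch of what that setup could look like; the physical NIC names (e1000g0, e1000g1) are placeholders for whatever interfaces the machine really has:

primary # ldm add-vsw net-dev=e1000g0 primary-vsw0 primary
primary # ldm add-vsw net-dev=e1000g1 primary-vsw1 primary
primary # ldm add-vnet vnet0 primary-vsw0 ldom1
primary # ldm add-vnet vnet1 primary-vsw1 ldom1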

primary $ telnet localhost 5000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

Connecting to console "ldom1" in group "ldom1" ....
Press ~? for control options ..

Sun Fire(TM) T1000, No Keyboard
Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
OpenBoot 4.30.3, 2048 MB memory available, Serial #83492552.
Ethernet address 0:14:4f:f9:fe:c8, Host ID: 84f9fec8.

{0} ok boot
Boot device: /virtual-devices@100/channel-devices@200/disk@0:a  File and args: 
SunOS Release 5.10 Version Generic_139555-08 64-bit
Copyright 1983-2009 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
Hostname: t1ldom1
Reading ZFS config: done.
Mounting ZFS filesystems: (8/8)

t1ldom1 console login: root
Password: 
Last login: Tue Sep 15 17:31:05 on console
Sun Microsystems Inc.   SunOS 5.10      Generic January 2005
global # psrinfo
0       on-line   since 10/01/2009 19:23:09
1       on-line   since 10/01/2009 19:23:11
2       on-line   since 10/01/2009 19:23:11
3       on-line   since 10/01/2009 19:23:11
4       on-line   since 10/01/2009 19:23:11
5       on-line   since 10/01/2009 19:23:11
6       on-line   since 10/01/2009 19:23:11
7       on-line   since 10/01/2009 19:23:11
global # ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000 
vnet0: flags=201000843 mtu 1500 index 2
        inet 192.168.2.101 netmask ffffff00 broadcast 192.168.2.255
        ether 0:14:4f:fb:f8:a4 
vnet1: flags=201000843 mtu 1500 index 3
        inet 129.153.20.144 netmask ffffff00 broadcast 129.153.20.255
        ether 0:14:4f:fa:3b:c9 

This domain also has a zone named u4z1 which I had migrated (via zoneadm detach and zoneadm attach) from an older update level of Solaris 10. The zone has shared IP access to each of the logical domain's virtual network devices, hence access to the different physical networks the machine is connected to.
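
The migration itself was simple: detach on the source system, copy the zonepath over, then configure and attach here. A sketch of those steps, assuming the zonepath had already been copied to /zones/u4z1 on this domain (the -u on attach updates the zone's packages to match this host's newer Solaris 10 update level):

old-host # zoneadm -z u4z1 detach
global # zonecfg -z u4z1 create -a /zones/u4z1
global # zoneadm -z u4z1 attach -u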

global # zoneadm list -civ
  ID NAME             STATUS     PATH                           BRAND    IP    
   0 global           running    /                              native   shared
   - u4z1             installed  /zones/u4z1                    native   shared
global # zonecfg -z u4z1 info
zonename: u4z1
zonepath: /zones/u4z1
brand: native
autoboot: false
bootargs: 
pool: 
limitpriv: 
scheduling-class: 
ip-type: shared
inherit-pkg-dir:
	dir: /lib
inherit-pkg-dir:
	dir: /platform
inherit-pkg-dir:
	dir: /sbin
inherit-pkg-dir:
	dir: /usr
net:
	address: 192.168.2.222
	physical: vnet0
	defrouter not specified
net:
	address: 129.153.20.232
	physical: vnet1
	defrouter not specified

View from within the zone before resource management is applied

At this point, I'll boot up the zone u4z1 and demonstrate that it has access to all the CPUs defined for this logical domain and, incidentally, to the network devices.

global # zlogin -C u4z1
[Connected to zone 'u4z1' console]

[NOTICE: Zone booting up]


SunOS Release 5.10 Version Generic_139555-08 64-bit
Copyright 1983-2009 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
Hostname: u4z1
Reading ZFS config: done.

u4z1 console login: root
Password: 
Last login: Thu Aug 27 16:47:17 on console
Oct  1 19:27:05 u4z1 login: ROOT LOGIN /dev/console
Sun Microsystems Inc.   SunOS 5.10      Generic January 2005
# ifconfig -a
lo0:1: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000 
vnet0:1: flags=201000843 mtu 1500 index 2
        inet 192.168.2.222 netmask ffffff00 broadcast 192.168.2.255
vnet1:1: flags=201000843 mtu 1500 index 3
        inet 129.153.20.232 netmask ffffff00 broadcast 129.153.20.255
# psrinfo
0       on-line   since 10/01/2009 19:23:09
1       on-line   since 10/01/2009 19:23:11
2       on-line   since 10/01/2009 19:23:11
3       on-line   since 10/01/2009 19:23:11
4       on-line   since 10/01/2009 19:23:11
5       on-line   since 10/01/2009 19:23:11
6       on-line   since 10/01/2009 19:23:11
7       on-line   since 10/01/2009 19:23:11

If you're keeping score: the physical machine has 24 CPUs (6 cores of 4 virtual CPUs each), this domain has 8 of those CPUs, and the zone within it can see all 8.

Apply dedicated CPUs to the zone

Now, I go back to the global zone (still in the logical domain, remember) and add a dedicated-cpu stanza to the definition of the u4z1 zone. This sets up the zone so it has between 1 and 4 CPUs for its exclusive use.

global # zonecfg -z u4z1
zonecfg:u4z1> add dedicated-cpu
zonecfg:u4z1:dedicated-cpu> set ncpus=1-4
zonecfg:u4z1:dedicated-cpu> set importance=2
zonecfg:u4z1:dedicated-cpu> end
zonecfg:u4z1> verify
zonecfg:u4z1> commit
zonecfg:u4z1> exit
global # zonecfg -z u4z1 info
zonename: u4z1
zonepath: /zones/u4z1
brand: native
autoboot: false
bootargs: 
pool: 
limitpriv: 
scheduling-class: 
ip-type: shared
inherit-pkg-dir:
	dir: /lib
inherit-pkg-dir:
	dir: /platform
inherit-pkg-dir:
	dir: /sbin
inherit-pkg-dir:
	dir: /usr
net:
	address: 192.168.2.222
	physical: vnet0
	defrouter not specified
net:
	address: 129.153.20.232
	physical: vnet1
	defrouter not specified
dedicated-cpu:
	ncpus: 1-4
	importance: 2

Okay, I've changed the definition; let's recycle the zone. Oh, I forgot to enable the service that automatically shifts the number of CPUs owned by the zone between its lower and upper bounds. This is a really helpful feature: when the zone is CPU-busy, Solaris provides it CPUs up to the specified maximum number. When the zone is idle, it removes CPUs until it reaches the lower limit, which makes the CPUs available to other zones. Without the svc:/system/pools/dynamic service turned on, the zone simply gets the upper bound of dedicated CPUs. I can turn on the dynamic pool service some other time, as it's not needed for this demo.

global # zoneadm -z u4z1 halt
global # zoneadm -z u4z1 boot
zoneadm: zone 'u4z1': WARNING: A range of dedicated-cpus has been specified
zoneadm: zone 'u4z1': but the dynamic pool service is not enabled.
zoneadm: zone 'u4z1': The system will not dynamically adjust the
zoneadm: zone 'u4z1': processor allocation within the specified range
zoneadm: zone 'u4z1': until svc:/system/pools/dynamic is enabled.
zoneadm: zone 'u4z1': See poold(1M).
global # svcs -xv svc:/system/pools/dynamic
svc:/system/pools/dynamic:default (dynamic resource pools)
 State: disabled since Thu Oct 01 19:23:30 2009
Reason: Disabled by an administrator.
   See: http://sun.com/msg/SMF-8000-05
   See: man -M /usr/share/man -s 1M poold
Impact: This service is not running. 
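
For reference, enabling the dynamic pool service is a one-liner; I'm not running it here since this demo doesn't need it:

global # svcadm enable svc:/system/pools/dynamic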

The pool under the covers

Under the covers, Solaris is building a "resource pool" that exists only while the zone is running. You can do the same thing with the pooladm and poolcfg commands, but the dedicated-cpu syntax does it for you with much less effort on your part (a sketch of the manual equivalent is below). This usability enhancement was delivered to Solaris 10 some two years ago!
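
For comparison, here's a rough sketch of the manual equivalent using the pools commands; the pool and pset names are arbitrary, and zonecfg's pool property then binds the zone to the persistent pool instead of a temporary one:

global # pooladm -e
global # pooladm -s
global # poolcfg -c 'create pset u4z1_pset (uint pset.min = 1; uint pset.max = 4)'
global # poolcfg -c 'create pool u4z1_pool'
global # poolcfg -c 'associate pool u4z1_pool (pset u4z1_pset)'
global # pooladm -c
global # zonecfg -z u4z1 set pool=u4z1_pool

That's seven commands (enable the pools facility, save a configuration file, edit it, commit it, and point the zone at the result) versus one dedicated-cpu stanza.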

Here's a view from the global zone of the resource pool environment created for you. There's a resource pool named by appending the name of the zone to the string SUNWtmp_, bound to a like-named processor set ("pset") with between 1 and 4 CPUs. Four of the eight CPUs owned by the domain are associated with this processor set, and the remaining four are in the default resource pool and processor set.

global # poolcfg -c 'info' -d           

system default
	string	system.comment 
	int	system.version 1
	boolean	system.bind-default true
	string	system.poold.objectives wt-load

	pool SUNWtmp_u4z1
		int	pool.sys_id 1
		boolean	pool.active true
		boolean	pool.default false
		int	pool.importance 2
		string	pool.comment 
		boolean	pool.temporary true
		pset	SUNWtmp_u4z1

	pool pool_default
		int	pool.sys_id 0
		boolean	pool.active true
		boolean	pool.default true
		int	pool.importance 1
		string	pool.comment 
		pset	pset_default

	pset SUNWtmp_u4z1
		int	pset.sys_id 1
		boolean	pset.default false
		uint	pset.min 1
		uint	pset.max 4
		string	pset.units population
		uint	pset.load 361
		uint	pset.size 4
		string	pset.comment 
		boolean	pset.temporary true

		cpu
			int	cpu.sys_id 1
			string	cpu.comment 
			string	cpu.status on-line

		cpu
			int	cpu.sys_id 0
			string	cpu.comment 
			string	cpu.status on-line

		cpu
			int	cpu.sys_id 3
			string	cpu.comment 
			string	cpu.status on-line

		cpu
			int	cpu.sys_id 2
			string	cpu.comment 
			string	cpu.status on-line

	pset pset_default
		int	pset.sys_id -1
		boolean	pset.default true
		uint	pset.min 1
		uint	pset.max 65536
		string	pset.units population
		uint	pset.load 2
		uint	pset.size 4
		string	pset.comment 

		cpu
			int	cpu.sys_id 5
			string	cpu.comment 
			string	cpu.status on-line

		cpu
			int	cpu.sys_id 4
			string	cpu.comment 
			string	cpu.status on-line

		cpu
			int	cpu.sys_id 7
			string	cpu.comment 
			string	cpu.status on-line

		cpu
			int	cpu.sys_id 6
			string	cpu.comment 
			string	cpu.status on-line

Now when I boot the zone it has access to only 4 of the 8 CPUs defined for this logical domain. You can use this to control the resources allocated to a zone, or to cap the number of CPUs it has for software products that are licensed on a per-CPU basis.

[NOTICE: Zone halted]
[NOTICE: Zone booting up]

SunOS Release 5.10 Version Generic_139555-08 64-bit
Copyright 1983-2009 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
Hostname: u4z1
Reading ZFS config: done.

u4z1 console login: root
Password: 
Oct  1 19:34:20 u4z1 login: ROOT LOGIN /dev/console
Last login: Thu Oct  1 19:27:04 on console
Sun Microsystems Inc.   SunOS 5.10      Generic January 2005
# psrinfo
0       on-line   since 10/01/2009 19:23:09
1       on-line   since 10/01/2009 19:23:11
2       on-line   since 10/01/2009 19:23:11
3       on-line   since 10/01/2009 19:23:11

Unlike many things with computers, CPU allocation doesn't have to be on a power-of-two basis:

global # zonecfg -z u4z1
zonecfg:u4z1> remove dedicated-cpu
zonecfg:u4z1> add dedicated-cpu
zonecfg:u4z1:dedicated-cpu> set ncpus=2-3
zonecfg:u4z1:dedicated-cpu> end
zonecfg:u4z1> verify
zonecfg:u4z1> commit
zonecfg:u4z1> exit

I restart the zone, and it again has only its dedicated CPUs: three this time, the upper bound of the new range, since the dynamic pool service is still disabled.

[NOTICE: Zone halted]
[NOTICE: Zone booting up]

SunOS Release 5.10 Version Generic_139555-08 64-bit
Copyright 1983-2009 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
Hostname: u4z1
Reading ZFS config: done.

u4z1 console login: root
Password: 
Last login: Thu Oct  1 19:34:20 on console
Sun Microsystems Inc.   SunOS 5.10      Generic January 2005
# psrinfo
0       on-line   since 10/01/2009 19:23:09
1       on-line   since 10/01/2009 19:23:11
2       on-line   since 10/01/2009 19:23:11

Can you combine dedicated CPUs and the Fair Share Scheduler?

Let's see what happens if I try to use the Fair Share Scheduler (FSS) to assign CPU resources to this zone.

global # zonecfg -z u4z1
zonecfg:u4z1> set cpu-shares=5
zonecfg:u4z1> verify
rctl zone.cpu-shares and dedicated-cpu are incompatible.
u4z1: Incompatible settings
zonecfg:u4z1> remove dedicated-cpu
zonecfg:u4z1> set cpu-shares=5
zonecfg:u4z1> verify
zonecfg:u4z1> commit
zonecfg:u4z1> exit

It's not permitted: either you dedicate CPUs to a zone or you assign CPUs based on relative shares.

However, within a zone, the zone's root can use FSS to suballocate the zone's CPU resources to workloads using Solaris projects. That's useful when a single zone hosts multiple applications; a sketch follows below.
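
A rough sketch of what that could look like from inside u4z1; the project names (webapp, batch), share counts, and start script are made up for illustration, and this assumes the zone's workloads are running under the FSS scheduling class:

# projadd -K 'project.cpu-shares=(privileged,30,none)' webapp
# projadd -K 'project.cpu-shares=(privileged,10,none)' batch
# newtask -p webapp ./start-webapp
# prctl -n project.cpu-shares -i project webapp

Processes launched under the webapp project then get roughly three times the CPU of the batch project when the zone's CPUs are contended.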

CPU visibility in unmanaged zones

In this example, I booted up a second zone, u4z2 (cloned from u4z1), which did not have CPUs dedicated to it. When u4z1 had 3 dedicated CPUs, u4z2 had visibility to the remaining 5, as you would expect.

u7z2 console login: root
Password: 
Oct  1 20:10:58 u7z2 login: ROOT LOGIN /dev/console
Last login: Thu Oct  1 19:43:39 on console
Sun Microsystems Inc.   SunOS 5.10      Generic January 2005
# psrinfo
3       on-line   since 10/01/2009 19:23:11
4       on-line   since 10/01/2009 19:23:11
5       on-line   since 10/01/2009 19:23:11
6       on-line   since 10/01/2009 19:23:11
7       on-line   since 10/01/2009 19:23:11

I removed the dedicated-cpu resource from zone u4z1 and rebooted it, and zone u4z2 immediately saw the full set of 8 CPUs:

# psrinfo
0       on-line   since 10/01/2009 19:23:09
1       on-line   since 10/01/2009 19:23:11
2       on-line   since 10/01/2009 19:23:11
3       on-line   since 10/01/2009 19:23:11
4       on-line   since 10/01/2009 19:23:11
5       on-line   since 10/01/2009 19:23:11
6       on-line   since 10/01/2009 19:23:11
7       on-line   since 10/01/2009 19:23:11

What has happened is that the SUNWtmp_u4z1 resource pool has been removed, and all of its CPUs have been returned to the default pool, so they are available to all the zones bound to it.
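
One easy way to watch this from the global zone is poolstat, which reports the size and load of each pool's processor set (output not shown here):

global # poolstat -r pset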

Summary

In this exercise we used dedicated CPUs to allocate CPU resources to zones. This can be used to provide predictable service to an application in a zone by giving it exclusive access to CPUs. It may also be easier to explain to IT clients than other resource management methods, since users can easily see that some number of CPUs "belong" to them, and their performance isn't dependent on the resource requirements of other applications running on the same server. Dedicated CPUs can also save considerable amounts of money on software licenses, for products that are licensed by the number of CPUs they run on. Dedicated CPUs are especially attractive on Sun's Chip Multithreading (CMT) servers, since they provide many CPUs at a low price point with low space, power, and cooling requirements.

The alternative to dedicating CPUs is to use the Fair Share Scheduler, which provides CPU power to a zone in proportion to the number of shares the zone has divided by the sum of shares given to all zones. Everything else being equal, if one zone has 10 shares and another zone has 20 shares, then the zone with 20 shares will get about twice the CPU power of the zone with 10. Shares only take effect when there is no excess CPU capacity and both zones are able to consume all the CPU cycles made available to them.
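
For contrast, a minimal sketch of how that share-based setup could be configured; the zone names and share counts are placeholders:

global # dispadmin -d FSS
global # zonecfg -z zoneA set cpu-shares=10
global # zonecfg -z zoneB set cpu-shares=20

The dispadmin command makes FSS the system's default scheduling class (taking full effect at the next boot), and the two zones then split contended CPU roughly 1:2.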

The choice between FSS and dedicated CPUs is based on both technology and policy: dedicated CPUs can be deterministic, easily explained, and save license fees for third-party software products, but they can waste CPU power if a zone doesn't use the CPUs assigned to it. FSS is more flexible and provides more granular CPU resource allocation, but it doesn't provide guaranteed access. Solaris supports both styles of CPU resource management in order to handle different customers' priorities and business requirements.
