Monday Sep 21, 2015

Configuring Secure NFS in Solaris 11

This entry goes through the steps to build a Secure NFS server and client configuration. This includes the necessary DNS server configuration, creating a single Kerberos Key Distribution Center, and configuring NFS server and client to force access using Secure NFS.

Secure NFS: Step O, as in Optional--NTP and DNS

Optional Network Time Protocol and Domain Name System Setup for Kerberos

Kerberos requires the system time to be in sync across all systems utilizing the service. Solaris Kerberos also requires direct access to DNS, as it does not use the local name service switch for host name resolution. Thus I start with the steps to set up NTP and DNS, should you need either or both.

NTP

Since my setup is using Solaris Zones on a single system, they share the Global Zone's clock, and thus all the Zones' times are in sync. When using Kerberos across multiple systems, it is suggested to keep clock skew at a minimum. You may be doing this already for other reasons. If not, here is a simple Network Time Protocol configuration. Your routers may be valid NTP servers.

I add several server references in /etc/inet/ntp.conf, which I base off of the provided /etc/inet/ntp.client file.

global# diff /etc/inet/ntp.conf /etc/inet/ntp.client
49,53d48
< server 0.us.pool.ntp.org iburst
< server 1.us.pool.ntp.org iburst
< server 2.us.pool.ntp.org iburst
< server 3.us.pool.ntp.org iburst
global#

Replace the "x.us.pool.ntp.org" with your NTP servers' IP addresses or hostnames.

DNS

DNS infrastructure is required for Kerberos. Solaris' Kerberos is compiled to use DNS to do hostname lookups. See Kerberos, DNS, and the Naming Service.

If you have DNS servers you can update or even just reference for the nodes you need, please use them. If you don't have that or don't want to use them, here are the steps to set up your own DNS service. This will include a single DNS server. More available DNS is out of the scope of this entry.

Create the DNS server Solaris Zone

My Zone configuration file is as follows.

global# cat dns.cfg
create -b
set brand=solaris
set zonepath=/zones/dns
set autoboot=false
set autoshutdown=shutdown
set ip-type=exclusive
add anet
set linkname=net0
set lower-link=net1
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
set vlan-id=17
end
add anet
set linkname=net1
set lower-link=net0
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
end
add admin
set user=steffen
set auths=login,manage,config
end
global#

The Zone has two network interfaces. The first (linkname=net0) is on VLAN ID 17 and is for this Secure NFS setup. The second network interface (linkname=net1) ties into my local network, and also my local DNS server (my broadband router at home, or my office network's DNS server--that I can't get modified for my hostnames.)

I also set the Zone up so that I can administer it without becoming root, though all the examples here are as root.

I configure the zone using the dns.cfg configuration file as input.

global# zonecfg -z dns -f dns.cfg
UX: /usr/sbin/usermod: steffen is currently logged in, some changes may not take effect until next login.
global#

Then to speed things up I clone the Zone from a "master" zone I created in advance. On my system a clone takes less than 20 seconds, while an install, with a local IPS repository, takes about 90 seconds. Your times will vary based on your system, type of storage, and the network connection to the IPS repository you use.

global# zoneadm -z dns clone -c dns_profile.xml kdcmaster
The following ZFS file system(s) have been created:
    pool1/zones/dns
Progress being logged to /var/log/zones/zoneadm.20150901T012022Z.dns.clone
Log saved in non-global zone as /zones/dns/root/var/log/zones/zoneadm.20150901T012022Z.dns.clone
global#
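
If you do not already have a "master" Zone to clone from, creating one is straightforward. A minimal sketch, using an illustrative kdcmaster.cfg in the same style as dns.cfg above (the source Zone only needs to be in the installed state to be cloned):

global# zonecfg -z kdcmaster -f kdcmaster.cfg
global# zoneadm -z kdcmaster install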

Let's boot the Zone.

global# zoneadm -z dns boot
global#

Once the Zone is up and running, I like to create a new boot environment, so that if I have to revert the changes I make, I can just reboot into the original boot environment. While creating a new Zone is fast, this saves some work, and it is also convenient later on for testing additional changes.

global# zlogin dns
[Connected to zone 'dns' pts/8]
Oracle Corporation	SunOS 5.11	11.2	July 2015
root@dns:~#

root@dns:~# beadm create dns
root@dns:~# beadm activate dns
root@dns:~# reboot

[Connection to zone 'dns' pts/8 closed]
global#

Install the DNS server in the Solaris Zone

The DNS server package service/network/dns/bind is not installed by default, so we have to install it. We can verify it is not there by testing for the service.

global# zlogin dns
[Connected to zone 'dns' pts/8]
Oracle Corporation	SunOS 5.11	11.2	July 2015
root@dns:~#

root@dns:~# svcs *dns*
STATE          STIME    FMRI
disabled       21:26:25 svc:/network/dns/multicast:default
online         21:26:29 svc:/network/dns/client:default
root@dns:~#
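
You can also ask the package system directly; the I column of the pkg list output shows an "i" once the package is installed.

root@dns:~# pkg list -a service/network/dns/bind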

root@dns:~# pkg install pkg:/service/network/dns/bind
           Packages to install:  1
            Services to change:  1
       Create boot environment: No
Create backup boot environment: No
DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
Completed                                1/1         38/38      1.4/1.4  9.2M/s

PHASE                                          ITEMS
Installing new actions                         74/74
Updating package state database                 Done
Updating package cache                           0/0
Updating image state                            Done
Creating fast lookup database                   Done
Updating package cache                           2/2
root@dns:~#

root@dns:~# svcs *dns*
STATE          STIME    FMRI
disabled       21:26:25 svc:/network/dns/multicast:default
disabled       21:27:17 svc:/network/dns/server:default
online         21:26:29 svc:/network/dns/client:default
root@dns:~#

Configure the DNS server

With the DNS server package installed, it is time to create a basic DNS server configuration. I am using network 172.17.0.0/22 for some historical reasons. You can adjust to meet your own preferences or local requirements.

Some preliminary work is needed for my configuration. My Zone configuration, if you remember, has two networks. The sysconfig profile configured net0 for my private network. I still need to configure net1 on my standard network, and I will use DHCP to get an address.

root@dns:~# dladm show-link
LINK                CLASS     MTU    STATE    OVER
net0                vnic      1500   up       ?
net1                vnic      1500   up       ?
root@dns:~#
root@dns:~# ipadm show-addr
ADDROBJ           TYPE     STATE        ADDR
lo0/v4            static   ok           127.0.0.1/8
net0/v4           static   ok           172.17.0.250/22
lo0/v6            static   ok           ::1/128
net0/v6           addrconf ok           fe80::8:20ff:fe90:a16e/10
root@dns:~#
root@dns:~# ipadm create-ip net1
root@dns:~#
root@dns:~# ipadm create-addr -T dhcp net1
net1/v4
root@dns:~#
root@dns:~# ipadm show-addr
ADDROBJ           TYPE     STATE        ADDR
lo0/v4            static   ok           127.0.0.1/8
net0/v4           static   ok           172.17.0.250/22
net1/v4           dhcp     ok           192.168.1.112/24
lo0/v6            static   ok           ::1/128
net0/v6           addrconf ok           fe80::8:20ff:fe90:a16e/10
root@dns:~#

It is time to create the master DNS file in /etc/named.conf. Some items of note include:

  • My two subnets, 172.17.0.0/22 and 192.168.1.0/24
  • I have ACLs to allow access from my two subnets
  • I set a forwarder to my local DNS server (my local router or my office network's DNS servers).
  • I listen on the two networks listed in the ipadm output above.
  • This is set up for additional slave DNS servers, though I will not be showing the setup of that here.

Here is my final /etc/named.conf file.

root@dns:~# cat /etc/named.conf
//
// sample BIND configuration file
// taken from http://www.madboa.com/geek/soho-bind/
//

// Added acl per DNS setup at
// https://www.digitalocean.com/community/tutorials/how-to-configure-bind-as-a-caching-or-forwarding-dns-server-on-ubuntu-14-04
//
acl goodclients {
  172.17.0.0/22;
  192.168.1.0/24;
  localhost;
};

options {
  // tell named where to find files mentioned below
  directory "/var/named";
  // on a multi-homed host, you might want to tell named
  // to listen for queries only on certain interfaces
  listen-on { 127.0.0.1; 172.17.0.250/22; 192.168.1.112/24; };
  allow-query { goodclients; };
  forwarders { 192.168.1.1; };
};

// The single dot (.) is the root of all DNS namespace, so
// this zone tells named where to start looking for any
// name on the Internet
zone "." IN {
  // a hint type means that we've got to look elsewhere
  // for authoritative information
  type hint;
  file "named.root";
};

// Where the localhost hostname is defined
zone "localhost" IN {
  // a master type means that this server needn't look
  // anywhere else for information; the localhost buck
  // stops here.
  type master;
  file "zone.localhost";
  // don't allow dynamic DNS clients to update info
  // about the localhost zone
  allow-update { none; };
};

// Where the 127.0.0.0 network is defined
zone "0.0.127.in-addr.arpa" IN {
  type master;
  file "revp.127.0.0";
  allow-update { none; };
};

zone "steffentw.com" IN {
  // this is the authoritative server for
  // steffentw.com info
  type master;
  file "zone.com.steffentw";
  also-notify { 172.17.0.251; 172.17.0.252; };
};

zone "0.17.172.in-addr.arpa" {
  // this is the authoritative server for
  // the 172.17.0.0/22 network
  type master;
  file "revp.172.17.0.0";
  also-notify { 172.17.0.251; 172.17.0.252; };
};
root@dns:~#

Now I have to create or update the files pointed to by /etc/named.conf with my local hostnames.

root@dns:~# cd /var/named
root@dns:/var/named# ls
named.root        revp.172.17.0.0   zone.localhost
revp.127.0.0      zone.com.steffentw
root@dns:/var/named#
root@dns:/var/named# cat zone.com.steffentw
;
; dns zone for for steffentw.com
;
; 20150827	Hide _nfsv4idmapdomain to test domainname(1M) response
; 20150824	Removed CNAME for kdc to see if this is required
;
$ORIGIN steffentw.com.
$TTL 1M				; set to 1M for testing, was 1D
; any time you make a change to the domain, bump the
; "serial" setting below. the format is easy:
; YYYYMMDDI, with the I being an iterator in case you
; make more than one change during any one day
@	IN SOA   dns hostmaster (
			201508311 ; serial
			8H        ; refresh
			4M        ; retry
			1H        ; expire
			1D        ; minimum
			)
; dns.steffentw.com serves this domain as both the
; name server (NS) and mail exchange (MX)
		NS	dns
		MX	10 dns
; define domain functions with CNAMEs
depot           CNAME   dns
www             CNAME   dns
; for NFSv4 (2015.08.12)
;_nfsv4idmapdomain	IN TXT	"steffentw.com"
; just in case someone asks for localhost.steffentw.com
localhost	A	127.0.0.1
;
;	172.17.0.0/22 Infrastructure Administration Network
;
host1		A	172.17.0.101
host2		A	172.17.0.102
host3		A	172.17.0.103
host4		A	172.17.0.104
host5		A	172.17.0.105
host6		A	172.17.0.106
host7		A	172.17.0.107
host8		A	172.17.0.108
host9		A	172.17.0.109
zfs1		A	172.17.0.201
zfs2		A	172.17.0.202
zfs3		A	172.17.0.203
dns		A	172.17.0.250
kdc1		A	172.17.0.251
kdc2		A	172.17.0.252
kdc3		A	172.17.0.253
root@dns:/var/named#
root@dns:/var/named# cat revp.172.17.0.0
;
; reverse pointers for 172.17.0.0 subnet
;
$ORIGIN 0.17.172.in-addr.arpa.
$TTL 1D
@	IN SOA  dns.steffentw.com. hostmaster.steffentw.com. (
		201508311  ; serial
		28800      ; refresh (8 hours)
		14400      ; retry (4 hours)
		2419200    ; expire (4 weeks)
		86400      ; minimum (1 day)
		)
; define the authoritative name server
		NS	dns.steffentw.com.
;		NS	dns1.steffentw.com.
;		NS	dns2.steffentw.com.
;
;       172.17.0.0/22 Infrastructure Administration Network
;
101	PTR	host1.steffentw.com.
102	PTR	host2.steffentw.com.
103	PTR	host3.steffentw.com.
104	PTR	host4.steffentw.com.
105	PTR	host5.steffentw.com.
106	PTR	host6.steffentw.com.
107	PTR	host7.steffentw.com.
108	PTR	host8.steffentw.com.
109	PTR	host9.steffentw.com.
;
201	PTR	zfs1.steffentw.com.
202	PTR	zfs2.steffentw.com.
203	PTR	zfs3.steffentw.com.
;
250	PTR	dns.steffentw.com.
251	PTR	kdc1.steffentw.com.
252	PTR	kdc2.steffentw.com.
253	PTR	kdc3.steffentw.com.
root@dns:/var/named#

With those files created it is time to enable the DNS server. Keep an eye on the Zone's console in case there are errors.

root@dns:/var/named# svcs *dns*
STATE          STIME    FMRI
disabled       21:26:25 svc:/network/dns/multicast:default
disabled       21:27:17 svc:/network/dns/server:default
online         21:26:29 svc:/network/dns/client:default
root@dns:/var/named#
root@dns:/var/named# svcadm enable dns/server
root@dns:/var/named#
root@dns:/var/named# svcs *dns*
STATE          STIME    FMRI
disabled       21:26:25 svc:/network/dns/multicast:default
online         21:26:29 svc:/network/dns/client:default
online         21:44:31 svc:/network/dns/server:default
root@dns:/var/named#
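
If the service does not come online, the usual suspects are syntax errors in named.conf or the zone files. A few checks I find useful (the named-check utilities are delivered with the BIND server package):

root@dns:/var/named# named-checkconf /etc/named.conf
root@dns:/var/named# named-checkzone steffentw.com /var/named/zone.com.steffentw
root@dns:/var/named# svcs -xv dns/server
root@dns:/var/named# tail /var/svc/log/network-dns-server:default.log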

Test the DNS server

Let us see if DNS really works.

root@dns:~# getent hosts kdc1
172.17.0.251	kdc1.steffentw.com
root@dns:~# getent hosts host1
172.17.0.101	host1.steffentw.com
root@dns:~#

A quick test to see if this Zone can do a DNS lookup for an external name.

root@dns:~# nslookup www.oracle.com
Server:		172.17.0.250
Address:	172.17.0.250#53

Non-authoritative answer:
www.oracle.com	canonical name = www.oracle.com.edgekey.net.
www.oracle.com.edgekey.net	canonical name = e7075.x.akamaiedge.net.
Name:	e7075.x.akamaiedge.net
Address: 23.66.214.140

root@dns:~#
root@dns:~# getent hosts www.oracle.com
23.66.214.140	e7075.x.akamaiedge.net www.oracle.com www.oracle.com.edgekey.net
root@dns:~#

Summary and Next Step

With NTP and DNS working, the next step is to build the Key Distribution Server. Either go to KDC setup or back to the introduction.

Secure NFS: Step 1--Setting Up the Kerberos KDC

Kerberos KDC

With DNS set up, the next service to configure is the Key Distribution Center. It will need to access DNS services.

Creating the KDC Zone

The Zone configuration is similar to the DNS server, with the interface using VLAN ID 17 in my setup.

global# cat kdc1.cfg
create -b
set brand=solaris
set zonepath=/zones/kdc1
set autoboot=false
set autoshutdown=shutdown
set ip-type=exclusive
add anet
set linkname=net0
set lower-link=net1
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
set vlan-id=17
end
add admin
set user=steffen
set auths=login,manage,config
end
global#

Since the KDC must use DNS, let's put that into the sysconfig profile.

global# more kdc1_profile.xml
...
  <service version="1" type="service" name="network/install">
    <instance enabled="true" name="default">
      <property_group type="application" name="install_ipv6_interface">
        <propval type="astring" name="stateful" value="yes"/>
        <propval type="astring" name="address_type" value="addrconf"/>
        <propval type="astring" name="name" value="net0/v6"/>
        <propval type="astring" name="stateless" value="yes"/>
      </property_group>
      <property_group type="application" name="install_ipv4_interface">
        <propval type="net_address_v4" name="static_address" value="172.17.0.251 /24"/>
        <propval type="astring" name="name" value="net0/v4"/>
        <propval type="astring" name="address_type" value="static"/>
      </property_group>
    </instance>
  </service>
  <service version="1" type="service" name="network/physical">
    <instance enabled="true" name="default">
      <property_group type="application" name="netcfg">
        <propval type="astring" name="active_ncp" value="DefaultFixed"/>
      </property_group>
    </instance>
  </service>
  <service version="1" type="service" name="system/name-service/switch">
    <property_group type="application" name="config">
      <propval type="astring" name="default" value="files"/>
      <propval type="astring" name="host" value="files dns"/>
    </property_group>
    <instance enabled="true" name="default"/>
  </service>
  <service version="1" type="service" name="network/dns/client">
    <property_group type="application" name="config">
      <property type="net_address" name="nameserver">
        <net_address_list>
          <value_node value="172.17.0.250"/>
        </net_address_list>
      </property>
      <property type="astring" name="search">
        <astring_list>
          <value_node value="steffentw.com"/>
        </astring_list>
      </property>
    </property_group>
    <instance enabled="true" name="default"/>
  </service>
  ...
global#
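
If you do not have a suitable profile on hand, one way to produce it is to run the interactive configuration tool and save its answers to a file, then edit the result as needed (a sketch; answer the prompts with the hostname, static address, and DNS values shown above):

global# sysconfig create-profile -o kdc1_profile.xml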

Configure and clone the KDC Zone.

global# zonecfg -z kdc1 -f kdc1.cfg
UX: /usr/sbin/usermod: steffen is currently logged in, some changes may not take effect until next login.
global#
global#
global# zoneadm -z kdc1 clone -c kdc1_profile.xml kdcmaster
The following ZFS file system(s) have been created:
    pool1/zones/kdc1
Progress being logged to /var/log/zones/zoneadm.20150901T204046Z.kdc1.clone
Log saved in non-global zone as /zones/kdc1/root/var/log/zones/zoneadm.20150901T204046Z.kdc1.clone
global#
global# zoneadm -z kdc1 boot
global#

After logging into the KDC Zone, first verify that DNS is configured properly.

global#
global# zlogin kdc1
[Connected to zone 'kdc1' pts/8]
Oracle Corporation	SunOS 5.11	11.2	July 2015
root@kdc1:~#
root@kdc1:~# getent hosts host1
172.17.0.101	host1.steffentw.com
root@kdc1:~#

Installing the Kerberos Server Software

The necessary KDC package is not installed by default.

root@kdc1:~# svcs *krb5* ; svcs *kerb*
STATE          STIME    FMRI
STATE          STIME    FMRI
disabled       16:41:20 svc:/system/kerberos/install:default
root@kdc1:~#

Again I prefer to create an alternate boot environment. This time I will do it as part of the package installation.

root@kdc1:~# pkg install --be-name kdc system/security/kerberos-5
           Packages to install:   1
       Create boot environment: Yes
Create backup boot environment:  No
DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
Completed                                1/1         41/41      0.7/0.7 27.9M/s

PHASE                                          ITEMS
Installing new actions                         90/90
Updating package state database                 Done
Updating package cache                           0/0
Updating image state                            Done
Creating fast lookup database                   Done
Updating package cache                           2/2

A clone of solaris-0 exists and has been updated and activated.
On the next boot the Boot Environment kdc will be
mounted on '/'.  Reboot when ready to switch to this updated BE.

Updating package cache                           2/2
root@kdc1:~#

A quick check on the BE, and then boot into it.

root@kdc1:~# beadm list
BE        Flags Mountpoint Space  Policy Created         
--        ----- ---------- -----  ------ -------         
kdc       R     -          95.45M static 2015-09-01 16:47
solaris-0 N     /          6.29M  static 2015-09-01 16:40
root@kdc1:~#
root@kdc1:~# reboot

[Connection to zone 'kdc1' pts/8 closed]
global#

First let's confirm the necessary services are there.

global# zlogin kdc1
[Connected to zone 'kdc1' pts/8]
Oracle Corporation	SunOS 5.11	11.2	July 2015
root@kdc1:~#
root@kdc1:~# svcs *krb5* ; svcs *kerb*
STATE          STIME    FMRI
disabled       16:48:22 svc:/network/security/krb5_prop:default
disabled       16:48:22 svc:/network/security/krb5kdc:default
STATE          STIME    FMRI
disabled       16:48:21 svc:/system/kerberos/install:default
root@kdc1:~#

Configuring the KDC

The first configuration step is to modify two files. I make copies as backups and to compare the new files to the originals below.

root@kdc1:~# cd /etc/krb5/
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# cp -p kdc.conf kdc.conf.orig
root@kdc1:/etc/krb5# cp -p krb5.conf krb5.conf.orig
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# vi kdc.conf
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# cat kdc.conf
#
#
# Copyright (c) 2008, Oracle and/or its affiliates. All rights reserved.
#

[kdcdefaults]
	kdc_ports = 88,750

[realms]
	___default_realm___ = {
		profile = /etc/krb5/krb5.conf
		database_name = /var/krb5/principal
		acl_file = /etc/krb5/kadm5.acl
		kadmind_port = 749
		max_life = 8h 0m 0s
		max_renewable_life = 7d 0h 0m 0s
		default_principal_flags = +preauth
 		master_key_type = des3-cbc-sha1-kd
 		supported_enctypes = des3-cbc-sha1-kd:normal
	}
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# diff kdc.conf*
18,19d17
<  		master_key_type = des3-cbc-sha1-kd
<  		supported_enctypes = des3-cbc-sha1-kd:normal
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# vi krb5.conf
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# head -20 krb5.conf
#
#
# Copyright (c) 2007, Oracle and/or its affiliates. All rights reserved.
#

# krb5.conf template
# In order to complete this configuration file
# you will need to replace the ____ placeholders
# with appropriate values for your network and uncomment the
# appropriate entries.
#
[libdefaults]
#        default_realm = ___default_realm___
 	default_tgs_enctypes = des3-cbc-sha1-kd
 	default_tkt_enctypes = des3-cbc-sha1-kd
 	permitted_enctypes = des3-cbc-sha1-kd
 	allow_weak_enctypes = false


[realms]
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# diff krb5.conf*
14,17d13
<  	default_tgs_enctypes = des3-cbc-sha1-kd
<  	default_tkt_enctypes = des3-cbc-sha1-kd
<  	permitted_enctypes = des3-cbc-sha1-kd
<  	allow_weak_enctypes = false
19d14
<
root@kdc1:/etc/krb5#

Since my sample domain name is steffentw.com, my Kerberos realm is STEFFENTW.COM. Here I create the master KDC. It will prompt for two sets of passwords; make sure you remember them. The admin password will be required on all the clients.

root@kdc1:/etc/krb5# kdcmgr -a kws/admin -r STEFFENTW.COM create master

Starting server setup
---------------------------------------------------

Setting up /etc/krb5/kdc.conf.

Setting up /etc/krb5/krb5.conf.

Initializing database '/var/krb5/principal' for realm 'STEFFENTW.COM',
master key name 'K/M@STEFFENTW.COM'
You will be prompted for the database Master Password.
It is important that you NOT FORGET this password.
Enter KDC database master key: enter master password here
Re-enter KDC database master key to verify: enter master password here

Authenticating as principal root/admin@STEFFENTW.COM with password.
WARNING: no policy specified for kws/admin@STEFFENTW.COM; defaulting to no policy
Enter password for principal "kws/admin@STEFFENTW.COM": enter admin password here
Re-enter password for principal "kws/admin@STEFFENTW.COM": enter admin password here
Principal "kws/admin@STEFFENTW.COM" created.

Setting up /etc/krb5/kadm5.acl.

---------------------------------------------------
Setup COMPLETE.

root@kdc1:/etc/krb5#

Once the configuration is complete, I quickly check to make sure it looks OK. I especially look for kadmin:default to be online.

root@kdc1:/etc/krb5# kdcmgr status

KDC Status Information
--------------------------------------------
svc:/network/security/krb5kdc:default (Kerberos key distribution center)
 State: online since September  1, 2015 04:51:06 PM EDT
   See: man -M /usr/share/man -s 1M krb5kdc
   See: /var/svc/log/network-security-krb5kdc:default.log
Impact: None.

KDC Master Status Information
--------------------------------------------
svc:/network/security/kadmin:default (Kerberos administration daemon)
 State: online since September  1, 2015 04:51:07 PM EDT
   See: man -M /usr/share/man -s 1M kadmind
   See: /var/svc/log/network-security-kadmin:default.log
Impact: None.

Transaction Log Information
--------------------------------------------

Kerberos update log (/var/krb5/principal.ulog)
Update log dump :
	Log version # : 1
	Log state : Stable
	Entry block size : 2048
	Number of entries : 3
	First serial # : 1
	Last serial # : 3
	First time stamp : Tue Sep  1 16:51:06 2015
	Last time stamp : Tue Sep  1 16:51:06 2015


Kerberos Related File Information
--------------------------------------------
(will display any missing files below)

root@kdc1:/etc/krb5#
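
Optionally, you can list the principals in the new database with kadmin.local on the KDC itself; at this point you should see at least the K/M, kadmin, krbtgt, and kws/admin principals for STEFFENTW.COM.

root@kdc1:/etc/krb5# kadmin.local -q "listprincs"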

Enabling Kerberos Client Configuration

With the KDC set up, the next step is to make it easier to configure the Kerberos clients. Two files are required, and by putting them into a location that is shared via NFS, setting up the clients will be very easy.

Step 1 is to create a mountpoint.

root@kdc1:/etc/krb5# zfs create -o mountpoint=/share -o share.nfs=on rpool/share
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# share
rpool_share	/share	nfs	sec=sys,rw	
root@kdc1:/etc/krb5#

Step 2 is to create the kcprofile file and copy krb5.conf into the shared directory.

root@kdc1:/etc/krb5# mkdir /share/krb5
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# vi /share/krb5/kcprofile
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# cat /share/krb5/kcprofile
REALM STEFFENTW.COM
KDC kdc1.steffentw.com
ADMIN kws
FILEPATH /net/kdc1.steffentw.com/share/krb5/krb5.conf
NFS 1
DNSLOOKUP none
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# cp /etc/krb5/krb5.conf /share/krb5/
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# cat /share/krb5/krb5.conf 
[libdefaults]
	default_realm = STEFFENTW.COM

[realms]
	STEFFENTW.COM = {
		kdc = kdc1.steffentw.com
		admin_server = kdc1.steffentw.com
	}

[domain_realm]
	.steffentw.com = STEFFENTW.COM

[logging]
	default = FILE:/var/krb5/kdc.log
	kdc = FILE:/var/krb5/kdc.log
	kdc_rotate = {
		period = 1d
		versions = 10
	}

[appdefaults]
	kinit = {
		renewable = true
		forwardable = true
	}
root@kdc1:/etc/krb5#

Summary and Next Step

With the KDC set up, the next step is to create the first client and configure secure NFS. Either go to NFS Server Setup or back to the introduction.

Secure NFS: Step 2--First Kerberos Client--NFS Server

Secure NFS Server

With our Kerberos KDC set up, it is time to build the NFS server. The first step is creating another Solaris Zone similar to the previous ones.

Creating a NFS Server Zone

global# cat zfs1.cfg
create -b
set brand=solaris
set zonepath=/zones/zfs1
set autoboot=false
set autoshutdown=shutdown
set ip-type=exclusive
add anet
set linkname=net0
set lower-link=net2
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
set vlan-id=17
end
add admin
set user=steffen
set auths=login,manage,config
end
global#
global# zonecfg -z zfs1 -f zfs1.cfg
UX: /usr/sbin/usermod: steffen is currently logged in, some changes may not take effect until next login.
global#
global# zoneadm -z zfs1 clone -c zfs1_profile.xml kdcmaster
The following ZFS file system(s) have been created:
    pool1/zones/zfs1
Progress being logged to /var/log/zones/zoneadm.20150901T210134Z.zfs1.clone
Log saved in non-global zone as /zones/zfs1/root/var/log/zones/zoneadm.20150901T210134Z.zfs1.clone
global#
global# zoneadm -z zfs1 boot
global#

Configuring the Zone as a Kerberos Client

We follow the same pattern as before: verify the Zone can reach the KDC, then run kclient(1M) with the shared kcprofile file.

global# zlogin zfs1
[Connected to zone 'zfs1' pts/10]
Oracle Corporation	SunOS 5.11	11.2	July 2015
root@zfs1:~#
root@zfs1:~# ping kdc1
kdc1 is alive
root@zfs1:~#
root@zfs1:~# cat /net/kdc1/share/krb5/kcprofile
REALM STEFFENTW.COM
KDC kdc1.steffentw.com
ADMIN kws
FILEPATH /net/kdc1.steffentw.com/share/krb5/krb5.conf
NFS 1
DNSLOOKUP none
root@zfs1:~#
root@zfs1:~# head -5 /net/kdc1.steffentw.com/share/krb5/krb5.conf
[libdefaults]
	default_realm = STEFFENTW.COM

[realms]
	STEFFENTW.COM = {
root@zfs1:~#
root@zfs1:~# kclient -p /net/kdc1/share/krb5/kcprofile

Starting client setup

---------------------------------------------------

Setting up /etc/krb5/krb5.conf.

Copied /net/kdc1.steffentw.com/share/krb5/krb5.conf to /system/volatile/kclient/kclient-krb5conf.MYaafI.
Obtaining TGT for kws/admin ...
Password for kws/admin@STEFFENTW.COM: enter admin password here
kinit:  no ktkt_warnd warning possible

nfs/zfs1.steffentw.com entry ADDED to KDC database.
nfs/zfs1.steffentw.com entry ADDED to keytab.

host/zfs1.steffentw.com entry ADDED to KDC database.
host/zfs1.steffentw.com entry ADDED to keytab.

---------------------------------------------------
Setup COMPLETE.

root@zfs1:~#
root@zfs1:~# klist -k
Keytab name: FILE:/etc/krb5/krb5.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   2 nfs/zfs1.steffentw.com@STEFFENTW.COM
   2 nfs/zfs1.steffentw.com@STEFFENTW.COM
   2 nfs/zfs1.steffentw.com@STEFFENTW.COM
   2 nfs/zfs1.steffentw.com@STEFFENTW.COM
   2 host/zfs1.steffentw.com@STEFFENTW.COM
   2 host/zfs1.steffentw.com@STEFFENTW.COM
   2 host/zfs1.steffentw.com@STEFFENTW.COM
   2 host/zfs1.steffentw.com@STEFFENTW.COM
root@zfs1:~#
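
As an optional sanity check, you can confirm the client can obtain a ticket from the KDC (the NFS service itself uses the keytab entries above, so this step is not required):

root@zfs1:~# kinit kws/admin
Password for kws/admin@STEFFENTW.COM: enter admin password here
root@zfs1:~# klist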

Configuring the NFS Server File System

With the NFS server a Kerberos client, now create a ZFS file system that is exported as an NFS share requiring Kerberos privacy settings (the "krb5p" setting.)

root@zfs1:~# zfs create -o mountpoint=/secure -o share.nfs=on -o share.nfs.sec=krb5p rpool/secure
root@zfs1:~# share
rpool_secure	/secure	nfs	sec=krb5p,rw	
root@zfs1:~#

Then create a file with some easily recognized content.

root@zfs1:~# echo "The quick brown fox jumps over the lazy dog." > /secure/fox.txt
root@zfs1:~#
root@zfs1:~# cat /secure/fox.txt
The quick brown fox jumps over the lazy dog.
root@zfs1:~#

Summary and Next Step

With the NFS server running, the next step is to create an NFS client. Either go to NFS Client Setup or back to the introduction.

Secure NFS: Step 3--The Secure NFS Client

Secure NFS Client

We are getting close to a fully completed configuration. The next item is the client.

Build the NFS Client Zone as a KDC Client

global# cat host1.cfg
create -b
set brand=solaris
set zonepath=/zones/host1
set autoboot=false
set autoshutdown=shutdown
set ip-type=exclusive
add anet
set linkname=net0
set lower-link=net2
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
set vlan-id=17
end
add admin
set user=steffen
set auths=login,manage,config
end
global#
global# zoneadm -z host1 clone -c host1_profile.xml kdcmaster
The following ZFS file system(s) have been created:
    pool1/zones/host1
Progress being logged to /var/log/zones/zoneadm.20150901T213207Z.host1.clone
Log saved in non-global zone as /zones/host1/root/var/log/zones/zoneadm.20150901T213207Z.host1.clone
global#
global# zlogin host1
[Connected to zone 'host1' pts/8]
Oracle Corporation	SunOS 5.11	11.2	July 2015
root@host1:~#
root@host1:~# ping kdc1
kdc1 is alive
root@host1:~#
root@host1:~# cat /net/kdc1/share/krb5/kcprofile
REALM STEFFENTW.COM
KDC kdc1.steffentw.com
ADMIN kws
FILEPATH /net/kdc1.steffentw.com/share/krb5/krb5.conf
NFS 1
DNSLOOKUP none
root@host1:~#
root@host1:~# kclient -p /net/kdc1/share/krb5/kcprofile

Starting client setup

---------------------------------------------------

Setting up /etc/krb5/krb5.conf.

Copied /net/kdc1.steffentw.com/share/krb5/krb5.conf to /system/volatile/kclient/kclient-krb5conf.ToaOPV.
Obtaining TGT for kws/admin ...
Password for kws/admin@STEFFENTW.COM: enter admin password here
kinit:  no ktkt_warnd warning possible

nfs/host1.steffentw.com entry ADDED to KDC database.
nfs/host1.steffentw.com entry ADDED to keytab.

host/host1.steffentw.com entry ADDED to KDC database.
host/host1.steffentw.com entry ADDED to keytab.

---------------------------------------------------
Setup COMPLETE.

root@host1:~#

Demonstrate the NFS Client Working

The simplest test is to just navigate to the /net/<server name> location.

root@host1:~# cat /net/zfs1/secure/fox.txt
The quick brown fox jumps over the lazy dog.
root@host1:~#

However, was this really an encrypted data transfer? One way to check is with snoop(1M).

root@host1:~# snoop -d net0 -r host zfs1 &
[1] 21547
root@host1:~# Using device net0 (promiscuous mode)

root@host1:~# cat /net/zfs1/secure/fox.txt
The quick brown fox jumps over the lazy dog.
root@host1:~# 172.17.0.101 -> 172.17.0.201 TCP D=2049 S=1023 Syn Seq=1000276621 Len=0 Win=32804 Options=<mss 1460,sackOK,tstamp 129311831 0,nop,wscale 5>
172.17.0.201 -> 172.17.0.101 TCP D=1023 S=2049 Syn Ack=1000276622 Seq=576217546 Len=0 Win=32806 Options=<sackOK,tstamp 129311831 129311831,mss 1460,nop,wscale 5>
172.17.0.101 -> 172.17.0.201 TCP D=2049 S=1023 Ack=576217547 Seq=1000276622 Len=0 Win=32806 Options=<nop,nop,tstamp 129311831 129311831>
...
172.17.0.101 -> 172.17.0.201 RPC RPCSEC_GSS C NFS ver(4) proc(1) (data encrypted)
172.17.0.201 -> 172.17.0.101 TCP D=1023 S=2049 Ack=1000276950 Seq=576217547 Len=0 Win=32796 Options=<nop,nop,tstamp 129311831 129311831>
172.17.0.201 -> 172.17.0.101 RPC RPCSEC_GSS R NFS ver(4) proc(1) (data encrypted)
172.17.0.101 -> 172.17.0.201 TCP D=2049 S=1023 Ack=576217959 Seq=1000276950 Len=0 Win=32806 Options=<nop,nop,tstamp 129311832 129311832>
...
172.17.0.101 -> 172.17.0.201 RPC RPCSEC_GSS C NFS ver(4) proc(1) (data encrypted)
172.17.0.201 -> 172.17.0.101 RPC RPCSEC_GSS R NFS ver(4) proc(1) (data encrypted)
...
root@host1:~# kill %1
root@host1:~#

To see the difference, lets create a second share that does not require Kerberos.

root@zfs1:~# zfs create -o mountpoint=/clear -o share.nfs=on rpool/clear
root@zfs1:~#
root@zfs1:~# share
rpool_secure	/secure	nfs	sec=krb5p,rw	
rpool_clear	/clear	nfs	sec=sys,rw	
root@zfs1:~#
root@zfs1:~# cp /secure/fox.txt /clear/
root@zfs1:~#

And run snoop with the option to dump all the data in each Ethernet frame. I like to use -x 0.

First, using the encrypted mountpoint.

root@host1:~# snoop -d net0 -r -x 0 host zfs1 &
[1] 21560
root@host1:~# Using device net0 (promiscuous mode)

root@host1:~# cat /net/zfs1/secure/fox.txt
The quick brown fox jumps over the lazy dog.
root@host1:~# 172.17.0.101 -> 172.17.0.201 TCP D=2049 S=48428 Syn Seq=788443968 Len=0 Win=64240 Options=<mss 1460,sackOK,tstamp 129469208 0,nop,wscale 1>

	   0: 0208 20e4 7813 0208 20ea 4c3d 0800 4500    .. .x... .L=..E.
	  16: 003c ea59 4000 4006 0000 ac11 0065 ac11    .<.Y@.@......e..
	  32: 00c9 bd2c 0801 2efe b340 0000 0000 a002    ...,.....@......
	  48: faf0 597f 0000 0204 05b4 0402 080a 07b7    ..Y.............
	  64: 8b18 0000 0000 0103 0301                   ..........

172.17.0.201 -> 172.17.0.101 TCP D=48428 S=2049 Syn Ack=788443969 Seq=2268877688 Len=0 Win=32806 Options=<sackOK,tstamp 129469208 129469208,mss 1460,nop,wscale 5>

	   0: 0208 20ea 4c3d 0208 20e4 7813 0800 4500    .. .L=.. .x...E.
	  16: 003c f568 4000 4006 ec02 ac11 00c9 ac11    .<.h@.@.........
	  32: 0065 0801 bd2c 873c 5378 2efe b341 a012    .e...,.<Sx...A..
	  48: 8026 c6b9 0000 0402 080a 07b7 8b18 07b7    .&..............
	  64: 8b18 0204 05b4 0103 0305                   ..........

172.17.0.101 -> 172.17.0.201 TCP D=2049 S=48428 Ack=2268877689 Seq=788443969 Len=0 Win=64436 Options=<nop,nop,tstamp 129469208 129469208>

	   0: 0208 20e4 7813 0208 20ea 4c3d 0800 4500    .. .x... .L=..E.
	  16: 0034 ea5a 4000 4006 0000 ac11 0065 ac11    .4.Z@.@......e..
	  32: 00c9 bd2c 0801 2efe b341 873c 5379 8010    ...,.....A.<Sy..
	  48: fbb4 5977 0000 0101 080a 07b7 8b18 07b7    ..Yw............
	  64: 8b18                                       ..

...

172.17.0.101 -> 172.17.0.201 RPC RPCSEC_GSS C NFS ver(4) proc(1) (data encrypted)

	   0: 0208 20e4 7813 0208 20ea 4c3d 0800 4500    .. .x... .L=..E.
	  16: 017c ea70 4000 4006 0000 ac11 0065 ac11    .|.p@.@......e..
	  32: 00c9 03ff 0801 4667 92c6 2d1f 25fc 8018    ......Fg..-.%...
	  48: 8026 5abf 0000 0101 080a 07b7 8b1b 07b7    .&Z.............
	  64: 8b1b 8000 0144 6e7d 0f68 0000 0000 0000    .....Dn}.h......
	  80: 0002 0001 86a3 0000 0004 0000 0001 0000    ................
	  96: 0006 0000 0018 0000 0001 0000 0000 0000    ................
	 112: 0002 0000 0003 0000 0004 1e00 0000 0000    ................
	 128: 0006 0000 001c 0404 04ff ffff ffff 0000    ................
	 144: 0000 15d8 2a96 8cb9 33d6 91df d5de 4ee1    ....*...3.....N.
	 160: d51a 0000 00e4 0504 06ff 0000 0000 0000    ................
	 176: 0000 15d8 2a97 61c4 fa98 3b63 14d0 c5cb    ....*.a...;c....
	 192: 59ee 8848 1638 12bc 486e d73a 8b1e d704    Y..H.8..Hn.:....
	 208: 74e2 65e6 e036 6847 32e8 d2c8 a100 655b    t.e..6hG2.....e[
	 224: df06 73df 78d2 af8a 7850 193c a0bc 2147    ..s.x...xP.<..!G
	 240: 6073 7dcf 3038 cfbb 95d4 5f35 489c 65eb    `s}.08...._5H.e.
	 256: 1e54 3572 60c8 9b1e 78c8 f47a ac25 e8be    .T5r`...x..z.%..
	 272: ddd5 c104 8067 cf6a ca03 1327 c14d e5dd    .....g.j...'.M..
	 288: 0f06 2dac bac9 d689 7536 e391 0e3f 14dd    ..-.....u6...?..
	 304: 2f7b 33d1 231e 3b7b 0de5 5ee2 c28f cb54    /{3.#.;{..^....T
	 320: a2e0 2456 1ffa ddf0 c37f 42bf 252b 1667    ..$V......B.%+.g
	 336: 02c2 1fe3 b19d 0d7b 94a2 4e50 748b 5935    .......{..NPt.Y5
	 352: 890b 746c deb2 5744 97a4 4c07 83e4 5377    ..tl..WD..L...Sw
	 368: 4ca4 75e4 8081 f196 6f01 63fd 4e56 bee9    L.u.....o.c.NV..
	 384: 5510 c21a 6b6a 2d63 c326                   U...kj-c.&

172.17.0.201 -> 172.17.0.101 RPC RPCSEC_GSS R NFS ver(4) proc(1) (data encrypted)

	   0: 0208 20ea 4c3d 0208 20e4 7813 0800 4500    .. .L=.. .x...E.
	  16: 01d0 f57e 4000 4006 ea58 ac11 00c9 ac11    ...~@.@..X......
	  32: 0065 0801 03ff 2d1f 25fc 4667 940e 8018    .e....-.%.Fg....
	  48: 8026 8344 0000 0101 080a 07b7 8b1b 07b7    .&.D............
	  64: 8b1b 8000 0198 6e7d 0f68 0000 0001 0000    ......n}.h......
	  80: 0000 0000 0006 0000 001c 0404 05ff ffff    ................
	  96: ffff 0000 0000 22a9 1433 c781 6e9e 8ed8    ......"..3..n...
	 112: e6cc aa86 e4d9 0000 0000 0000 0160 0504    .............`..
	 128: 07ff 0000 0000 0000 0000 22a9 1434 68c0    .........."..4h.
	 144: e008 d7e8 cca4 af88 da90 2b45 dc13 57b9    ..........+E..W.
	 160: 3a0a e3f8 5a98 fddb 5039 62bc 1858 ecd5    :...Z...P9b..X..
	 176: 0f5c fcd6 a150 7bf0 0782 d337 8cf6 8de1    .\...P{....7....
	 192: 5e81 481f b921 9054 d74a 0160 e9a4 0522    ^.H..!.T.J.`..."
	 208: 8d85 f55d 9576 f819 6515 c010 8d22 d0a4    ...].v..e...."..
	 224: e685 0b00 ebd9 cb9b 4079 dcd1 1195 5690    ........@y....V.
	 240: 9d07 846b a8e0 f022 c33d 7412 5065 3bc5    ...k...".=t.Pe;.
	 256: 0be5 7f98 9cb5 f5cb 8452 aa0a dfa7 cfb3    .........R......
	 272: e9eb a607 03a8 59c9 dc62 903c b289 dd13    ......Y..b.<....
	 288: b20f 612d 1603 c335 2705 61ce af13 b792    ..a-...5'.a.....
	 304: 442e 5a19 59fb d867 377e 34f3 b43d f8e3    D.Z.Y..g7~4..=..
	 320: ff0a 2937 d04c 1b22 0213 5227 57f1 ba26    ..)7.L."..R'W..&
	 336: 44e0 5e52 2f79 41d9 a494 cee6 bd76 f8e0    D.^R/yA......v..
	 352: ecd1 4b98 0e91 7b09 321e 97b1 26ef 3cdc    ..K...{.2...&.<.
	 368: 7211 7ae3 b71c 3bb0 c1b0 2e91 93e2 2b37    r.z...;.......+7
	 384: a1de 76ca f736 70c4 4987 b39f 71e9 736f    ..v..6p.I...q.so
	 400: fc6e 433e 5f2f f283 06b6 cf1b 96f8 b447    .nC>_/.........G
	 416: af39 1d95 6fe7 4173 e554 2d77 c9b8 df88    .9..o.As.T-w....
	 432: 48d2 843e 67cb 54a2 93c8 8bad b24c 1e40    H..>g.T......L.@
	 448: 64aa 7f75 5fec a0c6 4d58 de19 ec68 25d3    d..u_...MX...h%.
	 464: af93 6f26 e12f 180b f0c0 87b6 7df6         ..o&./......}.

...

172.17.0.101 -> 172.17.0.201 NFS R CB_NULL

	   0: 0208 20e4 7813 0208 20ea 4c3d 0800 4500    .. .x... .L=..E.
	  16: 0050 ea7c 4000 4006 0000 ac11 0065 ac11    .P.|@.@......e..
	  32: 00c9 b385 ed12 c833 5144 9614 5a3c 8018    .......3QD..Z<..
	  48: 8026 5993 0000 0101 080a 07b7 8b1d 07b7    .&Y.............
	  64: 8b1a 8000 0018 627d 0f68 0000 0001 0000    ......b}.h......
	  80: 0000 0000 0000 0000 0000 0000 0000         ..............

172.17.0.201 -> 172.17.0.101 TCP D=45957 S=60690 Ack=3358806368 Seq=2517916220 Len=0 Win=32806 Options=<nop,nop,tstamp 129469213 129469213>

	   0: 0208 20ea 4c3d 0208 20e4 7813 0800 4500    .. .L=.. .x...E.
	  16: 0034 f58a 4000 4006 ebe8 ac11 00c9 ac11    .4..@.@.........
	  32: 0065 ed12 b385 9614 5a3c c833 5160 8010    .e......Z<.3Q`..
	  48: 8026 cd1f 0000 0101 080a 07b7 8b1d 07b7    .&..............
	  64: 8b1d                                       ..

172.17.0.101 -> 172.17.0.201 TCP D=2049 S=1023 Ack=757019588 Seq=1181196406 Len=0 Win=32806 Options=<nop,nop,tstamp 129469216 129469211>

	   0: 0208 20e4 7813 0208 20ea 4c3d 0800 4500    .. .x... .L=..E.
	  16: 0034 ea7d 4000 4006 0000 ac11 0065 ac11    .4.}@.@......e..
	  32: 00c9 03ff 0801 4667 a076 2d1f 33c4 8010    ......Fg.v-.3...
	  48: 8026 5977 0000 0101 080a 07b7 8b20 07b7    .&Yw......... ..
	  64: 8b1b                                       ..


root@host1:~#

And now using the clear text mount point.

root@host1:~# snoop -d net0 -r -x 0 host zfs1 &
[1] 21593
root@host1:~# Using device net0 (promiscuous mode)

root@host1:~# cat /net/zfs1/clear/fox.txt
The quick brown fox jumps over the lazy dog.
...

172.17.0.201 -> 172.17.0.101 NFS R 4 (read        ) NFS4_OK PUTFH NFS4_OK READ NFS4_OK (45 bytes) EOF

	   0: 0208 20ea 4c3d 0208 20e4 7813 0800 4500    .. .L=.. .x...E.
	  16: 00b0 f594 4000 4006 eb62 ac11 00c9 ac11    ....@.@..b......
	  32: 0065 0801 03ff 2d1f 3ba8 4667 a8d2 8018    .e....-.;.Fg....
	  48: 8026 f4c5 0000 0101 080a 07b7 9377 07b7    .&...........w..
	  64: 9377 8000 0078 917d 0f68 0000 0001 0000    .w...x.}.h......
	  80: 0000 0000 0000 0000 0000 0000 0000 0000    ................
	  96: 0000 0000 000c 7265 6164 2020 2020 2020    ......read     
	 112: 2020 0000 0002 0000 0016 0000 0000 0000      ..............
	 128: 0019 0000 0000 0000 0001 0000 002d 5468    .............-Th
	 144: 6520 7175 6963 6b20 6272 6f77 6e20 666f    e quick brown fo
	 160: 7820 6a75 6d70 7320 6f76 6572 2074 6865    x jumps over the
	 176: 206c 617a 7920 646f 672e 0a00 0000          lazy dog.....

...

172.17.0.101 -> 172.17.0.201 TCP D=2049 S=1023 Ack=757021992 Seq=1181198770 Len=0 Win=32806 Options=<nop,nop,tstamp 129471358 129471351>

	   0: 0208 20e4 7813 0208 20ea 4c3d 0800 4500    .. .x... .L=..E.
	  16: 0034 ea89 4000 4006 0000 ac11 0065 ac11    .4..@.@......e..
	  32: 00c9 03ff 0801 4667 a9b2 2d1f 3d28 8010    ......Fg..-.=(..
	  48: 8026 5977 0000 0101 080a 07b7 937e 07b7    .&Yw.........~..
	  64: 9377                                       .w


root@host1:~#

In both cases, because I let the automounter time out and a new mount is initiated each time, there are so many packets that it is hard to tell which is doing what. However, when reading the file on /clear, the "quick brown fox" text is clearly visible. Your own tests and snoop output should make this difference very clear.

Additional NFS Client Configuration Options

By default, the mounts use NFS version 4 (NFSv4). You can also explicitly mount with version 3; the results will be the same.

root@host1:~# mount -o vers=3 zfs1:/secure /mnt
root@host1:~#

And as a reminder, you can cap the NFS version used on either the client or the server with the sharectl(1M) command.

root@host1:~# sharectl get -p client_versmax nfs
client_versmax=4
root@host1:~#
root@host1:~# sharectl set -p client_versmax=3 nfs
root@host1:~# sharectl get -p client_versmax nfs
client_versmax=3
root@host1:~#
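
For reference, the same command restores the default maximum version of 4 when you are done. I leave it at 3 for now, which is why the snoop output in the next section shows NFS version 3.

root@host1:~# sharectl set -p client_versmax=4 nfs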

Summary and Next Step

This completes the Secure NFS setup. One option is to co-locate the KDC and NFS server. Either go to Combining KDC and NFS Server or back to the introduction.

Secure NFS: Step 4--Combining the KDC and NFS Server

Combining the KDC and NFS Server

When I asked my customer about their availability requirements, they stated that they only need a few NFS clients with encrypted traffic. They would like to keep the setup simple, and therefore combine the KDC and NFS server. They are using Oracle Solaris Cluster for availability, and by putting both services in a single Solaris Zone, they can meet their availability requirements with Oracle Solaris Cluster managing the Solaris Zone startup and failover.

So I looked into whether this is a good idea, and I was informed that this is fully supported and tested. The way to do this is to make the KDC a Kerberos client of itself.

Making the KDC a Kerberos Client

root@kdc1:~# kclient -p /net/kdc1/share/krb5/kcprofile

Starting client setup

---------------------------------------------------

Setting up /etc/krb5/krb5.conf.

Copied /net/kdc1.steffentw.com/share/krb5/krb5.conf to /system/volatile/kclient/kclient-krb5conf.mmayyQ.
Obtaining TGT for kws/admin ...
Password for kws/admin@STEFFENTW.COM:
kinit:  no ktkt_warnd warning possible

nfs/kdc1.steffentw.com entry ADDED to KDC database.
nfs/kdc1.steffentw.com entry ADDED to keytab.

host/kdc1.steffentw.com entry already exists in KDC database.
host/kdc1.steffentw.com entry already present in keytab.
host/kdc1.steffentw.com entry ADDED to keytab.

---------------------------------------------------
Setup COMPLETE.

root@kdc1:~#
root@kdc1:~# klist -k
Keytab name: FILE:/etc/krb5/krb5.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   3 host/kdc1.steffentw.com@STEFFENTW.COM
   3 host/kdc1.steffentw.com@STEFFENTW.COM
   3 host/kdc1.steffentw.com@STEFFENTW.COM
   3 host/kdc1.steffentw.com@STEFFENTW.COM
   2 nfs/kdc1.steffentw.com@STEFFENTW.COM
   2 nfs/kdc1.steffentw.com@STEFFENTW.COM
   2 nfs/kdc1.steffentw.com@STEFFENTW.COM
   2 nfs/kdc1.steffentw.com@STEFFENTW.COM
root@kdc1:~#

Creating Secured NFS Share

Then create a new mount point and put some data into it.

root@kdc1:~# zfs create -o mountpoint=/secure -o share.nfs=on -o share.nfs.sec=krb5p rpool/secure
root@kdc1:~#
root@kdc1:~# share
rpool_share     /share  nfs     sec=sys,rw     
rpool_secure    /secure nfs     sec=krb5p,rw   
root@kdc1:~#
root@kdc1:~# cp /net/zfs1/secure/fox.txt /secure/
root@kdc1:~#
root@kdc1:~# cat /secure/fox.txt
The quick brown fox jumps over the lazy dog.
root@kdc1:~#

Back on the client, read the file on the KDC, with snoop running to show data is encrypted. And since the maximum client version was set to version 3, the snoop shows that as well.

root@host1:~# snoop -d net0 -r host kdc1 &
[1] 21825
root@host1:~# Using device net0 (promiscuous mode)

root@host1:~# cat /net/kdc1/secure/fox.txt
The quick brown fox jumps over the lazy dog.
root@host1:~# 172.17.0.101 -> 172.17.0.251 TCP D=2049 S=1022 Syn Seq=597683294 Len=0 Win=32804 Options=<mss 1460,sackOK,tstamp 129789256 0,nop,wscale 5>
172.17.0.251 -> 172.17.0.101 TCP D=1022 S=2049 Syn Ack=597683295 Seq=1916087307 Len=0 Win=32806 Options=<sackOK,tstamp 129789256 129789256,mss 1460,nop,wscale 5>
172.17.0.101 -> 172.17.0.251 TCP D=2049 S=1022 Ack=1916087308 Seq=597683295 Len=0 Win=32806 Options=<nop,nop,tstamp 129789256 129789256>
172.17.0.101 -> 172.17.0.251 RPC RPCSEC_GSS C NFS ver(3) proc(1) (data encrypted)
172.17.0.251 -> 172.17.0.101 TCP D=1022 S=2049 Ack=597683495 Seq=1916087308 Len=0 Win=32806 Options=<nop,nop,tstamp 129789257 129789257>
172.17.0.251 -> 172.17.0.101 RPC RPCSEC_GSS R NFS ver(3) proc(1) (data encrypted)
172.17.0.101 -> 172.17.0.251 TCP D=2049 S=1022 Ack=1916087520 Seq=597683495 Len=0 Win=32806 Options=<nop,nop,tstamp 129789259 129789259>
172.17.0.101 -> 172.17.0.251 RPC RPCSEC_GSS C NFS ver(3) proc(4) (data encrypted)
172.17.0.251 -> 172.17.0.101 TCP D=1022 S=2049 Ack=597683699 Seq=1916087520 Len=0 Win=32806 Options=<nop,nop,tstamp 129789259 129789259>
172.17.0.251 -> 172.17.0.101 RPC RPCSEC_GSS R NFS ver(3) proc(4) (data encrypted)
172.17.0.101 -> 172.17.0.251 TCP D=2049 S=1022 Ack=1916087740 Seq=597683699 Len=0 Win=32806 Options=<nop,nop,tstamp 129789259 129789259>
172.17.0.101 -> 172.17.0.251 RPC RPCSEC_GSS C NFS ver(3) proc(1) (data encrypted)
172.17.0.251 -> 172.17.0.101 RPC RPCSEC_GSS R NFS ver(3) proc(1) (data encrypted)
172.17.0.101 -> 172.17.0.251 TCP D=2049 S=1022 Ack=1916087952 Seq=597683899 Len=0 Win=32806 Options=<nop,nop,tstamp 129789266 129789259>

root@host1:~#

Summary and Next Step

That is everything, I hope. Here you can quickly go back to the introduction.

Wednesday Feb 24, 2010

My thoughts on configuring zones with shared IP instances and the 'defrouter' parameter

An occasional call or email I receive has questions about routing issues when using Solaris Zones in the (default) shared IP Instance configuration. Everything works well when the non-global zones are on the same IP subnet (let's say 172.16.1.0/24) as the global zone. Routing gets a little tricky when the non-global zones are on a different subnet.

My general recommendation is to isolate. This means:

  • Separate subnets for the global zone (administration, backup) and the non-global zones (applications, data).
  • Separate data-links for the global and non-global zones.
    • The non-global zones can share a data-link
    • Non-global zones on different IP subnets use different data-links

However, using separate data-links is not always possible, and I was concerned whether sharing a data-link would actually work.

So I did some testing, and exchanged some emails because of a comment I made regarding PSARC/2008/057 and the automatic removal of a default route when the zone is halted.

Turns out I have been very restrictive in suggesting that the global and non-global zones not share a data-link. While I think that is a good administrative policy, to separate administrative and application traffic, it is not a requirement. It is OK to have the global zone and one or more non-global zones share the same data-link. However, if the non-global zones are to have different default routes, they must be on subnets that the global zone is not on.

My test case running Solaris 10 10/09 has the global zone on the 129.154.53.0/24 network and the non-global zone on the 172.16.27.0/24 network.

global# ifconfig -a
...
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.132 netmask ffffff00 broadcast 129.154.53.255
        ether 0:14:4f:ac:57:c4
e1000g0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        zone shared1
        inet 172.16.27.27 netmask ffffff00 broadcast 172.16.27.255

global# zonecfg -z shared1 info net
net:
        address: 172.16.27.27/24
        physical: e1000g0
        defrouter: 172.16.27.16
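
For reference, that net resource is configured in zonecfg roughly like this (a sketch using the values above, entered from within zonecfg -z shared1):

add net
set address=172.16.27.27/24
set physical=e1000g0
set defrouter=172.16.27.16
end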

The routing tables as seen from both are:
global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              129.154.53.215       UG        1        123
default              172.16.27.16         UG        1          7 e1000g0
129.154.53.0         129.154.53.132       U         1         50 e1000g0
224.0.0.0            129.154.53.132       U         1          0 e1000g0
127.0.0.1            127.0.0.1            UH        3         80 lo0

shared1# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              172.16.27.16         UG        1          7 e1000g0
172.16.27.0          172.16.27.27         U         1          3 e1000g0:1
224.0.0.0            172.16.27.27         U         1          0 e1000g0:1
127.0.0.1            127.0.0.1            UH        4         78 lo0:1

While the global zone shows both routes, only the default applying to its subnet will be used. And for traffic leaving the non-global zone, only its default will be used.

You may notice that the Interface for the global zone's default router is blank. That is because I have set the default route via /etc/defaultrouter. I noticed that if it is determined via the route discovery daemon, it will be listed as being on e1000g0! This does not affect the behavior, however it may be visually confusing, which is probably why I initially leaned towards saying to not share the data-link.

There are multiple ways to determine which route might be used, including ping(1M) and traceroute(1M). I like the output of the route get command.

global# route get 172.16.29.1
   route to: 172.16.29.1
destination: default
       mask: default
    gateway: 129.154.53.1
  interface: e1000g0
      flags: <UP,GATEWAY,DONE,STATIC>
 recvpipe  sendpipe  ssthresh    rtt,ms rttvar,ms  hopcount      mtu     expire
       0         0         0         0         0         0      1500         0

shared1# route get 172.16.28.1
   route to: 172.16.28.1
destination: default
       mask: default
    gateway: 172.16.27.16
  interface: e1000g0:1
      flags: <UP,GATEWAY,DONE,STATIC>
 recvpipe  sendpipe  ssthresh    rtt,ms rttvar,ms  hopcount      mtu     expire
       0         0         0         0         0         0      1500         0

This quickly shows which interfaces and IP addresses are being used. If there are multiple default routes, repeated invocations of this will show a rotation in the selection of the default routes.

Thanks to Erik Nordmark and Penny Cotten for their insights on this topic!

Steffen Weiberle

Thursday Apr 16, 2009

What happened to my packets? -- or -- Dual default routes and shared IP zones

I recently received a call from someone who has helped me out a lot on some performance issues (thanks, Jim Fiori), and I was glad to be able to return even a small part of those favors!

He had been contacted to help a customer who was ready to deploy a web application, and they were experiencing intermittent lack of connection to the web site. Interestingly, they were also using zones, a bunch of them (OK, a handful)--and so right up my alley.

The customer was running a multi-tiered web application on an x4600 (so Solaris on x86 as well!), with the web server, web router, and application tiers in different zones. They were using shared IP Instances, so all the network configuration was being done in the global zone.

Initially, we had to modify some configuration parameters, especially regarding default routes. Since the system was installed with Solaris 10 5/08 and had more recent patches, we could use the defrouter feature introduced in 10/08 to make setting up routes for the non-global zones a little easier. This was needed because the global zone was using only one NIC, and it was not going to be on the networks that the non-global zones were on.

What made the configuration a little unique was that the web server needs a default router to the Internet, while the application server needs a route to other systems behind a different router. Individually, everything is fine. However, the web1 zone also needs to be on the network that the application and web router are on, so it ends up having two interfaces.

Let's look at web1 when only it is running.

web1# ifconfig -a4
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 172.16.1.41 netmask ffffff00 broadcast 172.16.1.255
bge2:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.51.41 netmask ffffff00 broadcast 192.168.51.255
web1# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              172.16.1.1           UG        1          0 bge1
172.16.1.0           172.16.1.41          U         1          0 bge1:1
192.168.51.0         192.168.51.41        U         1          0 bge2:1
224.0.0.0            172.16.1.41          U         1          0 bge1:1
127.0.0.1            127.0.0.1            UH        5         34 lo0:1

The zone is on two interfaces, bge1 and bge2, and has a default route that uses bge1. However, when zone app1 is running, there is a second default route, on bge2. The same is true if app2 or odr are running. Note that these three zones are only on bge2.

app1# ifconfig -a4
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge2:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.51.43 netmask ffffff00 broadcast 192.168.51.255
app1# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              192.168.51.1         UG        1          0 bge2
192.168.51.0         192.168.51.43        U         1          0 bge2:1
224.0.0.0            192.168.51.43        U         1          0 bge2:1
127.0.0.1            127.0.0.1            UH        3         51 lo0:1

In the meantime, this is what happens in web1.

web1# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- --------- 
default              192.168.51.1         UG        1          0 bge2
default              172.16.1.1           UG        1          0 bge1 
172.16.1.0           172.16.1.41          U         1          0 bge1:1
192.168.51.0         192.168.51.41        U         1          0 bge2:4
224.0.0.0            172.16.1.41          U         1          0 bge1:1
127.0.0.1            127.0.0.1            UH        6        132 lo0:4

With any of the other zones running, web1 now has two default routes. And it only happens in web1, since it is the only zone with both a public-facing data link (bge1) and a shared data link (bge2).

Traffic to any system on either the 192.168.51.0 or 172.16.1.0 network will have no issues. Every time IP needs to determine a new path for a system not on either of those two networks, it will pick a route, and it will round-robin between the two default routes. Thus approximately half the time connections will fail to establish, or possibly existing connections will stop working if they have been idle for a while.

This is how IP is supposed to work, so there is technically nothing wrong. It is a feature of zones and a shared IP Instance. [2009.06.23: For background on why IP works this way, see James' blog].

The only problem is that this is not what the customer wants!

One option would be to force all traffic between the web and application tier out the bge1 interface, putting it on the wire. This may not be desirable for security reasons, and introduces latencies since traffic now goes on the wire. Another option would be to use exclusive IP Instances for the web servers. For each web zone, and this example only has one, it would require two additional data links (NICs). That would add up. Also, this configuration is targeted to be used with Solaris Cluster's scalable services, and those must be in shared IP Instance zones. Hummm....as I like to say.

We didn't know about the shared IP Instance restriction of Solaris Cluster, and as the customer was considering how they were going to add additional NICs to all the systems, something slowly developed in my mind. How about creating a shared, dummy network between the web and application tier? They had one spare NIC, and with shared IP it does not even need to be connected to a switch port, since IP will loop all traffic back anyway!

The more I thought about it, the more I liked it, and I could not see anything wrong with it. At least not technically as I understood Solaris. Operationally, for the customer, it might be a little awkward.

Here is what I was thinking of...

With this configuration the web1 zone has a default router only to the Internet and it can reach odr, and if necessary, app1 and app2, directly via the new network. And app1 and app2 only have a single default route to get to the Intranet. The nice thing is that bge3 does not even need to be up. That is visible with ifconfig output, where bge3 is not showing a RUNNING flag, which indicates the port is not connected (or in my case has been disabled on the switch).
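To make the sketch concrete (the 192.168.52.0/24 dummy network and web1's 192.168.52.41 address come from the output below; the odr address and the exact resources on the other zones are illustrative assumptions), web1 keeps its Internet-facing resource on bge1 and gains a leg on the unconnected bge3, while odr keeps only its own default route on bge2 plus a leg on the same dummy network:

global# zonecfg -z web1 info net
net:
        address: 172.16.1.41/24
        physical: bge1
        defrouter: 172.16.1.1
net:
        address: 192.168.52.41/24
        physical: bge3

global# zonecfg -z odr info net
net:
        address: 192.168.51.42/24
        physical: bge2
        defrouter: 192.168.51.1
net:
        address: 192.168.52.42/24
        physical: bge3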

global# ifconfig -a4
...
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8c
bge2: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8d 
bge3: flags=1000802<BROADCAST,MULTICAST,IPv4> mtu 1500 index 5 
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8e
...
And within web1 there is now only one default route.
web1# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- --------- 
default              172.16.1.1           UG        1         17 bge1 
172.16.1.0           172.16.1.41          U         1          2 bge1:1
192.168.52.0         192.168.52.41        U         1          2 bge3:1
224.0.0.0            172.16.1.41          U         1          0 bge1:1
127.0.0.1            127.0.0.1            UH        4        120 lo0:1
In the customer's case, multiple systems were being used, so the private networks were connected together so that a web zone on one system could access an odr zone on another. I am showing the simple, single system case since it is so convenient.

If I were using Solaris Express Community Edition (SX-CE) or OpenSolaris 2009.06 Developer Builds, with the Crossbow bits and virtual NICs (VNICs) available, I wouldn't even have needed to use that physical interface. Both are available here.
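For what it's worth, on a Crossbow-enabled build the same private network could be created entirely in software, with an etherstub acting as a virtual switch; something along these lines (illustrative names, and I have not run this exact sequence here):

global# dladm create-etherstub stub0
global# dladm create-vnic -l stub0 vnic1
global# dladm create-vnic -l stub0 vnic2

The VNICs could then take the place of bge3 in the zone configurations.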

I hope this trick might help others out in the future.

Steffen

Tuesday Apr 14, 2009

Using IPMP with link based failure detection

Solaris has had a feature to increase network availability called IP Multipathing (IPMP). Initially it required a test address on every data link in an IPMP group, where the test addresses were used as the source IP address to probe network elements for path availability. One of the benefits of probe-based failure detection is that it can extend beyond the directly connected link(s) and verify paths through the attached switch(es) to what is typically a router or other redundant element providing the service.

Having one IP address (whether public or private, non-routable) per data link, plus the separate address(es) for the application(s), turns out to be a lot of addresses to allocate and administer. And since the default of five probes spaced two seconds apart means a failure takes at least ten (10) seconds to detect, something more was needed.
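(As an aside, that detection time is tunable through FAILURE_DETECTION_TIME in /etc/default/mpathd, specified in milliseconds; lowering it only increases the probe rate and does nothing about the number of addresses needed. On my system the default looks like this:)

global# grep FAILURE_DETECTION_TIME /etc/default/mpathd
FAILURE_DETECTION_TIME=10000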

So in the Solaris 9 timeframe the ability to also do link based failure detection was delivered. It requires specific NICs whose driver has the ability to notify the system that a link has failed. The Introduction to IPMP in the Solaris 10 System Administration Guide: IP Services lists the NICs that support link state notification. Solaris 10 supports configuring IPMP with only link based failure detection.

global# more /etc/hostname.bge[12]
::::::::::::::
/etc/hostname.bge1
::::::::::::::
10.1.14.140/26 group ipmp1 up
::::::::::::::
/etc/hostname.bge2
::::::::::::::
group ipmp1 standby up
On system boot, there will be an indication on the console that since no test addresses are defined, probe-based failure detection is disabled.

Apr 10 10:57:20 in.mpathd[168]: No test address configured on interface bge2; disabling probe-based failure detection on it
Apr 10 10:57:20 in.mpathd[168]: No test address configured on interface bge1; disabling probe-based failure detection on it
Looking at the interfaces configured,
global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 10.1.14.140 netmask ffffffc0 broadcast 10.1.14.191
        groupname ipmp1
        ether 0:3:ba:e3:42:8c
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
bge2: flags=69000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,STANDBY,INACTIVE> mtu 0 index 4
        inet 0.0.0.0 netmask 0
        groupname ipmp1
        ether 0:3:ba:e3:42:8d
you will notice that two of the interfaces in the group have no address (0.0.0.0): bge1:1 and bge2. The data address is on the physical interface bge1 itself. On the failure of bge1,
Apr 10 14:34:53 global bge: NOTICE: bge1: link down
Apr 10 14:34:53 global in.mpathd[168]: The link has gone down on bge1
Apr 10 14:34:53 global in.mpathd[168]: NIC failure detected on bge1 of group ipmp1
Apr 10 14:34:53 global in.mpathd[168]: Successfully failed over from NIC bge1 to NIC bge2


global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
        ether 0:3:ba:e3:42:8b
bge1: flags=19000802<BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 3
        inet 0.0.0.0 netmask 0
        groupname ipmp1
        ether 0:3:ba:e3:42:8c
bge2: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500 index 4
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp1
        ether 0:3:ba:e3:42:8d
bge2:1: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500 index 4
        inet 10.1.14.140 netmask ffffffc0 broadcast 10.1.14.191
the data address is migrated onto bge2:1. I find this a little confusing. However, I don't know any way around it on Solaris 10. The IPMP Re-architecture makes this a lot easier!
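For comparison, the re-architected IPMP keeps the data address on a dedicated IPMP interface, so it no longer hops between bgeN:x logical interfaces on failover. As an illustration of the end state, this is roughly how an equivalent group is built with ipadm on Solaris 11 (a sketch only, not something from this Solaris 10 setup):

global# ipadm create-ip net1
global# ipadm create-ip net2
global# ipadm create-ipmp -i net1,net2 ipmp1
global# ipadm set-ifprop -p standby=on -m ip net2
global# ipadm create-addr -T static -a 10.1.14.140/26 ipmp1/v4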

Using Probe-based IPMP with non-global zones

Configuring a shared IP Instance non-global zone and utilizing IPMP managed in the global zone is very easy.

The IPMP configuration is very simple. Interface bge1 is active, and bge2 is in stand-by mode.

global# more /etc/hostname.bge[12]
::::::::::::::
/etc/hostname.bge1
::::::::::::::
group ipmp1 up
::::::::::::::
/etc/hostname.bge2
::::::::::::::
group ipmp1 standby up
My zone configuration is:
global# zonecfg -z zone1 info
zonename: zone1
zonepath: /zones/zone1
brand: native
autoboot: false
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: shared
inherit-pkg-dir:
        dir: /lib
inherit-pkg-dir:
        dir: /platform
inherit-pkg-dir:
        dir: /sbin
inherit-pkg-dir:
        dir: /usr
net:
        address: 10.1.14.141/26
        physical: bge1
Prior to booting, the network configuration is:
global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone zone1
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp1
        ether 0:3:ba:e3:42:8c
bge2: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500 index 4
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp1
        ether 0:3:ba:e3:42:8d
After booting, the network looks like this:
global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone zone1
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp1
        ether 0:3:ba:e3:42:8c
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone zone1
        inet 10.1.14.141 netmask ffffffc0 broadcast 10.1.14.191
bge2: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500 index 4
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp1
        ether 0:3:ba:e3:42:8d

So a simple case for the use of IPMP, without the need for test addresses! Other IPMP configurations, such as more than two data links, or active-active, are also supported with link based failure detection. The more links involved, the more test addresses are saved. Since writing this entry I was involved in a customer configuration where this is saving several hundred IP addresses and their management (such as avoiding duplicate addresses). That customer is willing to forgo the benefit of probes testing past the local switch port.
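For example (purely illustrative, adding a hypothetical bge3), a three-link active-active group without test addresses is just more of the same:

global# more /etc/hostname.bge[123]
::::::::::::::
/etc/hostname.bge1
::::::::::::::
10.1.14.140/26 group ipmp1 up
::::::::::::::
/etc/hostname.bge2
::::::::::::::
group ipmp1 up
::::::::::::::
/etc/hostname.bge3
::::::::::::::
group ipmp1 up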

Steffen

Tuesday Jan 13, 2009

Using zonecfg defrouter with shared-IP zones

[Update to IPMP testing 2009.01.20]

[Minor update 2009.01.14]

When running Solaris Zones in a shared-IP configuration, all network configuration is determined by how the zone is configured using zonecfg(1M) and by what the global zone's IP stack decides (such as routes). This has caused some trouble in situations where zones are on different subnets, and especially if the global zone is not on the subnet(s) the non-global zones are on. While exclusive IP Instances were delivered to help address these cases, using exclusive IP Instances requires a data link per zone, and if running a large number of zones there may not be enough data links available.

With Solaris 10 10/08 (Update 6), an additional network configuration parameter is available for shared-IP zones. This is the default router (defrouter) optional parameter.

Using the defrouter parameter, it is possible to set which router to use for traffic leaving the zone. In the global zone, default router entries are created the first time the zone is booted. Note that the entries are not deleted when the zone is halted.
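For an existing zone, the property can be added with a few zonecfg subcommands; roughly (using the shared1 zone and router address from the examples that follow):

global# zonecfg -z shared1
zonecfg:shared1> select net physical=bge1
zonecfg:shared1:net> set defrouter=10.1.14.129
zonecfg:shared1:net> end
zonecfg:shared1> commit
zonecfg:shared1> exit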

The defrouter property looks like this for a zone with it configured.

global# zonecfg -z shared1 info net
net:
        address: 10.1.14.141/26
        physical: bge1
        defrouter: 10.1.14.129
And it looks like this if it is not set.
global# zonecfg -z shared1 info net
net:
        address: 10.1.14.141/26
        physical: bge1
        defrouter not specified
So I have run a variety of configurations, and some things I observed are as follows. (Most of the configurations used a separate interface for the global zone (bge0) from the ones used by the non-global zones (bge1 and bge2). IPMP is not being used in these configurations. A comment on that at the end.) The [#] markers indicate examples in the outputs that follow.
  • A default route entry is created for the NIC [1] on which the zone is configured when the zone is booted. [2]
  • Entries are not deleted when a zone is halted. They persist until manually removed [3] or the global zone is rebooted.
  • It is possible to have the same default router configured for multiple zones. [4]
  • It is possible to have the same default router listed on multiple interfaces. * [5]
  • It is possible to have multiple default routers on the same interface, even on different IP subnets. [6]
  • The interface used for outbound traffic is the one the zone is assigned to. [7]
  • It is sufficient to plumb the interface for the non-global zones in the global zone (thus it has 0.0.0.0 as its IP address in the global zone). [8]
  • The physical interface can be down in the global zone. [9]
  • If only one interface is used, and different subnets for the global and non-global zones are configured, routing works when setting defrouter [10] and does not work if it is not set.
The most interesting thing I noticed was that although two non-global zones may be on the same IP subnet, if they are configured on different interfaces, the traffic leaves the system on the interface that the zone is configured to be on. This is not the case typically when using shared IP and also having an IP address for the subnet in the global zone.

* Note: Having two interfaces on the same IP subnet without configuring IP Multipathing (IPMP) may not be a supported configuration. I am looking for documentation that states this one way or another. [2009.01.14]

Examples

1. Single Zone, Single Interface--The Basics

Create a single non-global zone.
global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          2 bge0
139.164.63.0         139.164.63.125       U         1          1 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# zonecfg -z shared1 info net
net:
        address: 10.1.14.141/26
        physical: bge1
        defrouter: 10.1.14.129

global# zoneadm -z shared1 boot [2]

global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          2 bge0
default              10.1.14.129          UG        1          0 bge1 [1]
139.164.63.0         139.164.63.125       U         1          1 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# zoneadm -z shared1 halt

global# zoneadm list -v
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared

global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              10.1.14.129          UG        1          0 bge1
default              139.164.63.215       UG        1          1 bge0
139.164.63.0         139.164.63.125       U         1          1 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# route delete default 10.1.14.129 [3]
delete net default: gateway 10.1.14.129

global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          1 bge0
139.164.63.0         139.164.63.125       U         1          1 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

2. Multiple Interfaces, Same Default Router

Three zones, where two use bge1 and the third uses bge2. All use the same default router.
global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          1 bge0
139.164.63.0         139.164.63.125       U         1          1 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# zonecfg -z shared1 info net
net:
        address: 10.1.14.141/26
        physical: bge1
        defrouter: 10.1.14.129 [4]

global# zonecfg -z shared2 info net
net:
        address: 10.1.14.142/26
        physical: bge1
        defrouter: 10.1.14.129 [4]

global# zonecfg -z shared3 info net
net:
        address: 10.1.14.143/26
        physical: bge2
        defrouter: 10.1.14.129 [5]

global# zoneadm -z shared1 boot

global# zoneadm -z shared2 boot

global# zoneadm -z shared3 boot

global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              10.1.14.129          UG        1          0 bge1 [4]
default              139.164.63.215       UG        1          1 bge0
default              10.1.14.129          UG        1          2 bge2 [5]
139.164.63.0         139.164.63.125       U         1          1 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# zoneadm list -v
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared
   3 shared1          running    /zones/shared1                 native   shared
   4 shared2          running    /zones/shared2                 native   shared
   5 shared3          running    /zones/shared3                 native   shared

global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared1
        inet 127.0.0.1 netmask ff000000
lo0:2: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared2
        inet 127.0.0.1 netmask ff000000
lo0:3: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared3
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8c
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone shared1
        inet 10.1.14.141 netmask ffffffc0 broadcast 10.1.14.191
bge1:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone shared2
        inet 10.1.14.142 netmask ffffffc0 broadcast 10.1.14.191
bge2: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8d
bge2:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        zone shared3
        inet 10.1.14.143 netmask ffffffc0 broadcast 10.1.14.191

3. Multiple Subnets

Add another zone, using bge2 and on a different subnet.
global# zonecfg -z shared4 info net
net:
        address: 192.168.16.144/24
        physical: bge2
        defrouter: 192.168.16.129

global# zoneadm -z shared4 boot

global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              10.1.14.129          UG        1          0 bge1
default              10.1.14.129          UG        1          4 bge2
default              139.164.63.215       UG        1          3 bge0
default              192.168.16.129       UG        1          0 bge2 [6]
139.164.63.0         139.164.63.125       U         1          4 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1

4. Interface Usage

Issue some pings within the non-global zones and see which network interfaces are used. From the global zone, I issue a ping in each non-global zone to a remote system on the same network as the global zone (139.164.63.0) and watch which interfaces are being used. [7]
global# zlogin shared1 ping 139.164.63.38
139.164.63.38 is alive

global# zlogin shared2 ping 139.164.63.38
139.164.63.38 is alive

global# zlogin shared3 ping 139.164.63.38
139.164.63.38 is alive

global# zlogin shared4 ping 139.164.63.38
139.164.63.38 is alive
This shows the pings originating from shared1 and shared2 going out on bge1.
global1# snoop -d bge1 icmp
Using device /dev/bge1 (promiscuous mode)
 10.1.14.141 -> 139.164.63.38 ICMP Echo request (ID: 4677 Sequence number: 0)
139.164.63.38 -> 10.1.14.141  ICMP Echo reply (ID: 4677 Sequence number: 0)
 10.1.14.142 -> 139.164.63.38 ICMP Echo request (ID: 4681 Sequence number: 0)
139.164.63.38 -> 10.1.14.142  ICMP Echo reply (ID: 4681 Sequence number: 0)
And this shows the pings originating from shared3 and shared4 going out on bge2.
global2# snoop -d bge2 icmp
Using device /dev/bge2 (promiscuous mode)
 10.1.14.143 -> 139.164.63.38 ICMP Echo request (ID: 4685 Sequence number: 0)
139.164.63.38 -> 10.1.14.143  ICMP Echo reply (ID: 4685 Sequence number: 0)
192.168.16.144 -> 139.164.63.38 ICMP Echo request (ID: 4689 Sequence number: 0)
139.164.63.38 -> 192.168.16.144 ICMP Echo reply (ID: 4689 Sequence number: 0)
Just to confirm where each zone is configured, here is the ifconfig output.
global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared1
        inet 127.0.0.1 netmask ff000000
lo0:2: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared2
        inet 127.0.0.1 netmask ff000000
lo0:3: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared3
        inet 127.0.0.1 netmask ff000000
lo0:4: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared4
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3 [9]
        inet 0.0.0.0 netmask 0 [8]
        ether 0:3:ba:e3:42:8c
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone shared1
        inet 10.1.14.141 netmask ffffffc0 broadcast 10.1.14.191
bge1:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone shared2
        inet 10.1.14.142 netmask ffffffc0 broadcast 10.1.14.191
bge2: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask 0 [8]
        ether 0:3:ba:e3:42:8d
bge2:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        zone shared3
        inet 10.1.14.143 netmask ffffffc0 broadcast 10.1.14.191
bge2:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        zone shared4
        inet 192.168.16.144 netmask ffffff00 broadcast 192.168.16.255

5. Using a Single Interface

Only using bge0 and using different subnets for the global and non-global zones. [10]

Before booting the zone.

global# netstat -nr

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          2 bge0
139.164.63.0         139.164.63.125       U         1          2 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# zonecfg -z shared17 info net
net:
        address: 192.168.17.147/24
        physical: bge0
        defrouter: 192.168.17.16

global# zoneadm -z shared17 boot
Once the zone is booted, netstat shows both default routes, and a ping from the zone works.
global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          2 bge0
default              192.168.17.16        UG        1          0 bge0
139.164.63.0         139.164.63.125       U         1          2 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared17
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255
        ether 0:3:ba:e3:42:8b
bge0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        zone shared17
        inet 192.168.17.147 netmask ffffff00 broadcast 192.168.17.255

global# zlogin shared17 ping 139.164.63.38
139.164.63.38 is alive

IP Multipathing (IPMP)

I did some testing with IPMP and similar examples as above. At this time the combination of IPMP and the defrouter configuration does not work. I have filed bug 6792116 to have this looked at.

[Updated 2009.01.20] After some additional testing, especially with test addresses and probe based failure detection, I have seen IPMP work well only when zones are configured such that at least one zone is on each NIC in an IPMP group, including a standby NIC. For example, if you have two NICs, bge1 and bge2, at least one zone must be configured on bge1 and at least one on bge2. This is even the case when one of the NICs is in failed mode when the system or zone(s) boot. It turns out that the default route is added when the zone boots, and there is no later check for default route requirements as a zone is moved from one NIC to another based on IPMP failover or failback. Thus, I would recommend not using defrouter and IPMP together until the combination is confirmed to work.
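Until then, a possible interim approach (my own suggestion, which I have not verified across failover scenarios) is to leave defrouter unset for zones on an IPMP group and add the required route persistently in the global zone instead, so it does not depend on which NIC a zone happens to boot on:

global# route -p add default 10.1.14.129
add net default: gateway 10.1.14.129
add persistent net default: gateway 10.1.14.129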

If this is important for your deployments, please add a service record to change request 6792116 and work with your service provider to have this addressed. Please also note that this works well with the IPMP Re-architecture coming soon to OpenSolaris.

Thursday Feb 14, 2008

Network Virtualization and Resource Control--Crossbow pre-Beta

The pre-beta bits and updated material for Project Crossbow have been posted to the opensolaris.org web site. If splitting up a NIC into several virtual NICs, limiting network bandwidth, allocating CPUs to specific network traffic, faster datagram forwarding, or enhanced visibility into what your network traffic looks like sounds interesting, check it out.
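To give a flavor of what that looks like (the dladm syntax shown is the one that eventually integrated; the pre-beta bits may differ slightly, so treat this as an illustration only):

global# dladm create-vnic -l e1000g0 vnic1
global# dladm set-linkprop -p maxbw=100M vnic1
global# dladm set-linkprop -p cpus=2,3 vnic1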

The code is available as a customized Nevada build 81 image, or you can install the BFU bits on top of an existing build 81 install. It may work with a slightly older or newer build (I did some testing with build 82), but that has not been fully tested.

The plan is to integrate the features into Nevada after the beta period and your feedback.

Thanks to the engineering team for all the effort in getting this out! Many of my customers have been waiting for this to become available.

Patches for Using IP Instances with ce NICs are Available

The [Solaris 10] patches to be able to use IP Instances with the Cassini ethernet interface, known as ce, are available on sunsolve.sun.com for Solaris 10 users with a maintenance contract or subscription. (This is for Solaris 10 8/07, or a prior update patched to that level. These patches are included in Solaris 10 5/08, and also in patch clusters or bundles delivered at or around the same time, and since then.)

The SPARC patches are:

  • 137042-01 SunOS 5.10: zoneadmd patch
  • 118777-12 SunOS 5.10: Sun GigaSwift Ethernet 1.0 driver patch

The x86 patches are:

  • 137043-01 SunOS 5.10_x86: zoneadmd patch
  • 118778-11 SunOS 5.10_x86: Sun GigaSwift Ethernet 1.0 driver patch

I have not been able to try out the released patches myself, yet.
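For anyone applying them, it should be the standard patchadd exercise; for example on SPARC (shown for reference only, for the reason just mentioned):

global# showrev -p | egrep "137042|118777"
global# patchadd 118777-12
global# patchadd 137042-01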

Steffen

Wednesday Jan 30, 2008

IP Instances with ce NICs patches are in progress!

The patches for the ON (OS and networking) part of the changes to allow IP Instances to work with the ce NICs (CR 6616075) are in progress. The patch numbers will be:

137042-01 (SPARC)
137043-01 (i386, x86, x64)

The patches should be available in about two weeks, after final internal and customer testing. If you have a service contract, you can get a temporary T-patch as interim relief, with all the caveats of a T-patch. Folks with an escalation should already have been notified. The fix will also be delivered in the next update of Solaris 10. It did not make the Beta of that update (Update 5), however. Don't forget, you also need the ce patch:

118777-12 (SPARC)
118778-11 (i386, x86, x64)

Happy IP-Instancing with ce!!

Thursday Dec 20, 2007

One Step Closer to IP Instances with ce

With the availability of Solaris Nevada build 80 [1], the ability to use IP Instances with the GigaSwift line of NICs and the ce driver becomes possible. The fix for CR 6616075 to zoneadmd(1M) has been integrated into the OpenSolaris code base and is available in build 80. The necessary fix to the ce driver, tracked in CR 6606507, has already been delivered. With this combination, a zone can have an exclusive IP Instance using a ce-based link.

Zone configuration information:

global# zonecfg -z ce1 info net
net:
        address not specified
        physical: ce1
global#

And the view from the non-global zone:

ce1# zonename
ce1
ce1# cat /etc/release
                  Solaris Express Community Edition snv_80 SPARC
           Copyright 2008 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                           Assembled 17 December 2007
ce1# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
ce1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 192.168.200.153 netmask ffffff00 broadcast 192.168.200.255
        ether 0:3:ba:68:1d:5f
lo0: flags=2002000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv6,VIRTUAL> mtu 8252 index 1
        inet6 ::1/128
ce1#

More when the soak time in Nevada is complete and the backport to Solaris 10 is available.

Thanks to the engineers who put energy into these fixes!

Happy Holidays!

Steffen

[1] As of 20 December 2007, build 80 is available within Sun only. Availability on opensolaris.org will be announced on opensolaris-announce@opensolaris.org.

Wednesday Dec 05, 2007

More good news for IP Instances

Continuing progress on the use of IP Instances on the full line of SPARC systems. The e1000g Intel PCI-X Gigabit Ethernet UTP and MMF adapters are now supported on the Sun Fire UltraSPARC servers. The NICs are:
  • x7285a - Sun PCI-X Dual GigE UTP Low Profile, RoHS-6 compliant
  • x7286a - Sun PCI-X GigE MMF Low Profile, RoHS-6 compliant
The NICs are supported on the V490, V890, E2900, E4900, E6900, E20K, and E25K systems. This is an alternative for those waiting for the GigaSwift (ce) NIC to be supported, or who don't need quad-port cards. Since the driver used is the e1000g, which is a GLDv3 driver, full support for IP Instances is available using these cards.

Saturday May 12, 2007

Network performance differences within an IP Instance vs. across IP Instances

When consolidating or co-locating multiple applications on the same system, inter-application network traffic typically stays within the system, since the shared IP in the kernel recognizes that the destination address is on the same system, and thus loops it back up the stack without ever putting the data on a physical network. This has introduced some challenges for customers deploying Solaris Containers (specifically zones) where different Containers are on different subnets, and it is expected that traffic between them leaves the system (maybe through a router or firewall to restrict or monitor inter-tier traffic).

With IP Instances in Solaris Nevada build 57 and targeted for Solaris 10 7/07, there is the ability to configure zones with exclusive IP Instances, thus forcing all traffic leaving a zone out onto the network. This introduces additional network stack processing on both the transmit and the receive sides. Prompted by some customer questions regarding this, I performed a simple test to measure the difference.

On two systems, a V210 with two 1.336GHz CPUs and 8GB memory, and an x4200 with two dual-core Opteron XXXX and 8GB memory, I ran FTP transfers between zones. My switch is a Netgear GS716T Smart Switch with 1Gbps ports. The V210 has four bge interfaces and the x4200 has four e1000g interfaces.

I created four zones. Zones x1 and x2 have eXclusive IP Instances, while zones s1 and s2 have Shared IP Instances (IP is shared with the global zone). Both systems are running Solaris 10 7/07 build 06.

Relevant zonecfg info is as follows (all zones are sparse):


v210# zonecfg -z x1 info
zonename: x1
zonepath: /localzones/x1
...
ip-type: exclusive
net:
        address not specified
        physical: bge1

v210# zonecfg -z s1 info
zonename: s1
zonepath: /localzones/s1
...
ip-type: shared
net:
        address: 10.10.10.11/24
        physical: bge3
 
As a test user in each zone, I created a file using 'mkfile 1000m /tmp/file1000m'. Then I used ftp to transfer it between zones. No tuning was done whatsoever.

The results are as follows.

V210: (bge)

Exclusive to Exclusive
x1# /usr/bin/time ftp x2 << EOF\^Jcd /tmp\^Jbin\^Jput file1000m\^JEOF

real       17.0
user        0.2
sys        11.2

Exclusive to Shared
x1# /usr/bin/time ftp s2 << EOF\^Jcd /tmp\^Jbin\^Jput file1000m\^JEOF

real       17.3
user        0.2
sys        11.6

Shared to Shared
s2# /usr/bin/time ftp s1 << EOF\^Jcd /tmp\^Jbin\^Jput file1000m\^JEOF

real        6.6
user        0.1
sys         5.3


X4200: (e1000g)

Exclusive to Exclusive
x1# /usr/bin/time ftp x2 << EOF\^Jcd /tmp\^Jbin\^Jput file1000m\^JEOF

real        9.1
user        0.0
sys         4.0

Exclusive to Shared
x1# /usr/bin/time ftp s2 << EOF\^Jcd /tmp\^Jbin\^Jput file1000m\^JEOF

real        9.1
user        0.0
sys         4.1

Shared to Shared
s2# /usr/bin/time ftp s1 << EOF\^Jcd /tmp\^Jbin\^Jput file1000m\^JEOF

real        4.0
user        0.0
sys         3.5
I ran each test several times and picked a result that seemed average across the runs. Not very scientific, and a table might be nicer.

Something I noticed that surprised me was that the additional time spent in IP and the driver is quite measurable on the V210 with bge, and much less so on the x4200 with e1000g.
