
Using Solaris and SPARC Networking and Virtualization

Recent Posts

Solaris

Repo AI UAR: Step 0: Create Solaris Zones (optional)

Overview

This is an optional step to create a Solaris Zone. I prefer to use a Zone because it is much easier to remove all traces of my work, especially when I know I might need to start over again. I can also move a Zone to another system if necessary.

I also separate the Solaris Repository and the Solaris Automated Install Server into two Zones. Both could be in a single Zone, or together in the Global Zone. I do this to make it easier to redo one without affecting the other.

Zone Configuration Files

Here are the files I used to create the Zone.

The first file is the Zone configuration file, which makes customizing the Zone configuration a simple text edit. I have highlighted the items I change from the default.

    global# cat repo.cfg
    create -b
    set brand=solaris
    set zonepath=/zones/repo
    set autoboot=false
    set autoshutdown=shutdown
    set ip-type=exclusive
    add anet
    set linkname=net0
    set lower-link=net0
    set configure-allowed-address=true
    set link-protection=mac-nospoof
    set mac-address=auto
    end
    add admin
    set user=guest
    set auths=login,manage
    end
    global#

The zonepath is the directory or ZFS file system where the Zone is located. On any new system where I run Zones, one of the first things I do is create a separate ZFS file system for my Zones. See Appendix A.

I set the interface my Zone will use by changing the anet lower-link property from the default, auto.

To simplify managing the Zone, which really means logging into the Zone without becoming root, I add an admin user: myself. See zonecfg(1M) for more details.

The second file configures the key details of the environment of the Zone, including hostname, IP address, timezone, etc.
If I skip this step, there are a number of text screens with questions I have to answer on initial boot before the Zone is fully usable.

    global# cat sc_profile.xml
    <?xml version='1.0' encoding='UTF-8'?>
    <!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
    <!-- Auto-generated by sysconfig -->
    <service_bundle type="profile" name="sysconfig">
      <service version="1" type="service" name="system/identity">
        <instance enabled="true" name="node">
          <property_group type="application" name="config">
            <propval type="astring" name="nodename" value="repo"/>
          </property_group>
        </instance>
      </service>
      <service version="1" type="service" name="network/install">
        <instance enabled="true" name="default">
          <property_group type="application" name="install_ipv6_interface">
            <propval type="astring" name="stateful" value="yes"/>
            <propval type="astring" name="address_type" value="addrconf"/>
            <propval type="astring" name="name" value="net0/v6"/>
            <propval type="astring" name="stateless" value="yes"/>
          </property_group>
          <property_group type="application" name="install_ipv4_interface">
            <propval type="net_address_v4" name="static_address" value="192.168.1.101/24"/>
            <propval type="astring" name="name" value="net0/v4"/>
            <propval type="astring" name="address_type" value="static"/>
            <propval type="net_address_v4" name="default_route" value="192.168.1.1"/>
          </property_group>
        </instance>
      </service>
      <service version="1" type="service" name="network/physical">
        <instance enabled="true" name="default">
          <property_group type="application" name="netcfg">
            <propval type="astring" name="active_ncp" value="DefaultFixed"/>
          </property_group>
        </instance>
      </service>
      <service version="1" type="service" name="system/name-service/switch">
        <property_group type="application" name="config">
          <propval type="astring" name="default" value="files"/>
        </property_group>
        <instance enabled="true" name="default"/>
      </service>
      <service version="1" type="service" name="system/name-service/cache">
        <instance enabled="true" name="default"/>
      </service>
      <service version="1" type="service" name="system/keymap">
        <instance enabled="true" name="default">
          <property_group type="application" name="keymap">
            <propval type="astring" name="layout" value="US-English"/>
          </property_group>
        </instance>
      </service>
      <service version="1" type="service" name="system/timezone">
        <instance enabled="true" name="default">
          <property_group type="application" name="timezone">
            <propval type="astring" name="localtime" value="US/Eastern"/>
          </property_group>
        </instance>
      </service>
      <service version="1" type="service" name="system/environment">
        <instance enabled="true" name="init">
          <property_group type="application" name="environment">
            <propval type="astring" name="LANG" value="en_US.UTF-8"/>
          </property_group>
        </instance>
      </service>
      <service version="1" type="service" name="system/config-user">
        <instance enabled="true" name="default">
          <property_group type="application" name="root_account">
            <propval type="astring" name="type" value="role"/>
            <propval type="astring" name="login" value="root"/>
            <propval type="astring" name="password" value="$5$eXziZU0w$WXU3RrAdWE5jf3VtFvjy886R432QWRSCJRxnw.aomo."/>
          </property_group>
          <property_group type="application" name="user_account">
            <propval type="astring" name="roles" value="root"/>
            <propval type="astring" name="shell" value="/usr/bin/bash"/>
            <propval type="astring" name="login" value="admin"/>
            <propval type="astring" name="password" value="$5$r4DCKuRh$KwJRBqNEIptAjSr/ZqRWzeAQEmNc8huxLxUkH.3QBn5"/>
            <propval type="astring" name="type" value="normal"/>
            <propval type="astring" name="sudoers" value="ALL=(ALL) ALL"/>
            <propval type="count" name="gid" value="10"/>
            <propval type="astring" name="description" value="admin"/>
            <propval type="astring" name="profiles" value="System Administrator"/>
          </property_group>
        </instance>
      </service>
      <service version="1" type="service" name="system/fm/asr-notify">
        <instance enabled="true" name="default">
          <property_group type="application" name="autoreg">
            <propval type="astring" name="user" value="anonymous@oracle.com"/>
          </property_group>
        </instance>
      </service>
      <service version="1" type="service" name="system/ocm">
        <instance enabled="true" name="default">
          <property_group type="application" name="reg">
            <propval type="astring" name="user" value="anonymous@oracle.com"/>
          </property_group>
        </instance>
      </service>
    </service_bundle>
    global#

Again, I have highlighted in bold some of the items I set per Zone, typically the Zone's name and IP details. In this case you will want to change the encrypted passwords as well, and possibly the default user. Use sysconfig create-profile if you want to start with a new base. See sysconfig(1M) for details.

I will do this a second time for the Automated Installer Zone, which I call ai.

The differences between the two Zones' configuration files are easy to see.

    global# diff repo.cfg ai.cfg
    3c3
    < set zonepath=/zones/repo
    ---
    > set zonepath=/zones/ai
    global#
    global# diff repo_sc_profile.xml ai_sc_profile.xml
    8c8
    < <propval type="astring" name="nodename" value="repo"/>
    ---
    > <propval type="astring" name="nodename" value="ai"/>
    21c21
    < <propval type="net_address_v4" name="static_address" value="192.168.1.101/24"/>
    ---
    > <propval type="net_address_v4" name="static_address" value="192.168.1.102/24"/>
    global#

Let's create the zone using the configuration file.

    global# zonecfg -z repo -f ./repo.cfg
    UX: /usr/sbin/usermod: admin is currently logged in, some changes may not take effect until next login.
    global#

The warning about the privilege changes not taking effect until the next login really only applies the first time the user is added to the /etc/user_attr file.
If the user already exists, I have been able to manage new Zones from an existing login session.

Now install the Zone, pointing to the System Configuration Tool file so that I am not required to answer host-related questions on startup.

    global# zoneadm -z repo install -c ./repo_sc_profile.xml
    The following ZFS file system(s) have been created:
        pool1/zones/repo
    Progress being logged to /var/log/zones/zoneadm.20160414T205521Z.repo.install
           Image: Preparing at /zones/repo/root.
     Install Log: /system/volatile/install.28655/install_log
     AI Manifest: /tmp/manifest.xml.G6ay83
      SC Profile: /export/blog/zones/repo_sc_profile.xml
        Zonename: repo
    Installation: Starting ...
            Creating IPS image
    Startup linked: 1/1 done
            Installing packages from:
                solaris
                    origin:  http://pkg.oracle.com/solaris/release/
                    origin:  http://mylocalrepository/
    DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
    Completed                            282/282   50021/50021  344.9/344.9  1.1M/s
    PHASE                                          ITEMS
    Installing new actions                   68140/68140
    Updating package state database                 Done
    Updating package cache                           0/0
    Updating image state                            Done
    Creating fast lookup database                   Done
    Updating package cache                           1/1
    Installation: Succeeded
            Note: Man pages can be obtained by installing pkg:/system/manual
     done.
            Done: Installation completed in 491.340 seconds.
      Next Steps: Boot the zone, then log into the zone console (zlogin -C)
                  to complete the configuration process.
    Log saved in non-global zone as /zones/repo/root/var/log/zones/zoneadm.20160414T205521Z.repo.install
    global#

Once the Zone is installed, boot it, then either in the same session or a different one, log into the Zone's console. I tend to have two sessions open for this.

The URL "http://mylocalrepository/" you see in the "Installing packages from" section above points to the existing Repository I use for my systems in general. I need a Repository to be able to install Solaris Zones, so I am creating a new Repository Zone by accessing an existing Repository elsewhere. This could be "pkg.oracle.com/solaris/support" if your system is already set up with access to the Oracle hosted Repository with authorization as an Oracle customer, or it could be another Repository on your network.

    global# zoneadm -z repo boot
    global#

In my second session, I don't need to be root.

    admin@global$ pfexec zlogin -C repo
    [Connected to zone 'repo' console]
    [NOTICE: Zone booting up]
    SunOS Release 5.11 Version 11.3 64-bit
    Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
    Loading smf(5) service descriptions: 126/126
    Hostname: repo
    Apr 14 17:05:32 repo sendmail[982]: My unqualified host name (repo) unknown; sleeping for retry
    Apr 14 17:05:32 repo sendmail[984]: My unqualified host name (repo) unknown; sleeping for retry
    repo console login: guest
    Password:
    Oracle Corporation      SunOS 5.11      11.3    February 2016
    guest@repo:~$
    guest@repo:~$ ipadm show-addr
    ADDROBJ           TYPE     STATE        ADDR
    lo0/v4            static   ok           127.0.0.1/8
    net0/v4           static   ok           192.168.1.101/24
    lo0/v6            static   ok           ::1/128
    net0/v6           addrconf ok           fe80::8:20ff:fef1:7702/10
    guest@repo:~$

Once I know the Zone is running, I tend to go into the Zone without going through the console. It is quicker, as I don't need to authenticate and I am already root.

    guest@global:~$ pfexec zlogin repo
    [Connected to zone 'repo' pts/4]
    Last login: Thu Apr 14 17:07:52 2016 on pts/3
    Oracle Corporation      SunOS 5.11      11.3    February 2016
    root@repo:~#

One change I make is to add a .local alias for the host to /etc/hosts to avoid sendmail noise. This won't be necessary if DNS is configured or if sendmail is disabled using svcadm(1M).

    root@repo:~# cat /etc/hosts
    #
    # Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
    # Use is subject to license terms.
    #
    # Internet host table
    #
    ::1             localhost
    127.0.0.1       localhost loghost
    192.168.1.101   repo repo.local
    root@repo:~#

Now do the same for the "ai" Zone, and you are ready for the first step.

Appendix A: Creating a ZFS File System for Zones

If you have not created Solaris Zones before on the system, you may find it helpful to create a ZFS file system for the Zones. When the zonepath is in a ZFS file system, a new file system will be created for the Zone. This allows rapid Zone creation through Zone cloning, which under the covers uses ZFS cloning. I don't show that here; however, I almost always go through these steps anyway.

    global# zfs create -o mountpoint=/zones rpool/zones

This creates a new ZFS file system, rpool/zones, in the root pool and creates the mountpoint "/zones". Then the zonepath can be as simple as "/zones/zonename".

If you have additional ZFS pool(s) and are using one of them, replace "rpool" with the name of the pool you are using.

Revision History
(Other than minor typographical changes)
2016.10.26: Posted
2016.04.29: Created
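
The per-Zone differences shown in the diff output earlier in this post (zonepath, nodename, and static address) are small enough to script. What follows is only a minimal sketch, assuming repo.cfg and repo_sc_profile.xml serve as templates in the current directory and that nothing else differs between Zones. The function name derive_zone_files and its argument order are hypothetical and not part of any Solaris tool.

```shell
#!/bin/sh
# Hypothetical helper: derive <zone>.cfg and <zone>_sc_profile.xml from the
# repo.cfg / repo_sc_profile.xml templates. Assumes only the zonepath,
# nodename, and static IPv4 address differ between Zones.
derive_zone_files() {
    zone=$1      # new Zone name, e.g. ai
    addr=$2      # new static address with prefix, e.g. 192.168.1.102/24

    # Zone configuration: only the zonepath line changes.
    sed "s|^set zonepath=/zones/repo\$|set zonepath=/zones/${zone}|" \
        repo.cfg > "${zone}.cfg"

    # System configuration profile: nodename and static_address change.
    sed -e "s|name=\"nodename\" value=\"repo\"|name=\"nodename\" value=\"${zone}\"|" \
        -e "s|name=\"static_address\" value=\"[^\"]*\"|name=\"static_address\" value=\"${addr}\"|" \
        repo_sc_profile.xml > "${zone}_sc_profile.xml"
}
```

Running "derive_zone_files ai 192.168.1.102/24" and then diffing the result against the repo files is a quick way to confirm that only the intended lines changed before handing the files to zonecfg and zoneadm.
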


Solaris

Repo AI UAR: Step 1: Create a Solaris 11 Repository

Overview

In this section I will create a Solaris Repository and populate it with the Release and the Support Repository Update (SRU) bits. While the examples here show only Solaris 11.3, a single Repository can hold all releases of Solaris and all SRUs.

The recommendation is to put all SRUs into the repository, not just the ones you believe you will install. While this consumes more space, it may avoid problems later.

I have built a repository adding SRUs out of order. While I haven't fully compared that to one created in order, a Repository verify was successful.

Create the Initial Repository

Log into the system where you will build the Repository. In my case I am doing a "zlogin repo".

I add a hostname.local to my /etc/hosts file to suppress sendmail messages.

    root@repo:~# cat /etc/hosts
    #
    # Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
    # Use is subject to license terms.
    #
    # Internet host table
    #
    ::1             localhost
    127.0.0.1       localhost loghost
    192.168.1.101   repo repo.local
    root@repo:~#

My first step is to create a new ZFS file system where I put the Repository. This allows me to snapshot, ZFS send, or perform any other ZFS operation on the Repository separately from the other file systems.

    root@repo:~# zfs create -o mountpoint=/repo rpool/repo
    root@repo:~# zfs create rpool/repo/solaris
    root@repo:~# zfs set atime=off rpool/repo/solaris
    root@repo:~#

I followed the Solaris documentation to turn off access time, to reduce updates to metadata in ZFS.

Because I have downloaded the GA Solaris 11.3 repository image and all SRU repository update images to my Global Zone, I can mount those into my repo Zone. This step is completely optional.
You must get the repo files into your OS somehow.

    global# mount -F lofs -o ro /export/iso/s11.3/ /zones/repo/root/mnt/

    root@repo:~# ls /mnt
    buildnumber  ga  sru02.4  sru03.6  sru04.5  sru05.6  sru06.5
    root@repo:~#

The Solaris 11.3 base download files include a full Repository. This is pretty large, and thus split into five files. Separately, there is an assembly script (install-repo.ksh) to download. I always use the one included with the Repository images.

I make a point of verifying the initial installation using the "-v" option. The verification takes a long time. By long I mean 30-60 minutes on a 3GHz x86 system with mirrored 7200RPM 2TB SATA disks. I will also do this after every few SRUs.

    root@repo:/mnt/ga/zipped-repo# ./install-repo.ksh -d /repo/solaris/ -v
    Using sol-11_3_1_5_0-repo download.
    Uncompressing sol-11_3_1_5_0-repo_1of5.zip...done.
    Uncompressing sol-11_3_1_5_0-repo_2of5.zip...done.
    Uncompressing sol-11_3_1_5_0-repo_3of5.zip...done.
    Uncompressing sol-11_3_1_5_0-repo_4of5.zip...done.
    Uncompressing sol-11_3_1_5_0-repo_5of5.zip...done.
    Repository can be found in /repo/solaris/.
    Initiating repository verification.
    root@repo:/mnt/ga/zipped-repo#

Populate the Repository with Monthly Support Repository Updates (SRUs)

Once the base Repository is in place, it is an iterative process to apply each SRU. In my case I have separate directories for each SRU, and I just walk through each one. To save time, I don't verify after each one.

    root@repo:/mnt/ga/zipped-repo#
    root@repo:/mnt/ga/zipped-repo# cd /mnt/sru02.4/zipped-repo/
    root@repo:/mnt/sru02.4/zipped-repo#
    root@repo:/mnt/sru02.4/zipped-repo# ./install-repo.ksh -d /repo/solaris/
    Using sol-11_3_2_4_0-incr-repo download.
    IPS repository exists at destination /repo/solaris/
    Current version: 0.175.3.1.0.5.0
    Do you want to add to this repository? (y/n)[n]: y
    Uncompressing sol-11_3_2_4_0-incr-repo.zip...done.
    Repository can be found in /repo/solaris/.
    Initiating repository rebuild.
    root@repo:/mnt/sru02.4/zipped-repo#

You may have noticed that I am prompted to confirm that I want to add to an existing Repository. This is my check that I have the path to the Repository correct.

For the next SRUs, I will override that check with the "-y" option.

    root@repo:/mnt/sru02.4/zipped-repo# cd /mnt/sru03.6/zipped-repo/
    root@repo:/mnt/sru03.6/zipped-repo#
    root@repo:/mnt/sru03.6/zipped-repo# ./install-repo.ksh -d /repo/solaris/ -y
    Using sol-11_3_3_6_0-incr-repo download.
    IPS repository exists at destination /repo/solaris/
    Current version: 0.175.3.2.0.4.0
    Adding packages to existing repository.
    Uncompressing sol-11_3_3_6_0-incr-repo_1of2.zip...done.
    Uncompressing sol-11_3_3_6_0-incr-repo_2of2.zip...done.
    Repository can be found in /repo/solaris/.
    Initiating repository rebuild.
    root@repo:/mnt/sru03.6/zipped-repo#

The current SRU level for Solaris 11.3 at this writing is 6.5; however, I will stop at 5.6 to be able to show the effects of updating both the repository and an installed image.

My system is running Solaris 11.3 SRU 6.5. This highlights that a Repository and its contents are completely independent of what the system itself is running.

    root@repo:/mnt/sru05.6/zipped-repo# pkg list entire
    NAME (PUBLISHER)        VERSION                    IFO
    entire                  0.5.11-0.175.3.6.0.5.0     i--
    root@repo:/mnt/sru05.6/zipped-repo# pkg info entire
              Name: entire
           Summary: entire incorporation including Support Repository Update (Oracle Solaris 11.3.6.5.0).
       Description: This package constrains system package versions to the same
                    build. WARNING: Proper system update and correct package
                    selection depend on the presence of this incorporation.
                    Removing this package will result in an unsupported system.
                    For more information see:
                    https://support.oracle.com/rs?type=doc&id=2045311.1
          Category: Meta Packages/Incorporations
             State: Installed
         Publisher: solaris
           Version: 0.5.11 (Oracle Solaris 11.3.6.5.0)
     Build Release: 5.11
            Branch: 0.175.3.6.0.5.0
    Packaging Date: March 9, 2016 10:05:55 PM
              Size: 5.46 kB
              FMRI: pkg://solaris/entire@0.5.11,5.11-0.175.3.6.0.5.0:20160309T220555Z
    root@repo:/mnt/sru05.6/zipped-repo#

Create the Repository Server

I have installed GA and four SRUs into the repository, yet I have not really created a repository server.

    root@repo:~# pkg search -s http://192.168.1.101 entire
    pkg: Some repositories failed to respond appropriately:
    http://10.143.156.62:
    Unable to contact valid package repository
    Encountered the following error(s):
    Unable to contact any configured publishers.
    This is likely a network configuration problem.
    Framework error: code: 7 reason: Failed to connect to 192.168.1.101 port 80: Connection refused
    URL: 'http://192.168.1.101' (happened 4 times)
    root@repo:~#

A quick check of what the system is listening on shows no port 80.

    root@repo:~# netstat -anf inet
    ...
    TCP: IPv4
       Local Address        Remote Address    Swind  Send-Q  Rwind  Recv-Q    State
    -------------------- -------------------- ------- ------ ------- ------ -----------
    127.0.0.1.5999             *.*                  0      0  128000      0 LISTEN
          *.111                *.*                  0      0  128000      0 LISTEN
          *.*                  *.*                  0      0  128000      0 IDLE
          *.111                *.*                  0      0  128000      0 LISTEN
          *.*                  *.*                  0      0  128000      0 IDLE
          *.22                 *.*                  0      0  128000      0 LISTEN
          *.22                 *.*                  0      0  128000      0 LISTEN
    127.0.0.1.4999             *.*                  0      0  128000      0 LISTEN
    127.0.0.1.25               *.*                  0      0  128000      0 LISTEN
    127.0.0.1.587              *.*                  0      0  128000      0 LISTEN
    root@repo:~#

I need to set up some base configuration details to run the server.
I am calling this instance "solaris" and setting the location of the Repository.

    root@repo:~# svccfg -s pkg/server add solaris
    root@repo:~# svccfg -s pkg/server:solaris setprop pkg/inst_root=/repo/solaris/
    root@repo:~# svcadm refresh pkg/server:solaris
    root@repo:~# svcadm enable pkg/server:solaris
    root@repo:~# svcs server:solaris
    STATE          STIME    FMRI
    online          8:10:50 svc:/application/pkg/server:solaris
    root@repo:~#
    root@repo:~# netstat -anf inet
    ...
    TCP: IPv4
       Local Address        Remote Address    Swind  Send-Q  Rwind  Recv-Q    State
    -------------------- -------------------- ------- ------ ------- ------ -----------
    127.0.0.1.5999             *.*                  0      0  128000      0 LISTEN
          *.111                *.*                  0      0  128000      0 LISTEN
          *.*                  *.*                  0      0  128000      0 IDLE
          *.111                *.*                  0      0  128000      0 LISTEN
          *.*                  *.*                  0      0  128000      0 IDLE
          *.22                 *.*                  0      0  128000      0 LISTEN
          *.22                 *.*                  0      0  128000      0 LISTEN
    127.0.0.1.4999             *.*                  0      0  128000      0 LISTEN
    127.0.0.1.25               *.*                  0      0  128000      0 LISTEN
    127.0.0.1.587              *.*                  0      0  128000      0 LISTEN
          *.80                 *.*                  0      0  128000      0 LISTEN
    127.0.0.1.32945      127.0.0.1.80          130880      0  139264      0 TIME_WAIT
    root@repo:~#

After enabling the service with svcadm, I can see that it is running and that something is listening on the default port 80.
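
Scanning netstat output by eye gets tedious when repeating this check, so it can be wrapped in a small filter. This is only a sketch: port_listening is a hypothetical helper, and it assumes the netstat layout shown in this post, where the local address (for example "*.80") is the first field and the state is the last.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given TCP port appears in LISTEN state.
# Reads `netstat -an -f inet` style output on stdin, where the local address
# is the first field (for example "*.80") and the state is the last field.
port_listening() {
    awk -v port="$1" '
        $NF == "LISTEN" && $1 ~ ("[.]" port "$") { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}
```

A possible use is "netstat -anf inet | port_listening 80 && echo listening". Matching on the dot before the port number keeps port 80 from matching an address such as 127.0.0.1.5980.
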
The default port is one reason I like to run the server in a Zone, as it won't conflict with any other HTTP server that may be running on my system.

Now if I search again, success.

    root@repo:~# pkg search -s http://192.168.1.101 entire
    INDEX           ACTION VALUE PACKAGE
    require         depend pkg://solaris/entire@0.5.11,5.11-0.175.3.4.0.5.0 pkg:/support/critical-patch-update/solaris-11-cpu@2016.1-1
    require         depend pkg://solaris/entire@0.5.11,5.11-0.175.3.2.0.4.0 pkg:/support/critical-patch-update/solaris-11-cpu@2015.11-1
    require         depend pkg://solaris/entire@0.5.11,5.11-0.175.3.5.0.6.0 pkg:/support/critical-patch-update/solaris-11-cpu@2016.2-1
    require         depend pkg://solaris/entire@0.5.11,5.11-0.175.3.3.0.6.0 pkg:/support/critical-patch-update/solaris-11-cpu@2015.12-1
    pkg.description set    Provides for power management support of the entire operating system, including the configuration of the maximum time allowed to reach both minimum and full capacity, and whether or not to permit system suspend and resume if the platform supports it. pkg:/system/kernel/power@0.5.11-0.175.3.0.0.30.0
    pkg.description set    pixz compresses and decompresses files using multiple processors. If the input looks like a tar(1) archive, it also creates an index of all the files in the archive. This allows the extraction of only a small segment of the tarball, without needing to decompress the entire archive. pkg:/compress/pixz@1.0-0.175.3.0.0.30.0
    root@repo:~#

So my initial Repository setup is complete, and I can see I have four SRUs installed. Later I may add another SRU.

Appendix A: Verifying the Repository Manually

In my introduction to this section I mention manually verifying the repository. The step is simple.

    root@repo:~# pkgrepo -s /repo/solaris/repo verify
    Initiating repository verification.
    Scanning repository (this could take some time)

The verification can take a long time, possibly half an hour or longer, depending on the speed of your system.

Revision History
(Other than minor typographical changes)
2016.10.26: Posted
2016.04.21: Created
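
Walking through each SRU directory by hand, as done earlier in this post, can also be written as a loop. The following is a sketch rather than a tested procedure: apply_srus is a hypothetical function name, and it assumes the directory layout shown above (sruNN.N/zipped-repo/ containing install-repo.ksh) with zero-padded SRU names, so that the shell's sorted glob expansion matches release order.

```shell
#!/bin/sh
# Sketch: apply every SRU under a base directory to an existing repository,
# in name order. Assumes the layout <base>/sruNN.N/zipped-repo/ with
# install-repo.ksh inside, and zero-padded SRU directory names.
apply_srus() {
    base=$1      # e.g. /mnt
    repo=$2      # e.g. /repo/solaris/

    for dir in "$base"/sru*/zipped-repo; do
        # Skip anything that does not carry its own assembly script.
        [ -x "$dir/install-repo.ksh" ] || continue
        echo "Applying $dir"
        # -y skips the "add to this repository?" prompt.
        (cd "$dir" && ./install-repo.ksh -d "$repo" -y) || return 1
    done
}
```

Because -y suppresses the confirmation prompt, it is worth double-checking the repository path before running something like "apply_srus /mnt /repo/solaris/". A manual verify afterwards, as described in Appendix A, remains a sensible follow-up.
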


Solaris

Repo AI UAR: Step 2: Create a Solaris 11 Automated Install Server

Overview

The automated way of installing a Solaris Global or Non-Global Zone is the Automated Installer, which replaces Jumpstart in Solaris 10 and prior. You can use AI with the repositories Oracle provides at http://pkg.oracle.com or you can use your own. The previous section demonstrates the steps to create your own, and that is what this section will use.

In keeping with my practice of using Zones for this, I am using the ai Zone I created in Step 0. While recreating a Zone is pretty easy, if this were a system (Global Zone) that may not be so easy. Thus I like to create a new Boot Environment just in case I make a critical mistake.

    root@ai:/mnt/sru06.5# beadm create initial-ai
    root@ai:/mnt/sru06.5# beadm activate initial-ai
    root@ai:/mnt/sru06.5# reboot
    [Connection to zone 'ai' pts/6 closed]
    global#

While the Zone is rebooting, since my downloads are stored in the Global Zone, I will make the necessary ISO accessible from there.

    global# mount -F lofs -o ro /export/iso/s11.3 /zones/ai/root/mnt/
    global#

This gives me read-only access in the Zone at mount point "/mnt".

Back in the Zone, I'll validate I am in my new BE.

    root@ai:~# beadm list
    BE         Flags Mountpoint Space   Policy Created
    --         ----- ---------- -----   ------ -------
    initial-ai NR    /          835.24M static 2016-04-14 19:20
    solaris    -     -          211.27M static 2016-04-14 17:47
    root@ai:~#

The packages needed to be an AI server are not included in the default installs, so I need to install them.

    root@ai:~# pkg install installadm
               Packages to install: 10
               Mediators to change:  3
                Services to change:  2
           Create boot environment: No
    Create backup boot environment: No
    DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
    Completed                              10/10     1220/1220    12.8/12.8  1.4M/s
    PHASE                                          ITEMS
    Installing new actions                     1561/1561
    Updating package state database                 Done
    Updating package cache                           0/0
    Updating image state                            Done
    Creating fast lookup database                   Done
    Updating package cache                           1/1
    root@ai:~#

There are images that Oracle provides specifically for AI installations. I use the latest one available, since I know it will support older versions of Solaris, should I need that.

    root@ai:~# cd /mnt/sru06.5/
    root@ai:/mnt/sru06.5# ls
    fileRgood                         sol-11_3_6_5_0-fallback_boot-sparc.pkg
    md5sum-repo-iso-11_3_6_5_0.txt    sol-11_3_6_5_0-incr-repo.iso
    md5sums.txt                       sol-11_3_6_5_0-text-sparc.iso
    README.11.3.6                     sol-11_3_6_5_0-text-x86.iso
    sol-11_3_6_5_0-ai-sparc.iso       zipped-repo
    sol-11_3_6_5_0-ai-x86.iso
    root@ai:/mnt/sru06.5#

The tool to create and manage the AI server is installadm(1M). With it I can create services and manifests, define profiles, set clients, and perform other operations. The first step is to create a new service. Note that this is specific to the instruction set. Here I am creating one for SPARC systems. If I also want to install x86 systems, I would do this again with the AI ISO for x86.

Because I am on a shared network, I avoid things such as DHCP or Multicast DNS.

    root@ai:/mnt/sru06.5# installadm create-service -s ./sol-11_3_6_5_0-ai-sparc.iso -y
      0% : Service svc:/network/dns/multicast:default is not online. Installation services will not be advertised via multicast DNS.
      0% : Creating service from: /mnt/sru06.5/sol-11_3_6_5_0-ai-sparc.iso
     10% : Transferring contents
     10% : Creating sparc service: solaris11_3_6-sparc_1
     10% : Image path: /export/auto_install/solaris11_3_6-sparc_1
     10% : Setting "solaris" publisher URL in default manifest to:
     10% : http://pkg.oracle.com/solaris/release/
     10% : DHCP is not being managed by install server.
     10% : SMF Service 'svc:/system/install/server:default' will be enabled
     10% : SMF Service 'svc:/network/tftp/udp6:default' will be enabled
     10% : Creating default-sparc alias
     10% : Setting "solaris" publisher URL in default manifest to:
     10% : http://pkg.oracle.com/solaris/release/
     10% : DHCP is not being managed by install server.
     10% : No local DHCP configuration found. This service is the default
     10% : alias for all SPARC clients. If not already in place, the following should
     10% : be added to the DHCP configuration:
     10% : Boot file: http://192.168.1.102:5555/cgi-bin/wanboot-cgi
     10% : SMF Service 'svc:/system/install/server:default' will be enabled
     10% : SMF Service 'svc:/network/tftp/udp6:default' will be enabled
    100% : Created Service: 'solaris11_3_6-sparc_1'
    100% : Refreshing SMF service svc:/network/tftp/udp6:default
    100% : Refreshing SMF service svc:/system/install/server:default
    100% : Enabling SMF service svc:/system/install/server:default
    100% : Enabling SMF service svc:/network/tftp/udp6:default
    100% : Warning: mDNS registry of service 'solaris11_3_6-sparc_1' could not be verified.
    100% : Warning: mDNS registry of service 'default-sparc' could not be verified.
    root@ai:/mnt/sru06.5#

A quick check that my service did install correctly. Yes, it did.

    root@ai:/mnt/sru06.5# svcs install/server
    STATE          STIME    FMRI
    online         19:55:28 svc:/system/install/server:default
    root@ai:/mnt/sru06.5#

Let's see what was installed. I see two Services, one of which is an alias of the other. For all subsequent steps I will be using the "default-sparc" service even though it is an alias. If I were to use a different version of an AI ISO, I could then choose which one is the default.

    root@ai:/mnt/sru06.5# installadm list
    Service Name          Status Arch  Type Secure Alias Aliases Clients Profiles Manifests
    ------------          ------ ----  ---- ------ ----- ------- ------- -------- ---------
    default-sparc         on     sparc iso  no     yes   0       0       0        1
    solaris11_3_6-sparc_1 on     sparc iso  no     no    1       0       0        1
    root@ai:/mnt/sru06.5#

That is the minimum required to set up an AI server. How about we try it out?

Because I am not using DHCP, the first step is to set up WANboot variables in OpenBoot PROM.
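
The value stored in the OBP variable network-boot-arguments is a single comma-separated string, which is easy to mistype at the ok prompt. As a minimal sketch, the string can be composed on another machine and pasted in. The function wanboot_args is hypothetical; the keys (host-ip, router-ip, subnet-mask, hostname, file) are the ones used in this step, and the port and path come from the "Boot file" line installadm printed.

```shell
#!/bin/sh
# Hypothetical helper: compose the OBP network-boot-arguments value for one
# client. Port 5555 and /cgi-bin/wanboot-cgi match the "Boot file" URL that
# installadm create-service reported for this AI server.
wanboot_args() {
    host_ip=$1; router_ip=$2; netmask=$3; hostname=$4; ai_server=$5
    printf 'host-ip=%s,router-ip=%s,subnet-mask=%s,hostname=%s,file=http://%s:5555/cgi-bin/wanboot-cgi\n' \
        "$host_ip" "$router_ip" "$netmask" "$hostname" "$ai_server"
}
```

For example, "wanboot_args 192.168.1.6 192.168.1.1 255.255.255.0 host1 192.168.1.102" prints the string to give to setenv network-boot-arguments. The helper does no validation, so checking that the host and router addresses are on the same subnet is still up to you.
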
WANboot has been supported for many years, and the Solaris Automated Installer uses those OBP variables to set the local host parameters, including the network, the gateway, and which AI server to load from.

At the expense of making this entry a bit longer, I am including full output the first time for each operation. Later on I will truncate it.

    {0} ok printenv network-boot-arguments
    network-boot-arguments =
    {0} ok
    {0} ok setenv network-boot-arguments host-ip=192.168.1.6,router-ip=192.168.1.1,subnet-mask=255.255.255.0,hostname=host1,file=http://192.168.1.102:5555/cgi-bin/wanboot-cgi
    {0} ok
    {0} ok printenv network-boot-arguments
    network-boot-arguments = host-ip=192.168.1.6,router-ip=192.168.1.1,subnet-mask=255.255.255.0,hostname=host1,file=http://192.168.1.102:5555/cgi-bin/wanboot-cgi
    {0} ok
    NOTICE: Entering OpenBoot.
    NOTICE: Fetching Guest MD from HV.
    NOTICE: Starting additional cpus.
    NOTICE: Initializing LDC services.
    NOTICE: Probing PCI devices.
    NOTICE: Finished PCI probing.
    SPARC T4-1, No Keyboard
    Copyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.
    OpenBoot 4.38.3, 4.0000 GB memory available, Serial #83545684.
    Ethernet address 0:14:4f:fa:ce:54, Host ID: 84face54.
    {0} ok boot net - install
    Boot device: /virtual-devices@100/channel-devices@200/network@1  File and args: - install
    <time unavailable> wanboot info: WAN boot messages->console
    <time unavailable> wanboot info: configuring /virtual-devices@100/channel-devices@200/network@1
    <time unavailable> wanboot progress: wanbootfs: Read 368 of 368 kB (100%)
    <time unavailable> wanboot info: wanbootfs: Download complete
    Fri Apr 15 00:49:32 wanboot progress: miniroot: Read 276976 of 276976 kB (100%)
    Fri Apr 15 00:49:32 wanboot info: miniroot: Download complete
    SunOS Release 5.11 Version 11.3 64-bit
    Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
    Remounting root read/write
    Probing for device nodes ...
    Preparing network image for use
    Downloading solaris.zlib
    curl arguments --insecure for http://ai:5555//export/auto_install/solaris11_3_6-sparc_1//solaris.zlib
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100  243M  100  243M    0     0  67.1M      0  0:00:03  0:00:03 --:--:-- 67.1M
    Downloading solarismisc.zlib
    curl arguments --insecure for http://ai:5555//export/auto_install/solaris11_3_6-sparc_1//solarismisc.zlib
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100 23.1M  100 23.1M    0     0  77.9M      0 --:--:-- --:--:-- --:--:-- 78.4M
    Downloading .image_info
    curl arguments --insecure for http://ai:5555//export/auto_install/solaris11_3_6-sparc_1//.image_info
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100    84  100    84    0     0  18530      0 --:--:-- --:--:-- --:--:-- 42000
    Done mounting image
    Configuring devices.
    Hostname: host1
    Service discovery phase initiated
    Service name to look up: default-sparc
    Service discovery over multicast DNS failed
    Service default-sparc located at 192.168.1.102:5555 will be used
    Service discovery finished successfully
    Process of obtaining install manifest initiated
    Using the install manifest obtained via service discovery
    host1 console login: Automated Installation started
    The progress of the Automated Installation will be output to the console
    Detailed logging is in the logfile at /system/volatile/install_log
    Press RETURN to get a login prompt at any time.
    00:46:54 Install Log: /system/volatile/install_log
    00:46:54 Using Derived Script: /system/volatile/ai.xml
    00:46:54 Using profile specification: /system/volatile/profile
    00:46:54 Using service list file: /var/run/service_list
    00:46:54 Starting installation.
    00:46:54 Deriving manifest from: /system/volatile/ai.xml
    00:46:54 Derived /system/volatile/ai.xml stored
    00:46:54 Registering Derived Manifest Module Checkpoint
    00:46:54 0% Preparing for Installation
    00:46:54 Derived Manifest Module: Creating/modifying manifest at "/system/volatile/manifest.xml"
    00:46:54 2% Preparing for Installation
    00:46:54 Derived Manifest Module: Script to run: /system/volatile/ai.xml
    00:46:54 Derived Manifest Module: script mode is 0755, uid is 0, gid is 0
    00:46:55 Derived Manifest Module: Script validated. Running in subprocess...
    00:46:55 Derived Manifest Module: No previous manifest at /system/volatile/manifest.xml exists.
    00:46:55 Derived Manifest Module: Creating empty manifest with (uid,gid) = (61,61)
    00:46:56 Derived Manifest Module: script output follows:
    00:46:57 > Solaris consolidation manifest entry is pkg:/entire@0.5.11-0.175.3
    00:46:57 >
    00:46:57 Derived Manifest Module: aimanifest logfile output follows:
    00:46:57 >> 00:46:56: aimanifest: INFO: command:load, incremental:False, file:/tmp/default.xml.1ba4Ib
    00:46:57 >> 00:46:56: aimanifest: INFO: cmd:success, validation:Pass
    00:46:58 >> 00:46:56: aimanifest: INFO: File: /tmp/default.xml.1ba4Ib
    00:46:58 >> 00:46:57: aimanifest: INFO: command:get, path:/auto_install/ai_instance/software[@type=IPS]/software_data[@action=install]/name
    00:46:58 >> 00:46:57: aimanifest: INFO: successful: returns value:pkg:/entire@0.5.11-0.175.3, path:/auto_install[1]/ai_instance[1]/software[1]/software_data[1]/name[1]
    00:46:58 >> 00:46:57: aimanifest: INFO: successful: returns value:pkg:/group/system/solaris-large-server, path:/auto_install[1]/ai_instance[1]/software[1]/software_data[1]/name[2]
    00:46:58 Derived Manifest Module: script completed successfully
    00:46:58 Derived Manifest Module: Manifest header refers to no DTD.
    00:46:58 Derived Manifest Module: Validating against DTD: /usr/share/install/ai.dtd.1
    00:46:58 Derived Manifest Module: XML validation completed successfully
    00:46:58 84% derived-manifest completed.
    00:46:58 100% manifest-parser completed.
    00:46:58 100% None
    00:46:58 DM set manifest to: /system/volatile/manifest.xml
    00:46:58 0% Preparing for Installation
    00:46:59 1% Preparing for Installation
    00:46:59 2% Preparing for Installation
    00:46:59 3% Preparing for Installation
    00:46:59 4% Preparing for Installation
    00:46:59 8% install-env-configuration completed.
    00:47:01 24% target-discovery completed.
    00:47:01 Selected Disk(s) : c1d0
    00:47:01 30% target-selection completed.
    00:47:01 33% ai-configuration completed.
    00:47:01 35% var-share-dataset completed.
    00:47:11 36% target-instantiation completed.
    00:47:12 36% Beginning IPS transfer
    00:47:12 Creating IPS image
    00:47:12 Error occurred during execution of 'generated-transfer-775-1' checkpoint.
    00:47:12 100% None
    00:47:12 Failed Checkpoints:
    00:47:12
    00:47:12 generated-transfer-775-1
    00:47:12
    00:47:12 Checkpoint execution error:
    00:47:12
    00:47:12 Framework error: code: 6 reason: Couldn't resolve host 'pkg.oracle.com'
    00:47:12 URL: 'http://pkg.oracle.com/solaris/release/versions/0/'
    00:47:12
    00:47:12 Automated Installation Failed. See install log at /system/volatile/install_log
    Automated Installation failed
    Please refer to the /system/volatile/install_log file for details
    Apr 15 00:47:12 host1 svc.startd[11]: application/auto-installer:default failed fatally: transitioned to maintenance (see 'svcs -xv' for details)
    SUNW-MSG-ID: SMF-8000-YX, TYPE: defect, VER: 1, SEVERITY: major
    EVENT-TIME: Fri Apr 15 00:51:39 UTC 2016
    PLATFORM: SPARC-T4-1, CSN: unknown, HOSTNAME: host1
    SOURCE: software-diagnosis, REV: 0.1
    EVENT-ID: e3474fb5-8f23-4b64-9f87-fcd32a4e871c
    DESC: A service failed - a start, stop or refresh method failed.
    AUTO-RESPONSE: The service has been placed into the maintenance state.
    IMPACT: svc:/application/auto-installer:default is unavailable.
    REC-ACTION: Run 'svcs -xv svc:/application/auto-installer:default' to determine the generic reason why the service failed, the location of any logfiles, and a list of other services impacted.
Please refer to the associated reference document at http://support.oracle.com/msg/SMF-8000-YX for the latest service procedures and policies regarding this diagnosis.
host1 console login: 

The installation failed. I could log into the install image session and diagnose this; the login/password combination of root/solaris works. In this case the problem is clear: AI needs to access a Solaris repository, and the one in the Install Manifest points to the Solaris 11.3 GA image at pkg.oracle.com. (It is the pkg.oracle.com URL in the error output above.) I have to modify the manifest to point to my repository.

This requires setting up an installation Manifest. To hold all my files, I create a ZFS file system. There I will create a directory for the manifests. Later I will need a second directory, so I am preparing for that now.

root@ai:/mnt/sru06.5# cd
root@ai:~# 
root@ai:~# zfs create rpool/export/ai
root@ai:~# cd /export/ai
root@ai:/export/ai# mkdir manifests
root@ai:/export/ai# cd manifests
root@ai:/export/ai/manifests# 

First, here are some common "what is my current state" commands.

root@ai:/export/ai/manifests# installadm list
Service Name          Status Arch  Type Secure Alias Aliases Clients Profiles Manifests
------------          ------ ----  ---- ------ ----- ------- ------- -------- ---------
default-sparc         on     sparc iso  no     yes   0       0       0        1
solaris11_3_6-sparc_1 on     sparc iso  no     no    1       0       0        1
root@ai:/export/ai/manifests# 
root@ai:/export/ai/manifests# installadm list -m
Service Name          Manifest Name Type    Status  Criteria
------------          ------------- ----    ------  --------
default-sparc         orig_default  derived default none
solaris11_3_6-sparc_1 orig_default  derived default none
root@ai:/export/ai/manifests# 

There is a default manifest, so I extract that using the export subcommand. I use a name to remind me it is the default original in the system. 
I will show you what it looks like here as well.root@ai:/export/ai/manifests# installadm export -n default-sparc -m orig_default -o ./orig_default.xmlroot@ai:/export/ai/manifests# root@ai:/export/ai/manifests# cat orig_default.xml #!/bin/ksh93## Copyright (c) 2013, 2014, Oracle and/or its affiliates. All rights reserved.# The default AI manifest is a Derived Manifest (DM) script. The# script creates a temp file from xml within the script, loads that# file in as the manifest, then replaces the release of Solaris to# install with the same release as that running on the AI client.SCRIPT_SUCCESS=0SOLPKG="entire"TMPFILE=`/usr/bin/mktemp /tmp/default.xml.XXXXXX`## create_xml_file# Create xml tmp file from here document. The contents of the#here document are inserted during installadm create-service.#function create_xml_file{# Create xml tmp filecat <<- EOF > $TMPFILE<!DOCTYPE auto_install SYSTEM "file:///usr/share/install/ai.dtd.1"><auto_install> <ai_instance name="default"> <target> <logical> <zpool name="rpool" is_root="true"> <!-- Subsequent <filesystem> entries instruct an installer to create following ZFS datasets: <root_pool>/export (mounted on /export) <root_pool>/export/home (mounted on /export/home) Those datasets are part of the standard environment and should always be created. In rare cases, if there is a need to deploy an installed system without these datasets, either comment out or remove <filesystem> entries. In such scenario, it has to be also assured that in case of non-interactive post-install configuration, creation of initial user account is disabled in related system configuration profile. Otherwise the installed system would fail to boot. 
--> <filesystem name="export" mountpoint="/export"/> <filesystem name="export/home"/> <be name="solaris"/> </zpool> </logical> </target> <software type="IPS"> <destination> <image> <!-- Specify locales to install --> <facet set="false">facet.locale.*</facet> <facet set="true">facet.locale.de</facet> <facet set="true">facet.locale.de_DE</facet> <facet set="true">facet.locale.en</facet> <facet set="true">facet.locale.en_US</facet> <facet set="true">facet.locale.es</facet> <facet set="true">facet.locale.es_ES</facet> <facet set="true">facet.locale.fr</facet> <facet set="true">facet.locale.fr_FR</facet> <facet set="true">facet.locale.it</facet> <facet set="true">facet.locale.it_IT</facet> <facet set="true">facet.locale.ja</facet> <facet set="true">facet.locale.ja_*</facet> <facet set="true">facet.locale.ko</facet> <facet set="true">facet.locale.ko_*</facet> <facet set="true">facet.locale.pt</facet> <facet set="true">facet.locale.pt_BR</facet> <facet set="true">facet.locale.zh</facet> <facet set="true">facet.locale.zh_CN</facet> <facet set="true">facet.locale.zh_TW</facet> </image> </destination> <source> <!-- By default, IPS packages will be installed from publisher "solaris" located at the URI specified by the origin name below. You may specify a different IPS repository to install from by changing the origin name and/or publisher name. You may also specify multiple publishers. --> <publisher name="solaris"> <origin name="http://pkg.oracle.com/solaris/release/"/> </publisher> </source> <!-- The version specified by the "entire" package below, is installed from the specified IPS repository. 
If another version is required, the 'entire' package should be specified in the following form: <name>pkg:/entire@0.5.11,5.11-0.175.update.sru.platform.build.rev</name> For instance, to specify a particular build of S11.3, the following should be used: <name>pkg:/entire@0.5.11,5.11-0.175.3.0.0.build</name> --> <software_data action="install"> <name>pkg:/entire@0.5.11-0.175.3</name> <name>pkg:/group/system/solaris-large-server</name> </software_data> </software> </ai_instance></auto_install> EOF}## error_handler# Error handling function#function error_handler{exit $?}## load_xml# Load the default manifest from previously created tmp file#function load_xml{# load the default manifesttrap error_handler ERR/usr/bin/aimanifest load $TMPFILEtrap - ERR}## update_solaris_version# Update the manifest entry of the Solaris consolidation package#so that the release of Solaris being installed is the same as#that running on the client.#function update_solaris_version{# If $SI_SYSPKG is not set in the environment, then the Solaris# consolidation pkg was not found on the client. Return without# making modifications to the manifest, but not an error.if [ -z ${SI_SYSPKG} ]; thenecho "'${SOLPKG}' package not found on system"echo "Unable to constrain Solaris version being installed."returnfi# Get list of pkgs to install from the manifestpkgs=$(/usr/bin/aimanifest get -r \/auto_install/ai_instance/software[@type="IPS"]/software_data[@action="install"]/name \2> /dev/null)# array will be formatted as: <pkg> <aimanifest path>...array=($(echo ${pkgs} | /usr/bin/nawk 'BEGIN{FS="\n"} {print $NF}'))# Find the Solaris consolidation package manifest entry and change# it to use the version specified by the environment variable,# $SI_SYSPKGidx=0while [ $idx -lt ${#array[@]} ]; dopkgname=${array[$idx]}# check if pkgname is Solaris consolidation packageecho $pkgname | /usr/bin/egrep -s \"(^*/|^)${SOLPKG}($|@[a-zA-Z0-9.,:-]*$)"if [ $? 
-ne 0 ]; then
# look at next pkg
(( idx=idx+2 ))
continue
fi
# Replace Solaris consolidation package entry with DM
# $SI_SYSPKG variable, which resolves to a pkg entry
# that is the same release of Solaris as that running
# on the client
if [ ${pkgname} == ${SI_SYSPKG} ]; then
echo "Solaris consolidation manifest entry is" \
"$SI_SYSPKG"
break
fi
echo "Replacing Solaris consolidation manifest entry" \
"'${pkgname}' with '${SI_SYSPKG}'"
trap error_handler ERR
/usr/bin/aimanifest set ${array[idx+1]} $SI_SYSPKG
trap - ERR
break
done
if [ $idx -ge ${#array[@]} ]; then
echo "Warning: Manifest does not contain package, '$SOLPKG'"
echo "Unable to constrain version of Solaris"
fi
}
########################################
# main
########################################
# Create xml tmp file, then use aimanifest(1M) to load the
# file and update the Solaris version to install.
if [ -z "$TMPFILE" ]; then
 echo "Error: Unable to create temporary manifest file"
 exit 1
fi
create_xml_file
load_xml
update_solaris_version
exit $SCRIPT_SUCCESS
root@ai:/export/ai/manifests# 

This is a long file, which is a shell script wrapped around a template XML file. This is called a derived manifest. I will not go into the details of that here. Trust me: if you leave this as a derived manifest, the customizations you make will not get applied.

I edit it to remove the scripting wrapper. I also change the origin entry to point to my repository at 192.168.1.102. Here you can see the deltas between the original file and my new one. 
I don't show the removal of the script parts.

root@ai:/export/ai/manifests# diff myrepo.xml orig_default.xml 
...
66c89
< <origin name="http://192.168.1.102/"/>
---
> <origin name="http://pkg.oracle.com/solaris/release/"/>
...
root@ai:/export/ai/manifests# 

Just in case I mess up, I create an alternate boot environment so I can revert to this configuration.

root@ai:/export/ai/manifests# beadm create base-ai-no-manifests
root@ai:/export/ai/manifests# 

I load my new manifest into the system, with a name I will recognize ("myrepo"), and make it the default using the -d option.

root@ai:/export/ai/manifests# installadm create-manifest -n default-sparc -m myrepo \
-f ./myrepo.xml -d
Created Manifest: 'myrepo'
root@ai:/export/ai/manifests# 
root@ai:/export/ai/manifests# installadm list 
Service Name          Status Arch  Type Secure Alias Aliases Clients Profiles Manifests
------------          ------ ----  ---- ------ ----- ------- ------- -------- ---------
default-sparc         on     sparc iso  no     yes   0       0       0        2
solaris11_3_6-sparc_1 on     sparc iso  no     no    1       0       0        1
root@ai:/export/ai/manifests# 
root@ai:/export/ai/manifests# installadm list -m
Service Name          Manifest Name Type    Status   Criteria
------------          ------------- ----    ------   --------
default-sparc         myrepo        xml     default  none
                      orig_default  derived inactive none
solaris11_3_6-sparc_1 orig_default  derived default  none
root@ai:/export/ai/manifests# 

You can see a third manifest, and also that it is the default.

Let's try installing the system again. 
{0} ok boot net - installBoot device: /virtual-devices@100/channel-devices@200/network@1 File and args: - install<time unavailable> wanboot info: WAN boot messages->console<time unavailable> wanboot info: configuring /virtual-devices@100/channel-devices@200/network@1<time unavailable> wanboot progress: wanbootfs: Read 368 of 368 kB (100%)<time unavailable> wanboot info: wanbootfs: Download completeFri Apr 15 12:38:56 wanboot progress: miniroot: Read 276976 of 276976 kB (100%)Fri Apr 15 12:38:56 wanboot info: miniroot: Download completeSunOS Release 5.11 Version 11.3 64-bitCopyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.Remounting root read/writeProbing for device nodes ...Preparing network image for use...Done mounting imageConfiguring devices.Hostname: host1Service discovery phase initiatedService name to look up: default-sparcService discovery over multicast DNS failedService default-sparc located at 192.168.1.102:5555 will be usedService discovery finished successfullyProcess of obtaining install manifest initiatedUsing the install manifest obtained via service discoveryhost1 console login: Automated Installation startedThe progress of the Automated Installation will be output to the consoleDetailed logging is in the logfile at /system/volatile/install_logPress RETURN to get a login prompt at any time.12:36:18 Install Log: /system/volatile/install_log12:36:18 Using XML Manifest: /system/volatile/ai.xml12:36:18 Using profile specification: /system/volatile/profile12:36:18 Using service list file: /var/run/service_list12:36:18 Starting installation.12:36:18 0% Preparing for Installation12:36:18 100% manifest-parser completed.12:36:18 100% None12:36:18 0% Preparing for Installation12:36:19 1% Preparing for Installation12:36:19 2% Preparing for Installation12:36:19 3% Preparing for Installation12:36:19 4% Preparing for Installation12:36:20 8% install-env-configuration completed.12:36:21 24% target-discovery completed.12:36:21 Selected 
Disk(s) : c1d012:36:21 30% target-selection completed.12:36:21 33% ai-configuration completed.12:36:21 35% var-share-dataset completed.12:36:30 36% target-instantiation completed.12:36:30 36% Beginning IPS transfer12:36:30 Creating IPS image12:36:36 Startup: Retrieving catalog 'solaris' ... Done12:36:38 Startup: Caching catalogs ... Done12:36:38 Startup: Refreshing catalog 'solaris' ... Done12:36:38 Installing packages from:12:36:38 solaris12:36:39 origin: http://192.168.1.102/ 12:36:39 Startup: Refreshing catalog 'solaris' ... Done12:36:49 Planning: Solver setup ... Done12:36:49 Planning: Running solver ... Done12:36:49 Planning: Finding local manifests ... Done12:36:50 Planning: Fetching manifests: 0/584 0% complete12:37:20 Planning: Fetching manifests: 584/584 100% complete12:37:35 Planning: Package planning ... Done12:37:36 Planning: Merging actions ... Done12:37:41 Planning: Checking for conflicting actions ... Done12:37:45 Planning: Consolidating action changes ... Done12:37:47 Planning: Evaluating mediators ... 
Done12:37:50 Planning: Planning completed in 71.06 seconds12:37:50 The following licenses have been accepted and not displayed.12:37:50 Please review the licenses for the following packages post-install:12:37:50 consolidation/osnet/osnet-incorporation 12:37:50 runtime/java/jre-7 12:37:51 Package licenses may be viewed using the command:12:37:51 pkg info --license <pkg_fmri>12:37:53 Download: 0/86286 items 0.0/790.5MB 0% complete 12:37:58 Download: 566/86286 items 5.5/790.5MB 0% complete (1.1M/s)12:38:03 Download: 1688/86286 items 17.4/790.5MB 2% complete (1.8M/s)...12:44:48 Download: 83211/86286 items 786.1/790.5MB 99% complete (1.1M/s)12:44:53 Download: 84792/86286 items 787.1/790.5MB 99% complete (556k/s)12:44:58 Download: 86123/86286 items 790.1/790.5MB 99% complete (402k/s)12:44:59 Download: Completed 790.54 MB in 425.67 seconds (1.8M/s)12:45:21 Actions: 1/116156 actions (Installing new actions)12:45:26 Actions: 21487/116156 actions (Installing new actions)...12:49:48 Actions: 114057/116156 actions (Installing new actions)12:49:53 Actions: 114252/116156 actions (Installing new actions)12:49:58 Actions: 114746/116156 actions (Installing new actions)12:50:02 Actions: Completed 116156 actions in 281.01 seconds.12:50:03 Done12:50:32 Version mismatch: 12:50:32 Installer build version: pkg://solaris/entire@0.5.11,5.11-0.175.3.6.0.5.0:20160309T220555Z12:50:32 Target build version: pkg://solaris/entire@0.5.11,5.11-0.175.3.5.0.6.0:20160215T165127Z12:50:32 38% generated-transfer-775-1 completed.12:50:32 40% initialize-smf completed.12:50:33 41% update-dump-adm completed.12:50:33 43% setup-swap completed.12:50:33 45% device-config completed.12:50:35 46% apply-sysconfig completed.12:50:35 48% transfer-zpool-cache completed.12:50:45 81% boot-archive completed.12:50:48 Setting boot devices in firmware12:50:48 Setting openprom boot-device12:50:48 88% boot-configuration completed.12:51:06 91% update-filesystem-owner-group completed.12:51:06 92% transfer-ai-files 
completed.12:51:07 100% create-snapshot completed.12:51:07 100% None12:51:07 Automated Installation succeeded.12:51:07 You may wish to reboot the system at this time.Automated Installation finished successfullyThe system can be rebooted nowPlease refer to the /system/volatile/install_log file for detailsAfter reboot it will be located at /var/log/install/install_loghost1 console login: rootPassword: Last login: Fri Apr 15 12:56:00 2016 on consoleOracle Corporation SunOS 5.11 11.3 February 2016root@host1:~# Success!! I can log into the system prior to a reboot, if I want to. At least to reboot.root@host1:~# ipadm show-addrADDROBJ TYPE STATE ADDRnet0/netboot static ok 192.168.1.201/22lo0/v4 static ok 127.0.0.1/8lo0/v6 static ok ::1/128root@host1:~# pkg list entireNAME (PUBLISHER) VERSION IFOentire 0.5.11-0.175.3.6.0.5.0 i--root@host1:~# root@host1:~# rebootApr 15 13:53:42 host1 reboot: initiated by root on /dev/consolesyncing file systems... donerebooting...Resetting...NOTICE: Entering OpenBoot.NOTICE: Fetching Guest MD from HV.NOTICE: Starting additional cpus.NOTICE: Initializing LDC services.NOTICE: Probing PCI devices.NOTICE: Finished PCI probing.SPARC T4-1, No KeyboardCopyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.OpenBoot 4.38.3, 4.0000 GB memory available, Serial #83545684.Ethernet address 0:14:4f:fa:ce:54, Host ID: 84face54.Boot device: /virtual-devices@100/channel-devices@200/disk@0:a File and args: SunOS Release 5.11 Version 11.3 64-bitCopyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.Loading smf(5) service descriptions: 225/225Configuring devices.Exiting System Configuration Tool. Log is available at:/system/volatile/sysconfig/sysconfig.log.319The text screen comes up at this point to ask for system configuration information. I could enter that, however, I am going to re-install shortly anyway, so there is no need. 
I will show how to automatically configure the system a bit later.

I exited out of the System Configuration Tool with option 9 (ESC 9 on my keyboard).

SCI tool did not create SC profile, skipping the rest of interactive configuration. Interactive configuration will resume on next boot.
Apr 15 13:56:25 svc.startd[13]: svc:/milestone/config:default: Method "/lib/svc/method/milestone-config start" failed with exit status 95.
Apr 15 13:56:25 svc.startd[13]: milestone/config:default failed fatally: transitioned to maintenance (see 'svcs -xv' for details)
Requesting System Maintenance Mode
(See /lib/svc/share/README for more information.)
Console login service(s) cannot run
Enter user name for system maintenance (control-d to bypass): 

How about a quick check that my server did install correctly? That requires me to log in as root/solaris.

Enter user name for system maintenance (control-d to bypass): root
Enter root password (control-d to bypass): xxx...xxx
single-user privilege assigned to root on /dev/console.
Entering System Maintenance Mode
Apr 15 13:56:38 su: 'su root' succeeded for root on /dev/console
Oracle Corporation SunOS 5.11 11.3 February 2016
root@unknown:~# 
root@unknown:~# pkg list entire
NAME (PUBLISHER)     VERSION                    IFO
entire               0.5.11-0.175.3.5.0.6.0     i--
root@unknown:~# 
root@unknown:~# ipadm show-addr
ADDROBJ           TYPE     STATE        ADDR
lo0/v4            static   ok           127.0.0.1/8
lo0/v6            static   ok           ::1/128
root@unknown:~# 
root@unknown:~# pkg list |wc -l
585
root@unknown:~# 

The system is installed to SRU 5.6, which is the most recent SRU in the repository, even though the AI image was created with SRU 6.5. Did you notice the "Version mismatch" message towards the end of the installation?

So this shows how to set up the Solaris Automated Installer to deploy Solaris over a network. This works for both bare metal and virtual machine installs, anything where the Global Zone is being installed. An advanced feature is to include Solaris Zones in the initial installation, and another topic might be to use AI to install a Solaris Zone into a system that is already running.

My next step is to do some additional customization of Automated Installation. See you there.

Revision History
(Other than minor typographical changes)
2016.10.26: Posted
2016.04.29: Created
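A footnote on the version strings in the "Version mismatch" message: the SRU number can be read straight out of the entire package version. The function below is only a sketch. The name sru_of is my own, and it assumes the 0.175.update.sru.platform.build.rev layout described in the comment inside the default manifest.

```shell
#!/bin/sh
# Sketch: decode a Solaris 11 "entire" FMRI into a release and SRU label.
# Assumes the branch version layout 0.175.update.sru.platform.build.rev,
# as described in the default AI manifest comment. sru_of is a
# hypothetical helper name, not a Solaris command.
sru_of() {
    branch=${1##*-}       # drop everything up to the last '-'
    branch=${branch%%:*}  # drop the ':timestamp' suffix, if present

    # Split the branch version on dots into positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $branch
    IFS=$old_ifs

    # $3=update, $4=sru, $5=platform, $6=build, $7=rev
    if [ $# -lt 7 ]; then
        echo "incomplete version: $branch" >&2
        return 1
    fi
    echo "11.$3 SRU $4.$6"
}

# The two versions from the "Version mismatch" message:
sru_of 'pkg://solaris/entire@0.5.11,5.11-0.175.3.6.0.5.0:20160309T220555Z'  # prints 11.3 SRU 6.5
sru_of 'pkg://solaris/entire@0.5.11,5.11-0.175.3.5.0.6.0:20160215T165127Z'  # prints 11.3 SRU 5.6
```

A shortened version such as pkg:/entire@0.5.11-0.175.3 does not carry enough fields to name an SRU, so the function reports it as incomplete instead of guessing.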


Repo AI UAR: Step 3: Install a System using the Automated Installer

Overview

In the previous section we installed a system with the default configuration, just making the necessary change to use a local Solaris repository. Now let's get a bit more specific about what we install.

Installing a Specific SRU

If you remember from the previous section, I modified the default manifest to set the Solaris 11 repository server. Nearby in the manifest is also the option to set the specific version of Solaris to install.

root@ai:/export/ai/manifests# cp myrepo.xml host1.xml 
root@ai:/export/ai/manifests# 
root@ai:/export/ai/manifests# vi host1.xml 
root@ai:/export/ai/manifests# 
root@ai:/export/ai/manifests# diff myrepo.xml host1.xml 
83c83
< <name>pkg:/entire@0.5.11-0.175.3</name>
---
> <name>pkg:/entire@0.5.11-0.175.3.4.0.5.0</name>
root@ai:/export/ai/manifests# 

Here I have replaced the generic Solaris 11.3 (shown as "175.3") with the more specific 4.0.5.0 SRU naming. This is SRU 4.5, even though there are four numbers instead of two. In some cases the trailing 0 has been incremented, because a quick update came out within the month, usually within a week of the release of the original SRU. For more details go to the Oracle Solaris 11.3 Support Repository Updates (SRU) Index (Doc ID 2045311.1). There is a separate MOS document for each "dot" release of Solaris 11.

Although I gave the complete version of Solaris to install, I only need to specify enough to be unique. I could use pkg:/entire@0.5.11-0.175.3.4, and since there is only one version of 4.x.x.x, the 4.0.5.0 version would be installed. You would know before 5.x comes out whether there is a 4.0.5.1. In this case I would be safe using a shorter value.

To make it easier for me to track this, I associate this new manifest with a specific criterion, using the -c option to installadm. There are a number of different criteria to select a manifest with, including hostname, IPv4 address, and MAC address. 
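As a rough sketch of a criterion other than the hostname, the snippet below prints (it does not run) a create-manifest command keyed on the MAC address shown in the OpenBoot banner. The pad_mac helper and the host1-mac manifest name are my own illustrations, not part of installadm, and zero-padding each octet is a cautious assumption on my part rather than a documented requirement.

```shell
#!/bin/sh
# Sketch: build an installadm command that selects a manifest by MAC address.
# pad_mac is a hypothetical helper; padding each octet to two hex digits is
# an assumption, made because OpenBoot prints the address as 0:14:4f:fa:ce:54.
pad_mac() {
    old_ifs=$IFS
    IFS=:
    set -- $1            # split the address into octets
    IFS=$old_ifs

    out=
    for octet in "$@"; do
        case $octet in
            ?) octet=0$octet ;;   # single hex digit: add a leading zero
        esac
        out=${out:+$out:}$octet   # join octets with ':'
    done
    echo "$out"
}

# Print the command for review instead of executing it.
# "host1-mac" is an illustrative manifest name.
echo installadm create-manifest -n default-sparc -m host1-mac \
    -f ./host1.xml -c "mac=$(pad_mac 0:14:4f:fa:ce:54)"
```

Printing the command first makes it easy to check the criterion before anything is changed on the AI server.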
For the full list, see the Criteria section of the installadm(1M) man page in the Solaris 11.3 documentation.

root@ai:/export/ai/manifests# installadm create-manifest -n default-sparc -m host1 \
-f ./host1.xml -c hostname=host1 
Created Manifest: 'host1'
root@ai:/export/ai/manifests# 
root@ai:/export/ai/manifests# installadm list -m 
Service Name          Manifest Name Type    Status   Criteria
------------          ------------- ----    ------   --------
default-sparc         host1         xml     active   hostname = host1
                      myrepo        xml     default  none
                      orig_default  derived inactive none
solaris11_3_6-sparc_1 orig_default  derived default  none
root@ai:/export/ai/manifests# 

You can see here a new manifest in the list and how it gets selected.

Off to see how it works.

root@host1:~# reboot -- net - install 
Apr 15 13:03:18 host1 reboot: initiated by root on /dev/console
Apr 15 13:03:24 host1 syslogd: going down on signal 15
syncing file systems... done
rebooting...
Resetting...
NOTICE: Entering OpenBoot.
NOTICE: Fetching Guest MD from HV.
NOTICE: Starting additional cpus.
NOTICE: Initializing LDC services.
NOTICE: Probing PCI devices.
NOTICE: Finished PCI probing.
SPARC T4-1, No Keyboard
Copyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.38.3, 4.0000 GB memory available, Serial #83545684.
Ethernet address 0:14:4f:fa:ce:54, Host ID: 84face54.
Boot device: /virtual-devices@100/channel-devices@200/network@1 File and args: - install
<time unavailable> wanboot info: WAN boot messages->console
<time unavailable> wanboot info: configuring /virtual-devices@100/channel-devices@200/network@1
<time unavailable> wanboot progress: wanbootfs: Read 368 of 368 kB (100%)
<time unavailable> wanboot info: wanbootfs: Download complete
Fri Apr 15 13:09:21 wanboot progress: miniroot: Read 276976 of 276976 kB (100%)
Fri Apr 15 13:09:21 wanboot info: miniroot: Download complete
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. 
All rights reserved.Remounting root read/writeProbing for device nodes ...Preparing network image for useDownloading solaris.zlib...Done mounting imageConfiguring devices.Hostname: host1Service discovery phase initiatedService name to look up: default-sparcService discovery over multicast DNS failedService default-sparc located at 192.168.1.102:5555 will be usedService discovery finished successfullyProcess of obtaining install manifest initiatedUsing the install manifest obtained via service discoveryhost1 console login: Automated Installation startedThe progress of the Automated Installation will be output to the consoleDetailed logging is in the logfile at /system/volatile/install_logPress RETURN to get a login prompt at any time.13:06:43 Install Log: /system/volatile/install_log13:06:43 Using XML Manifest: /system/volatile/ai.xml...13:07:04 Installing packages from:13:07:04 solaris13:07:04 origin: http://192.168.1.102/13:07:04 Startup: Refreshing catalog 'solaris' ... Done13:07:10 Planning: Solver setup ... 
Done...13:08:11 Package licenses may be viewed using the command:13:08:11 pkg info --license 13:08:13 Download: 0/85958 items 0.0/785.6MB 0% complete 13:08:18 Download: 926/85958 items 6.9/785.6MB 0% complete (1.4M/s)13:08:23 Download: 2285/85958 items 21.6/785.6MB 2% complete (2.2M/s)...13:13:49 Download: 85582/85958 items 784.3/785.6MB 99% complete (304k/s)13:13:50 Download: Completed 785.64 MB in 337.02 seconds (2.3M/s)13:14:12 Actions: 1/115694 actions (Installing new actions)13:14:17 Actions: 21460/115694 actions (Installing new actions)...13:18:41 Actions: 113786/115694 actions (Installing new actions)13:18:46 Actions: 114229/115694 actions (Installing new actions)13:18:48 Actions: Completed 115694 actions in 275.64 seconds.13:18:49 Done13:19:18 Version mismatch: 13:19:18 Installer build version: pkg://solaris/entire@0.5.11,5.11-0.175.3.6.0.5.0:20160309T220555Z13:19:18 Target build version: pkg://solaris/entire@0.5.11,5.11-0.175.3.4.0.5.0:20160114T213235Z13:19:18 38% generated-transfer-775-1 completed....13:19:51 100% create-snapshot completed.13:19:51 100% None13:19:51 Automated Installation succeeded.13:19:51 You may wish to reboot the system at this time.Automated Installation finished successfullyThe system can be rebooted nowPlease refer to the /system/volatile/install_log file for detailsAfter reboot it will be located at /var/log/install/install_loghost1 console login: host1 console login: root Password: Apr 15 13:21:32 host1 login: ROOT LOGIN /dev/consoleOracle Corporation SunOS 5.11 11.3 February 2016root@host1:~# root@host1:~# reboot Apr 15 13:21:35 host1 reboot: initiated by root on /dev/consolesyncing file systems... donerebooting...Resetting...NOTICE: Entering OpenBoot.NOTICE: Fetching Guest MD from HV.NOTICE: Starting additional cpus.NOTICE: Initializing LDC services.NOTICE: Probing PCI devices.NOTICE: Finished PCI probing.SPARC T4-1, No KeyboardCopyright (c) 1998, 2015, Oracle and/or its affiliates. 
All rights reserved.
OpenBoot 4.38.3, 4.0000 GB memory available, Serial #83545684.
Ethernet address 0:14:4f:fa:ce:54, Host ID: 84face54.
Boot device: /virtual-devices@100/channel-devices@200/disk@0:a File and args: 
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
Loading smf(5) service descriptions: 225/225
Configuring devices.
SC profile successfully generated as:
/etc/svc/profile/sysconfig/sysconfig-20160415-132338/sc_profile.xml
Exiting System Configuration Tool. Log is available at:
/system/volatile/sysconfig/sysconfig.log.308
Loading smf(5) service descriptions: 2/2
Hostname: host1
host1 console login: 
host1 console login: guest 
Password: 
Oracle Corporation SunOS 5.11 11.3 December 2015
guest@host1:~$ 
guest@host1:~$ pkg list entire 
NAME (PUBLISHER)     VERSION                    IFO
entire               0.5.11-0.175.3.4.0.5.0     i--
guest@host1:~$ 

During the boot process I had to enter system details into the System Configuration Tool. Since I'll be doing installs multiple times, I want to create a Profile for this system so that it is applied automatically.

Creating a Profile to Answer the System Configuration Tool Questions

I typically create the System Configuration file using the sysconfig(1M) tool for configuring my Solaris Zones. While I haven't used them with Zones, I find System Configuration Profile Templates super easy to use. With template variables, one Profile can work for many systems. The full list of variables is at http://docs.oracle.com/cd/E53394_01/html/E54756/glhwo.html#scrolltoc.

In the file below I show using Template Variables for the hostname and the three main parts of the network configuration: IP address, netmask, and default router or gateway. 
I am not configuring a name service, and I do set my timezone and my default user and root passwords.root@ai:~# cat /export/ai/profiles/myshared_profile.xml <?xml version='1.0' encoding='US-ASCII'?><!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1"><!-- Auto-generated by sysconfig --><!-- Modified 20160415 by Steffen --><service_bundle type="profile" name="sysconfig"> <service version="1" type="service" name="system/identity"> <instance enabled="true" name="node"> <property_group type="application" name="config"> <propval type="astring" name="nodename" value="{{AI_HOSTNAME }}"/> </property_group> </instance> </service> <service version="1" type="service" name="network/install"> <instance enabled="true" name="default"> <property_group type="application" name="install_ipv6_interface"> <propval type="astring" name="stateful" value="yes"/> <propval type="astring" name="address_type" value="addrconf"/> <propval type="astring" name="name" value="net0/v6"/> <propval type="astring" name="stateless" value="yes"/> </property_group> <property_group type="application" name="install_ipv4_interface"> <propval type="net_address_v4" name="static_address" value="{{AI_IPV4}}/{{AI_IPV4_PREFIXLEN}}"/> <propval type="astring" name="name" value="net0/v4"/> <propval type="astring" name="address_type" value="static"/> <propval type="net_address_v4" name="default_route" value="{{AI_ROUTER}}"/> </property_group> </instance> </service> <service version="1" type="service" name="network/physical"> <instance enabled="true" name="default"> <property_group type="application" name="netcfg"> <propval type="astring" name="active_ncp" value="DefaultFixed"/> </property_group> </instance> </service> <service version="1" type="service" name="system/name-service/switch"> <property_group type="application" name="config"> <propval type="astring" name="default" value="files"/> </property_group> <instance enabled="true" name="default"/> </service> <service version="1" 
type="service" name="system/name-service/cache"> <instance enabled="true" name="default"/> </service> <service version="1" type="service" name="system/keymap"> <instance enabled="true" name="default"> <property_group type="application" name="keymap"> <propval type="astring" name="layout" value="US-English"/> </property_group> </instance> </service> <service version="1" type="service" name="system/timezone"> <instance enabled="true" name="default"> <property_group type="application" name="timezone"> <propval type="astring" name="localtime" value="US/Eastern"/> </property_group> </instance> </service> <service version="1" type="service" name="system/environment"> <instance enabled="true" name="init"> <property_group type="application" name="environment"> <propval type="astring" name="LANG" value="en_US.UTF-8"/> </property_group> </instance> </service> <service version="1" type="service" name="system/config-user"> <instance enabled="true" name="default"> <property_group type="application" name="root_account"> <propval type="astring" name="type" value="role"/> <propval type="astring" name="login" value="root"/> <propval type="astring" name="password" value="$5$VeOASCUS$dqjdK9F23sDXDyRUlD6Rg79qidQvN1Ga32klr3lMOp0"/> </property_group> <property_group type="application" name="user_account"> <propval type="astring" name="roles" value="root"/> <propval type="astring" name="shell" value="/usr/bin/bash"/> <propval type="astring" name="login" value="guest"/> <propval type="astring" name="password" value="$5$bzSydQfv$BKdFfMEBgrdcEleQFT.9BEj9NXJ0n004e0ltJCp3m16"/> <propval type="astring" name="type" value="normal"/> <propval type="astring" name="sudoers" value="ALL=(ALL) ALL"/> <propval type="count" name="gid" value="10"/> <propval type="astring" name="description" value="guest"/> <propval type="astring" name="profiles" value="System Administrator"/> </property_group> </instance> </service> <service version="1" type="service" name="system/ocm"> <instance enabled="true" 
name="default"> <property_group type="application" name="reg"> <propval type="astring" name="user" value="anonymous@oracle.com"/> </property_group> </instance> </service> <service version="1" type="service" name="system/fm/asr-notify"> <instance enabled="true" name="default"> <property_group type="application" name="autoreg"> <propval type="astring" name="user" value="anonymous@oracle.com"/> </property_group> </instance> </service></service_bundle>root@ai:~# I will verify that I have no existing Profiles. Then I will add this new one.root@ai:~# installadm list -p There are no profiles configured for local services.root@ai:~# root@ai:~# installadm create-profile -n default-sparc -p myshared \-f /export/ai/profiles/myshared_profile.xml Created Profile: 'myshared'root@ai:~# root@ai:~# installadm list -p Service Name Profile Name Environment Criteria------------ ------------ ----------- --------default-sparc myshared system none root@ai:~# The best, and maybe only way to test is to run a new installation.root@unknown:~# reboot -- net - install Apr 15 16:39:52 svc.startd[13]: svc.configd exited with exit status 0. Reason: unknown error.svc.configd exited with exit status 0. Reason: unknown error.updating /platform/sun4v/boot_archivesyncing file systems... donerebooting...Resetting...NOTICE: Entering OpenBoot.NOTICE: Fetching Guest MD from HV.NOTICE: Starting additional cpus.NOTICE: Initializing LDC services.NOTICE: Probing PCI devices.NOTICE: Finished PCI probing.SPARC T4-1, No KeyboardCopyright (c) 1998, 2015, Oracle and/or its affiliates. 
All rights reserved.OpenBoot 4.38.3, 4.0000 GB memory available, Serial #83545684.Ethernet address 0:14:4f:fa:ce:54, Host ID: 84face54.Boot device: /virtual-devices@100/channel-devices@200/network@1 File and args: - install<time unavailable> wanboot info: WAN boot messages->console<time unavailable> wanboot info: configuring /virtual-devices@100/channel-devices@200/network@1<time unavailable> wanboot progress: wanbootfs: Read 368 of 368 kB (100%)<time unavailable> wanboot info: wanbootfs: Download completeFri Apr 15 16:46:26 wanboot progress: miniroot: Read 276976 of 276976 kB (100%)Fri Apr 15 16:46:26 wanboot info: miniroot: Download completeSunOS Release 5.11 Version 11.3 64-bitCopyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.Remounting root read/writeProbing for device nodes ...Preparing network image for useDownloading solaris.zlib...Done mounting imageConfiguring devices.Hostname: host1Service discovery phase initiatedService name to look up: default-sparcService discovery over multicast DNS failedService default-sparc located at 192.168.1.102:5555 will be usedService discovery finished successfullyProcess of obtaining install manifest initiatedUsing the install manifest obtained via service discoveryAutomated Installation startedThe progress of the Automated Installation will be output to the consoleDetailed logging is in the logfile at /system/volatile/install_logPress RETURN to get a login prompt at any time.host1 console login: 16:43:28 Install Log: /system/volatile/install_log16:43:28 Using XML Manifest: /system/volatile/ai.xml16:43:28 Using profile specification: /system/volatile/profile...16:52:24 Actions: 114257/116156 actions (Installing new actions)16:52:29 Actions: 114756/116156 actions (Installing new actions)16:52:31 Actions: Completed 116156 actions in 286.15 seconds.16:52:32 Done16:53:01 Version mismatch: 16:53:01 Installer build version: pkg://solaris/entire@0.5.11,5.11-0.175.3.6.0.5.0:20160309T220555Z16:53:01 Target 
build version: pkg://solaris/entire@0.5.11,5.11-0.175.3.5.0.6.0:20160215T165127Z16:53:01 38% generated-transfer-784-1 completed.16:53:01 40% initialize-smf completed....16:53:32 100% create-snapshot completed.16:53:32 100% None16:53:32 Automated Installation succeeded.16:53:32 You may wish to reboot the system at this time.Automated Installation finished successfullyThe system can be rebooted nowPlease refer to the /system/volatile/install_log file for detailsAfter reboot it will be located at /var/log/install/install_loghost1 console login: host1 console login: root Password: Apr 15 17:01:47 host1 login: ROOT LOGIN /dev/consoleOracle Corporation SunOS 5.11 11.3 February 2016root@host1:~# Note that the profile has not yet been added. If I were to log in, I would still see the values derived from the OBP network-boot-arguments settings.root@host1:~# reboot Apr 15 17:02:30 host1 reboot: initiated by root on /dev/consolesyncing file systems... donerebooting...Resetting...NOTICE: Entering OpenBoot.NOTICE: Fetching Guest MD from HV.NOTICE: Starting additional cpus.NOTICE: Initializing LDC services.NOTICE: Probing PCI devices.NOTICE: Finished PCI probing.SPARC T4-1, No KeyboardCopyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.OpenBoot 4.38.3, 4.0000 GB memory available, Serial #83545684.Ethernet address 0:14:4f:fa:ce:54, Host ID: 84face54.Boot device: /virtual-devices@100/channel-devices@200/disk@0:a File and args: SunOS Release 5.11 Version 11.3 64-bitCopyright (c) 1983, 2015, Oracle and/or its affiliates. 
All rights reserved.
Loading smf(5) service descriptions: 225/225
Configuring devices.
Loading smf(5) service descriptions: 2/2
Hostname: host1
Apr 15 13:05:32 host1 sendmail[1079]: My unqualified host name (host1) unknown; sleeping for retry
Apr 15 13:05:33 host1 sendmail[1098]: My unqualified host name (host1) unknown; sleeping for retry
host1 console login: guest
Password:
Oracle Corporation SunOS 5.11 11.3 February 2016
guest@host1:~$

With this I have shown how to set up an installation and the basic configuration options for the System Configuration Profile tool. This will allow you to do a fully automated installation.

Some Examples of Additional AI Customization

Once we deviate from a default installation the number of combinations gets large, and I don't plan on covering many of those options. I do want to show a few. The one that comes to mind first is a smaller installation. That is often desirable for security and operational efficiencies. The default installation Group is solaris-large-server. Two smaller Groups are solaris-small-server and solaris-minimal-server. Solaris Zones typically are solaris-small-server. I will create a manifest to install the minimal server, and show how to add one package. I am picking minimal to make installation faster and because the number of packages is so much smaller than in the large server. For a description of all the Groups and the packages each includes, go to Oracle Solaris 11.3 Package Group Lists at http://docs.oracle.com/cd/E53394_01/html/E54814/index.html.

I will use the host1 manifest as my starting point. The package I am adding is listed as install/installadm, but the manifest needs the package's FMRI. That turns out to be pkg://solaris/install/installadm. 
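The unversioned FMRI used in the manifest is simply the full FMRI reported by pkg info with everything from the "@" onward removed. A minimal sketch of that trimming in plain shell follows; the function name fmri_stem is hypothetical and not part of pkg(1), and the sample FMRI is the one this package reports on my AI server.

```shell
#!/bin/sh
# fmri_stem FMRI
# Hypothetical helper: print a package FMRI without its version suffix,
# that is, drop everything from the first '@' to the end of the string.
fmri_stem() {
    printf '%s\n' "${1%%@*}"
}

# Example: the full FMRI as reported by "pkg info install/installadm".
fmri_stem 'pkg://solaris/install/installadm@0.5.11,5.11-0.175.3.5.0.2.0:20160119T185803Z'
```

Leaving the version off lets the installer pick whichever version of the package fits the rest of the image being installed.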
I must add that FMRI to the manifest.
root@ai2:/export/ai/manifests# pkg info install/installadm
          Name: install/installadm
       Summary: installadm utility
   Description: Automatic Installation Server Setup Tools
      Category: System/Administration and Configuration
         State: Installed
     Publisher: solaris
       Version: 0.5.11
 Build Release: 5.11
        Branch: 0.175.3.5.0.2.0
Packaging Date: January 19, 2016 06:58:03 PM
          Size: 1.85 MB
          FMRI: pkg://solaris/install/installadm@0.5.11,5.11-0.175.3.5.0.2.0:20160119T185803Z
root@ai:/export/ai/manifests#
root@ai:/export/ai/manifests# vi host1.xml
root@ai:/export/ai/manifests#
root@ai:/export/ai/manifests# diff host1.xml Host1.xml.large
84c84
< <name>pkg:/group/system/solaris-minimal-server</name>
---
> <name>pkg:/group/system/solaris-large-server</name>
86,88d85
< <software_data>
< <name>pkg://solaris/install/installadm</name>
< </software_data>
root@ai:/export/ai/manifests#

You can see that I changed from the large to the minimal server, and I am adding a single package as an example. Since the Image Packaging System handles bringing in dependencies, I don't have to think about what other packages install/installadm might require. That will be taken care of automatically.

The other change I would like to make now is to have the system automatically reboot. The total differences are small.
root@ai:/export/ai/manifests# diff host1.xml Host1.xml.large
3c3
< <ai_instance name="host1" auto_reboot="true" >
---
> <ai_instance name="default">
84c84
< <name>pkg:/group/system/solaris-minimal-server</name>
---
> <name>pkg:/group/system/solaris-large-server</name>
86,88d85
< <software_data>
< <name>pkg://solaris/install/installadm</name>
< </software_data>

The changes are:
- Minimal install instead of large server
- Automatically reboot after the installation is successful

I will highlight a few things to compare with previous installations.
root@host1:~# reboot -- net - install
Apr 28 04:25:02 host1 reboot: initiated by root on /dev/console
Apr 28 04:25:09 host1 syslogd: going down on signal 15
syncing file systems... 
donerebooting...Resetting...NOTICE: Entering OpenBoot.NOTICE: Fetching Guest MD from HV.NOTICE: Starting additional cpus.NOTICE: Initializing LDC services.NOTICE: Probing PCI devices.NOTICE: Finished PCI probing.SPARC T4-1, No KeyboardCopyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.OpenBoot 4.38.3, 4.0000 GB memory available, Serial #83545684.Ethernet address 0:14:4f:fa:ce:54, Host ID: 84face54.Boot device: /virtual-devices@100/channel-devices@200/network@0 File and args: - install<time unavailable> wanboot info: WAN boot messages->console<time unavailable> wanboot info: configuring /virtual-devices@100/channel-devices@200/network@0<time unavailable> wanboot progress: wanbootfs: Read 368 of 368 kB (100%)<time unavailable> wanboot info: wanbootfs: Download completeThu Apr 28 04:31:30 wanboot progress: miniroot: Read 276976 of 276976 kB (100%)Thu Apr 28 04:31:30 wanboot info: miniroot: Download completeSunOS Release 5.11 Version 11.3 64-bitCopyright (c) 1983, 2015, Oracle and/or its affiliates. 
All rights reserved.Remounting root read/writeProbing for device nodes ...Preparing network image for useDownloading solaris.zlib...Done mounting imageConfiguring devices.Hostname: host1Service discovery phase initiatedService name to look up: default-sparcService discovery over multicast DNS failedService default-sparc located at 192.168.1.102:5555 will be usedService discovery finished successfullyProcess of obtaining install manifest initiatedUsing the install manifest obtained via service discoveryAutomated Installation startedThe progress of the Automated Installation will be output to the consoleDetailed logging is in the logfile at /system/volatile/install_logPress RETURN to get a login prompt at any time.host1 console login: 04:28:32 Install Log: /system/volatile/install_log04:28:32 Using XML Manifest: /system/volatile/ai.xml04:28:32 Using profile specification: /system/volatile/profile04:28:32 Using service list file: /var/run/service_list04:28:32 Starting installation.04:28:32 0% Preparing for Installation...04:28:44 36% Beginning IPS transfer04:28:44 Creating IPS image04:28:50 Startup: Retrieving catalog 'solaris' ... Done04:28:53 Startup: Caching catalogs ... Done04:28:53 Startup: Refreshing catalog 'solaris' ... Done04:28:53 Installing packages from:04:28:53 solaris04:28:53 origin: http://192.168.1.102/04:28:53 Startup: Refreshing catalog 'solaris' ... Done04:28:58 Planning: Solver setup ... Done04:28:58 Planning: Running solver ... Done04:28:58 Planning: Finding local manifests ... Done04:28:59 Planning: Fetching manifests: 0/324 0% complete04:29:15 Planning: Fetching manifests: 324/324 100% complete04:29:23 Planning: Package planning ... Done04:29:24 Planning: Merging actions ... Done04:29:26 Planning: Checking for conflicting actions ... Done04:29:28 Planning: Consolidating action changes ... Done04:29:29 Planning: Evaluating mediators ... 
Done04:29:31 Planning: Planning completed in 37.26 seconds04:29:31 The following licenses have been accepted and not displayed.04:29:31 Please review the licenses for the following packages post-install:04:29:31 consolidation/osnet/osnet-incorporation 04:29:31 Package licenses may be viewed using the command:04:29:31 pkg info --license 04:29:33 Download: 0/43188 items 0.0/328.9MB 0% complete 04:29:38 Download: 2535/43188 items 21.3/328.9MB 6% complete (4.3M/s)04:29:43 Download: 5667/43188 items 51.1/328.9MB 15% complete (5.1M/s)...04:30:33 Download: 39576/43188 items 292.8/328.9MB 89% complete (5.7M/s)04:30:38 Download: 42489/43188 items 324.3/328.9MB 98% complete (5.9M/s)04:30:39 Download: Completed 328.90 MB in 66.65 seconds (4.9M/s)04:30:51 Actions: 1/61111 actions (Installing new actions)04:30:56 Actions: 16029/61111 actions (Installing new actions)...04:31:46 Actions: 59856/61111 actions (Installing new actions)04:31:51 Actions: 60028/61111 actions (Installing new actions)04:31:55 Actions: Completed 61111 actions in 64.68 seconds.04:31:56 Done04:32:11 Installing packages from:04:32:11 solaris04:32:11 origin: http://10.143.156.62/04:32:12 Startup: Refreshing catalog 'solaris' ... Done04:32:16 Planning: Solver setup ... Done...04:32:20 Planning: Evaluating mediators ... 
Done04:32:20 Planning: Planning completed in 7.71 seconds04:32:20 Download: 0/185 items 0.0/3.0MB 0% complete 04:32:21 Download: Completed 3.02 MB in 0.49 seconds (6.1M/s)04:32:22 Actions: 1/317 actions (Installing new actions)04:32:22 Actions: Completed 317 actions in 0.36 seconds.04:32:22 Done04:32:34 Version mismatch: 04:32:34 Installer build version: pkg://solaris/entire@0.5.11,5.11-0.175.3.6.0.5.0:20160309T220555Z04:32:34 Target build version: pkg://solaris/entire@0.5.11,5.11-0.175.3.4.0.5.0:20160114T213235Z04:32:34 38% generated-transfer-773-1 completed....04:32:51 100% create-snapshot completed.04:32:51 100% None04:32:51 Automated Installation succeeded.04:32:51 System will be rebooted nowAutomated Installation finished successfullyAuto reboot enabled. The system will be rebooted now Log files will be available in /var/log/install/ after rebootApr 28 04:32:54 host1 reboot: initiated by root syncing file systems... donerebooting...Resetting...NOTICE: Entering OpenBoot.NOTICE: Fetching Guest MD from HV.NOTICE: Starting additional cpus.NOTICE: Initializing LDC services.NOTICE: Probing PCI devices.NOTICE: Finished PCI probing.SPARC T4-1, No KeyboardCopyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.OpenBoot 4.38.3, 4.0000 GB memory available, Serial #83545684.Ethernet address 0:14:4f:fa:ce:54, Host ID: 84face54.Boot device: /virtual-devices@100/channel-devices@200/disk@0:a File and args: -Z rpool/ROOT/solarisSunOS Release 5.11 Version 11.3 64-bitCopyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.Loading smf(5) service descriptions: 142/142Configuring devices.Hostname: host1Unable to set localehost1 console login: guest Password: Oracle Corporation SunOS 5.11 11.3 December 2015guest@host1:~$ guest@host1:~$ pkg list install/installadm pkg: Unable to set locale 'en_US.UTF-8'; locale package may be broken ornot installed. 
Reverting to C locale.
NAME (PUBLISHER) VERSION IFO
install/installadm 0.5.11-0.175.3.0.0.30.0 i--
guest@host1:~$
guest@host1:~$ pkg list|wc -l
pkg: Unable to set locale 'en_US.UTF-8'; locale package may be broken or
not installed. Reverting to C locale.
 328
guest@host1:~$

I would like to highlight some items now.

You will see this is a much smaller and shorter installation. Only 43,188 items totaling 329 MB are downloaded, in approximately 66 seconds. If you look back at a previous installation, 85,958 items took 337 seconds. With that, the number of actions performed drops from 116,156 in the full install earlier in this post to 61,111, and the number of SMF service descriptions that get processed on first boot drops from 225 to 142. All signs of the smaller installation.

This is also visible in the number of packages listed in the output of pkg list. It is only 328, which is down from 585 in STEP 2 (you may have wondered why I showed that output at the time).

One downside of a reduced install is that some typical packages may not be available, such as the one providing my default locale. I am curious if I can quickly address that. 
I will update my profile and re-install.
root@ai:/export/ai/profiles# diff myshared_profile.xml C_myshared_profile.xml
62c62
< <propval type="astring" name="LANG" value="en_US.UTF-8"/>
---
> <propval type="astring" name="LANG" value="C"/>
root@ai:/export/ai/profiles#
root@ai:/export/ai/profiles# installadm list -p
Service Name  Profile Name Environment Criteria
------------  ------------ ----------- --------
default-sparc myshared     system      none
root@ai:/export/ai/profiles#
root@ai:/export/ai/profiles# installadm update-profile -n default-sparc -p myshared -f ./C_myshared_profile.xml
Changed Profile: 'myshared'
root@ai:/export/ai/profiles#

I reinstall my system. For simplicity, I am showing only limited output, since it is almost the same as all the above.
root@host1:~# reboot -- net - install
Apr 29 09:46:22 host1 reboot: initiated by guest on /dev/console
host1 console login: updating /platform/sun4v/boot_archive
Unable to set locale
Apr 29 09:46:37 host1 nwamd[1610]: 8: configure_fixed_thread_0: could not enable interface net0: Operation failed
syncing file systems... done
rebooting...
...
...
Boot device: /virtual-devices@100/channel-devices@200/disk@0:a File and args: -Z rpool/ROOT/solaris
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
Loading smf(5) service descriptions: 142/142
Configuring devices.
Hostname: host1
host1 console login: guest
Password:
Oracle Corporation SunOS 5.11 11.3 December 2015
guest@host1:~$
guest@host1:~$
guest@host1:~$ pkg list install/installadm
NAME (PUBLISHER) VERSION IFO
install/installadm 0.5.11-0.175.3.0.0.30.0 i--
guest@host1:~$

What is different? No "Unable to set locale" error messages after the new installation.

With this I have shown some simple forms of a custom AI installation. Other possible customizations include:
- Selecting which disk to install on
- Creating a mirrored ZFS root pool
- Creating a second ZFS pool
- Deploying a Zone

I might come back to some of those later. 
At this time, I really want to get to the next step of creating and deploying a Unified Archive. See you there.

Steffen

Revision History
(Other than minor typographical changes)
2016.10.16: Posted
2016.04.29: Created

Overview In the previous section we installed a system with the default configuration, just making the necessary change to use a local Solaris repository. Now let's get a bit more specific in what...

Solaris

Repo AI UAR: Step 4: Create and Deploy a Clone Archive

Overview

Now that we can deploy default or customized Solaris 11 configurations that are built during the installation, let's move to the next step: creating a custom installation that can be re-deployed repeatedly with little effort. Solaris 10 and earlier releases have Flash Archives for that. Solaris 11.2 provides Unified Archives, which offer a lot of features that Flash Archives do not. I will go into some of those later.

Creating a Place to Put the Unified Archives

I am going to host the Unified Archives (UARs) on my AI server. To make creating the UARs simpler, I will create an NFS share where the system can write its UARs. Under different circumstances, I might write them to local storage, and then copy them to the location that will host them for deployment access.

During the process of creating the Automated Install server, a web server configuration is created. I will use that configuration to host the archives. By looking at ps(1) output and the httpd processes' arguments, and then at "/var/ai/ai-webserver/ai-httpd.conf", I see that the document root is "/var/ai/image-server/images".

I create a subdirectory in the DocumentRoot and share that for NFS client access. I prefer that, as then I don't have to think about having enough space on the system where I am creating the archive.
root@ai:~# grep DocumentRoot /var/ai/ai-webserver/ai-httpd.conf
# DocumentRoot: The directory out of which you will serve your
DocumentRoot "/var/ai/image-server/images"
# This should be changed to whatever you set DocumentRoot to.
# access content that does not live under the DocumentRoot.
root@ai:~#
root@ai:~# mkdir /var/ai/image-server/images/uar.d
root@ai:~# mkdir /var/ai/image-server/images/incoming.d
root@ai:~# chmod 777 /var/ai/image-server/images/incoming.d
root@ai:~# share /var/ai/image-server/images/incoming.d
root@ai:~#
root@ai:~# share
var_ai_image-server_images_incoming.d /var/ai/image-server/images/incoming.d nfs sec=sys,rw
root@ai:~#

The intention is to write archives to "incoming.d" and then move them to "uar.d" when I am ready to make them available to deploy. This allows me to control the process.

Creating a Clone Unified Archive

There are two types of archives, clone and recovery. For the purpose of quickly deploying new systems, the default clone archive is what I want to use.
root@host1:~# archiveadm create /net/ai/var/ai/image-server/images/incoming.d/s11.3sru4.5.uar -s
Initializing Unified Archive creation resources...
Unified Archive initialized: /net/ai/var/ai/image-server/images/incoming.d/s11.3sru4.5.uar
Logging to: /system/volatile/archive_log.1318
Executing dataset discovery...
Dataset discovery complete
Creating install media for zone(s)...
Media creation complete
Preparing archive system image...
Beginning archive stream creation...
Archive stream creation complete
Beginning final archive assembly...
Archive creation complete
root@host1:~#
root@host1:~# archiveadm info /net/ai/var/ai/image-server/images/incoming.d/s11.3sru4.5.uar
Archive Information
 Creation Time: 2016-04-30T00:00:17Z
 Source Host: host1
 Architecture: sparc
 Operating System: Oracle Solaris 11.3 SPARC
 Deployable Systems: global
root@host1:~#

Pretty simple, don't you think?! 
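The hand-off from "incoming.d" to "uar.d" described above could be wrapped in a small helper so that a published archive is never overwritten by accident. This is only a sketch under my own assumptions: the function name publish_uar and its return codes are hypothetical, it uses nothing beyond standard mv(1) and chmod(1), and making the published file read-only is my choice rather than anything AI requires.

```shell
#!/bin/sh
# publish_uar INCOMING_DIR PUBLISH_DIR ARCHIVE_NAME
# Hypothetical helper: move a finished archive from the writable NFS drop
# directory into the directory served by the web server.
# Returns 1 if the archive is missing, 2 if the target already exists.
publish_uar() {
    incoming=$1
    publish=$2
    name=$3
    src="$incoming/$name"
    dst="$publish/$name"

    if [ ! -f "$src" ]; then
        echo "publish_uar: $src not found" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "publish_uar: $dst already exists, not overwriting" >&2
        return 2
    fi

    # Move the archive, then make the published copy read-only.
    mv "$src" "$dst" && chmod 444 "$dst"
}
```

On the AI server this would be called as: publish_uar /var/ai/image-server/images/incoming.d /var/ai/image-server/images/uar.d s11.3sru4.5.uar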
Now to the AI server to configure a manifest to install this UAR.

Deploying a Clone Unified Archive

On the AI server I need a different manifest, one that directs the installer to use an archive instead of installing packages.
root@ai:/export/ai/manifests# cat s11.3.5uar.xml
<!DOCTYPE auto_install SYSTEM "file:///usr/share/install/ai.dtd.1">
<auto_install>
  <ai_instance name="s11.3.5uar" auto_reboot="true">
    <target name="desired">
      <logical>
        <zpool name="rpool" is_root="true">
        </zpool>
      </logical>
    </target>
    <software type="ARCHIVE">
      <source>
        <file uri="http://XX.XX.XX.XX:5555/uar.d/s11.3sru4.5.uar">
        </file>
      </source>
      <software_data action="install">
        <name>global</name>
      </software_data>
    </software>
  </ai_instance>
</auto_install>
root@ai:/export/ai/manifests#

If there is no existing "host1" manifest, create one.
root@ai:/export/ai/manifests# installadm create-manifest -n default-sparc -m host1 \
-f ./s11.3.4uar.xml -c hostname=host1
Created Manifest: 'host1'
root@ai:~#

If there is one, update it.
root@ai:/export/ai/manifests# installadm list -m
Service Name        Manifest Name Type    Status   Criteria
------------        ------------- ----    ------   --------
default-sparc       host1         xml     active   hostname = host1
                    myrepo        xml     default  none
                    orig_default  derived inactive none
solaris11_3_6-sparc orig_default  derived default  none
root@ai:/export/ai/manifests#
root@ai:/export/ai/manifests# installadm update-manifest -n default-sparc -m host1 \
-f ./s11.3.4uar.xml
Changed Manifest: 'host1'
root@ai:/export/ai/manifests#

I must remember to move the archive from my incoming directory to the UAR directory. A bit cumbersome, so I might rethink this.
root@ai:/export/ai/manifests# mv /var/ai/image-server/images/incoming.d/s11.3sru4.5.uar \
/var/ai/image-server/images/uar.d/
root@ai:/export/ai/manifests#

root@host1:~# reboot -- net - install
Apr 30 00:48:53 host1 reboot: initiated by root on /dev/console
syncing file systems... 
donerebooting......Done mounting imageConfiguring devices.Hostname: host1Service discovery phase initiatedService name to look up: default-sparcService discovery over multicast DNS failedService default-sparc located at 192.168.1.102:5555 will be usedService discovery finished successfullyProcess of obtaining install manifest initiatedUsing the install manifest obtained via service discoveryhost1 console login: Automated Installation startedThe progress of the Automated Installation will be output to the consoleDetailed logging is in the logfile at /system/volatile/install_logPress RETURN to get a login prompt at any time.00:52:16 Install Log: /system/volatile/install_log00:52:16 Using XML Manifest: /system/volatile/ai.xml...00:52:21 10% var-share-dataset completed.00:52:30 10% target-instantiation completed.00:52:30 10% Beginning archive transfer00:52:30 Commencing transfer of stream: 552a68e9-d29d-4e4c-b80c-8fe1f453e4d6-0.zfs to rpool 00:52:40 11% Transferring contents00:52:42 15% Transferring contents...00:54:18 87% Transferring contents00:54:20 89% Transferring contents00:54:21 Completed transfer of stream: '552a68e9-d29d-4e4c-b80c-8fe1f453e4d6-0.zfs' from http://10.143.156.29:5555/uar.d/s11.3sru4.5.uar00:54:24 Archive transfer completed00:54:26 89% generated-transfer-773-1 completed.00:54:26 90% apply-pkg-variant completed....00:55:07 98% cleanup-archive-install completed.00:55:08 100% create-snapshot completed.00:55:08 100% None00:55:08 Automated Installation succeeded.00:55:08 System will be rebooted nowAutomated Installation finished successfullyAuto reboot enabled. The system will be rebooted nowLog files will be available in /var/log/install/ after rebootApr 30 00:55:12 host1 reboot: initiated by rootsyncing file systems... 
done
rebooting...
Resetting...
NOTICE: Entering OpenBoot.
NOTICE: Fetching Guest MD from HV.
NOTICE: Starting additional cpus.
NOTICE: Initializing LDC services.
NOTICE: Probing PCI devices.
NOTICE: Finished PCI probing.
SPARC T4-1, No Keyboard
Copyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.38.3, 4.0000 GB memory available, Serial #83545684.
Ethernet address 0:14:4f:fa:ce:54, Host ID: 84face54.
Boot device: /virtual-devices@100/channel-devices@200/disk@0:a File and args: -Z rpool/ROOT/solaris
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
Loading smf(5) service descriptions: 17/17
Configuring devices.
Hostname: host1
host1 console login:

And there it is. The system is installed, and it looks like it took almost 3 minutes less. And this is for a solaris-small-server sized configuration. Also, on first boot, only 17 SMF service descriptions were loaded, compared to approximately 180 on an install. So much of the state of the system is in the archive.

Behind the scenes I switched back from the minimal configuration to small to get a more functional system. Trying to access a "/net/<hostname>/" path does not work with the minimal installation. I did not take the time to figure out which package or service is missing there.

What I find nice now is the ability to quickly re-install an image of any of the three Package Groups. I just have to update the manifest with the archive file for that Package Group.

Customizing the Unified Archive

How do I demonstrate that I can deploy customizations? 
I don't have any non-Solaris packages or applications handy, so the best I can do is add and modify some files, and change the state of a service.
root@host1:~# diff /etc/ssh/sshd_config /etc/ssh/sshd_config.orig
102,104d101
<
< # Force KeepAlive messages
< ClientAliveInterval 15
root@host1:~#
root@host1:~# cat /etc/hosts2add
192.168.1.1 myrouter
192.168.1.6 host1
192.168.1.101 repo
192.168.1.102 ai
root@host1:~#
root@host1:~# svcs sendmail
STATE STIME FMRI
online 20:56:42 svc:/network/smtp:sendmail
root@host1:~#
root@host1:~# svcadm disable sendmail
root@host1:~# svcs sendmail
STATE STIME FMRI
disabled 21:33:08 svc:/network/smtp:sendmail
root@host1:~#
root@host1:~# archiveadm create /net/ai/var/ai/image-server/images/incoming.d/s11.3sru4.5custom.uar -s
Initializing Unified Archive creation resources...
Unified Archive initialized: /net/ai/var/ai/image-server/images/incoming.d/s11.3sru4.5custom.uar
Logging to: /system/volatile/archive_log.3464
Executing dataset discovery...
Dataset discovery complete
Creating install media for zone(s)...
Media creation complete
Preparing archive system image...
Beginning archive stream creation...
Archive stream creation complete
Beginning final archive assembly...
Archive creation complete
root@host1:~#
root@host1:~# archiveadm info /net/ai/var/ai/image-server/images/incoming.d/s11.3sru4.5custom.uar
Archive Information
 Creation Time: 2016-04-30T02:18:26Z
 Source Host: host1
 Architecture: sparc
 Operating System: Oracle Solaris 11.3 SPARC
 Deployable Systems: global,myzone
root@host1:~#

Instead of modifying the existing manifest file, I am creating a new one.
root@ai:/export/ai/manifests# diff s11.3.4uar.xml s11.3.4custom-uar.xml
12c12
< <file uri="http://192.168.1.102:5555/uar.d/s11.3sru4.5.uar">
---
> <file uri="http://192.168.1.102:5555/uar.d/s11.3sru4.5custom.uar">
root@ai:/export/ai/manifests#
root@ai:/export/ai/manifests# installadm update-manifest -n default-sparc -m host1 \
-f ./s11.3.4custom-uar.xml
Changed Manifest: 'host1'
root@ai:/export/ai/manifests#
root@ai:/export/ai/manifests# mv 
/var/ai/image-server/images/incoming.d/s11.3sru4.5custom.uar \
/var/ai/image-server/images/uar.d/
root@ai:/export/ai/manifests#

I reinstall the host, and take a look.
guest@host1:~$ diff /etc/ssh/sshd_config /etc/ssh/sshd_config.orig
102,104d101
<
< # Force KeepAlive messages
< ClientAliveInterval 15
guest@host1:~$
guest@host1:~$ cat /etc/hosts2add
192.168.1.1 myrouter
192.168.1.6 host1
192.168.1.101 repo
192.168.1.102 ai
guest@host1:~$ svcs sendmail
STATE STIME FMRI
disabled 21:56:01 svc:/network/smtp:sendmail
guest@host1:~$

As you can see, the modification of a file included in Solaris is maintained, my own file in /etc is preserved, and even an SMF service state is kept as I had set it.

This completes the process of creating and deploying a clone archive, including some basic customizations. You can use a similar procedure to also deploy your own or third-party applications or packages. I have done that for a customer.

Steffen

Revision History
(Other than minor typographical changes)
2016.10.26: Posted
2016.04.29: Created


Solaris

Increasing Number of Open Files Without a Reboot

Background

Last week a customer told me that one of the applications they are moving to a new SuperCluster M7 requires 20,000 file descriptors per process! At first I thought they were kidding. However, after looking at the application's documentation, I saw that it does indeed expect to have a large number of files open at once.

After thinking about it a bit, I imagined there might be three ways of doing this.

1. Use "/etc/system" for the whole Domain
2. Use projects to increase the limit for the Solaris Zone(s) where this application runs
3. Use projects within the Zone(s) to increase the limit for the user the application is running as

Using the old /etc/system method was not something I wanted to suggest. First, with a request for 20,000, setting rlim_fd_cur could allow any user to set the limit that high. Moreover, it requires a reboot of the Global Zone, something I like to avoid. In this case that would require each DB Zone in this Domain to be rebooted. Nope, let's move on right now.

Looking at the zonecfg(1M) man page, I did not see how I could set the maximum number of open files (the process.max-file-descriptor resource). And if I could, it would still apply to all processes and users in the Zone, which I did not like. (It turns out that the rctl setting only works on Zone related controls, those starting with "zone." in resource-controls(5).)

So on to the third option, using the Projects facility to apply this high limit to the specific user or process. I read pages such as projects(1), project(4), and resource-controls(5). This seemed more complicated than I had thought. The examples I found are more related to CPU or memory controls than to the number of file descriptors. I did see a reference to something similar in resource-controls(5) that looked interesting: process.max-file-size=(priv,5G,deny) for the maximum file size.
Some hope.

Using a Solaris Zone on a Solaris 11.3 system, I did a lot of testing. I could not get it to work. When I reached out to JeffV, he mentioned the three privileges: System, Privileged, and Basic. That was the key. Once I got it working, the steps are rather simple!

The Default Limits

I gave the Solaris Zone I created for this test the nodename "limit" so I can tell it from the other Zones I have installed, and to remind me in the Zone what I am using it for.

Creating a User With Default Limits

root@limit:~# ulimit -n
256
root@limit:~# groupadd -g 1001 test
root@limit:~# useradd -u 1001 -g test -d /export/home/steffen -m steffen
80 blocks
root@limit:~# passwd steffen
New Password: xxx
Re-enter new Password: xxx
passwd: password successfully changed for steffen
root@limit:~#

Testing the User

After creating my test user, I log in as that user.

root@limit:~# ssh steffen@localhost
Password: xxx
Last login: Tue Oct 4 14:09:37 2016 from localhost
Oracle Corporation SunOS 5.11 11.3 August 2016
steffen@limit:~$ ulimit -n
256
steffen@limit:~$

A quick test shows the default number of file descriptors to be 256 as expected. And we know from experience that as an unprivileged (non-root) user we can only set this so much higher.

steffen@limit:~$ ulimit -n 1024
steffen@limit:~$ ulimit -n
1024
steffen@limit:~$ ulimit -n 1025
-bash: ulimit: open files: cannot modify limit: Not owner
steffen@limit:~$ ulimit -n
1024
steffen@limit:~$

I can increase the default number of file descriptors per process to 1024, but not higher.

Using Projects to Set a Higher Limit

Now that I have this working, the steps to create a project with a limit are very easy. I already have the user from above.

root@limit:~# projadd -K "process.max-file-descriptor=(basic,4096,deny)" proj.files
root@limit:~# usermod -K "project=proj.files" steffen
root@limit:~# cat /etc/user_attr
#
# Copyright (c) 1999, 2013, Oracle and/or its affiliates. All rights reserved.
#
# The system provided entries are stored in different files
# under "/etc/user_attr.d". They should not be copied to this file.
#
# Only local changes should be stored in this file.
#
root::::type=role
...
steffen::::project=proj.files
root@limit:~#

The first step is to create the project. I am setting the number of file descriptors to 4096. The privilege level I mentioned before is key. It has to be "basic" since the user "steffen" has no enhanced privileges.

The second step is to modify the user to set the default project to my new project "proj.files". I could have done this to the default project or to another project that already exists. I would not use the default project myself as it would allow too many processes to have a larger limit.

So how does this look?

root@limit:~# ssh steffen@localhost
Password:
Last login: Tue Oct 4 14:09:49 2016 from localhost
Oracle Corporation SunOS 5.11 11.3 August 2016
steffen@limit:~$ ulimit -n
4096
steffen@limit:~$

What surprised me a bit, I guess since I don't really understand the Project facility in Solaris, is that the number of file descriptors is set higher automatically. I was expecting to have to run "ulimit" in a ".bashrc" script.

Conclusion

So there you have it. It is rather simple to create a new project and set one or more limits. Depending on how this goes with the customer, I may come back and show a project with several limits.

Steffen

Appreciations

Thanks to JeffV and JimH, who responded quickly to my calls and emails. Both mentioned "prctl(1)", which works on an active process. Because of that, I did notice the "rctl" in the zonecfg(1M) manual page.

Revision History
(Other than minor typographical changes)
2016.10.04: Posted
2016.10.04: Created
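As a companion to the ulimit tests in this post: the shell distinguishes a soft limit (what "ulimit -n" reports by default) from a hard limit (the ceiling an unprivileged user cannot exceed). The following is a minimal sketch for looking at both, assuming a POSIX-style shell such as bash. The function names show_nofile_limits and try_raise_nofile are hypothetical helpers of my own, and I make no claim here about how these values map onto Solaris resource controls.

```shell
#!/bin/sh
# show_nofile_limits: print the soft and hard per-process limits on open files.
show_nofile_limits() {
    echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
}

# try_raise_nofile N: try to set the soft limit to N.
# The attempt runs in a subshell, so the calling shell's limit is unchanged.
try_raise_nofile() {
    ( ulimit -Sn "$1" 2>/dev/null && echo "raised to $(ulimit -Sn)" ) \
        || echo "cannot raise to $1"
}

# Example use:
#   show_nofile_limits
#   try_raise_nofile 4096
```

Because the test runs in a subshell, it is safe to probe several values without changing the limit of the login shell.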


Solaris

Using an https Keystore for ZFS Encryption

Overview

Recently I wrote about how to enable ZFS encryption for your home directory, in a way that accepts the wrapping key when first logging into the system. This works when it is your home directory. But what about other file systems or pools that you want to encrypt and mount without intervention after a system reboot? This discussion is about how to provide a wrapping key using an HTTPS service. For more details look at zfs_encrypt(1M).

As I often do, these examples use Solaris Zones. I am running Solaris 11.3 SRU 07. One Zone is the HTTPS server, and the second Zone is where I create the ZFS file systems.

Configuring the HTTPS Service

Installing the Apache Web Server

The first step is to install the Apache web server package. Zones use the package group solaris-small-server by default, which does not include the Apache web server package.

root@myhttps:~# pkg install apache-22
           Packages to install:  7
           Mediators to change:  3
            Services to change:  2
       Create boot environment: No
Create backup boot environment: No

DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
Completed                                7/7     1035/1035      9.7/9.7 28.1M/s

PHASE                                          ITEMS
Installing new actions                     1241/1241
Updating package state database                 Done
Updating package cache                           0/0
Updating image state                            Done
Creating fast lookup database                   Done
Updating package cache                           3/3
root@myhttps:~#

Configuring SSL in Apache httpd.conf

I must extend the default HTTP configuration to enable the SSL service. A good configuration file is in the "samples-conf.d" directory.

root@myhttps:~# cd /etc/apache2/2.2/
root@myhttps:/etc/apache2/2.2# ls
conf.d  envvars  httpd.conf  magic  mime.types  original  samples-conf.d
root@myhttps:/etc/apache2/2.2#

I like to save the original, especially if I want to show the differences easily.
And I will append some comments to see where "ssl.conf" starts.

root@myhttps:/etc/apache2/2.2# cp -p httpd.conf Httpd.conf.orig
root@myhttps:/etc/apache2/2.2# echo "###
> ### End of Original httpd.conf
> ###
> " >> httpd.conf
root@myhttps:/etc/apache2/2.2# cat samples-conf.d/ssl.conf >> httpd.conf
root@myhttps:/etc/apache2/2.2# cp -p httpd.conf Httpd.conf.ssl.orig
root@myhttps:/etc/apache2/2.2#

After modifying for my configuration here are the differences.

root@myhttps:/etc/apache2/2.2# diff httpd.conf Httpd.conf.ssl.orig
47c47
< #Listen 80
---
> Listen 80
107c107
< ServerName 192.168.1.180
---
> ServerName 127.0.0.1
533,534c533
< #ServerName 127.0.0.1:443
< ServerName 192.168.1.180:443
---
> ServerName 127.0.0.1:443
553,554c552
< #SSLCertificateFile "/etc/apache2/2.2/server.crt"
< SSLCertificateFile "/etc/apache2/2.2/host180.crt"
---
> SSLCertificateFile "/etc/apache2/2.2/server.crt"
564,565c562
< #SSLCertificateKeyFile "/etc/apache2/2.2/server.key"
< SSLCertificateKeyFile "/etc/apache2/2.2/host180.key"
---
> SSLCertificateKeyFile "/etc/apache2/2.2/server.key"
root@myhttps:/etc/apache2/2.2#

I replaced "server." with "host180." because I want to make managing my files easier. You can leave the "server" version and update the file names below. I also turned off port 80, for http access, to prevent sending data in clear text.

Creating the Self Signed Root Certificate

First step is to create a Root Certificate. I am putting the files into the "CA.d" directory I create so I can easily see the difference between the CA files and later web server certificate(s).
I am using the prefix "host180CA" to identify anything having to do with the Root Certificate.

root@myhttps:/etc/apache2/2.2# mkdir CA.d
root@myhttps:/etc/apache2/2.2# openssl genrsa -des3 -out CA.d/host180CA.key 2048
Generating RSA private key, 2048 bit long modulus
.................+++
....................................+++
e is 65537 (0x10001)
Enter pass phrase for CA.d/host180CA.key: XXX
Verifying - Enter pass phrase for CA.d/host180CA.key: XXX
root@myhttps:/etc/apache2/2.2# openssl req -x509 -new -nodes -key CA.d/host180CA.key \
-sha256 -days 1024 -out CA.d/host180CA.pem
Enter pass phrase for CA.d/host180CA.key:
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) []:US
State or Province Name (full name) []:NJ
Locality Name (eg, city) []:MyTown
Organization Name (eg, company) []:Oracle
Organizational Unit Name (eg, section) []:SE
Common Name (e.g. server FQDN or YOUR name) []:192.168.1.180
Email Address []:steffen@steffen.steffen
root@myhttps:/etc/apache2/2.2#

Creating the Server Certificate

Now I create the certificates for this web server. I will be referencing the CA.d files from above. The server certificates have the prefix "host180" because my IP address is 192.168.1.180.
I am doing this to make it easier to recognize files.

root@myhttps:/etc/apache2/2.2# openssl genrsa -out host180.key 2048
Generating RSA private key, 2048 bit long modulus
........................+++
...........................................+++
e is 65537 (0x10001)
root@myhttps:/etc/apache2/2.2# openssl req -new -key host180.key -out host180.csr
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) []:US
State or Province Name (full name) []:NJ
Locality Name (eg, city) []:MyHost180
Organization Name (eg, company) []:Oracle
Organizational Unit Name (eg, section) []:SEweb
Common Name (e.g. server FQDN or YOUR name) []:192.168.1.180
Email Address []:host180@steffen.steffen

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
root@myhttps:/etc/apache2/2.2# openssl x509 -req -in host180.csr -CA CA.d/host180CA.pem \
-CAkey CA.d/host180CA.key -CAcreateserial -out host180.crt -days 1000 -sha256
Signature ok
subject=/C=US/ST=NJ/L=MyHost180/O=Oracle/OU=SEweb/CN=192.168.1.180/emailAddress=host180@steffen.steffen
Getting CA Private Key
Enter pass phrase for CA.d/host180CA.key: XXX
root@myhttps:/etc/apache2/2.2#

Here are all the files that end up getting created.

root@myhttps:/etc/apache2/2.2# ls -l
total 398
drwxr-xr-x   2 root     root           4 Jun  8 17:33 CA.d
-rw-r--r--   1 root     root          17 Jun  8 17:39 CA.srl
drwxr-xr-x   2 root     sys            4 Jun  8 17:20 conf.d
-rw-r--r--   1 root     bin          896 Jun  8 17:20 envvars
-rw-r--r--   1 root     root        1306 Jun  8 17:39 host180.crt
-rw-r--r--   1 root     root        1058 Jun  8 17:37 host180.csr
-rw-r--r--   1 root     root        1675 Jun  8 17:36 host180.key
-rw-r--r--   1 root     bin        26114 Jun  8 17:29 httpd.conf
-rw-r--r--   1 root     bin        13673 Jun  8 17:20 Httpd.conf.orig
-rw-r--r--   1 root     bin        25975 Jun  8 17:26 Httpd.conf.ssl.orig
-rw-r--r--   1 root     bin        12958 Jun  8 17:20 magic
-rw-r--r--   1 root     bin        53011 Jun  8 17:20 mime.types
drwxr-xr-x   2 root     sys            3 Jun  8 17:20 original
drwxr-xr-x   2 root     sys           15 Jun  8 17:20 samples-conf.d
root@myhttps:/etc/apache2/2.2# ls CA.d/
host180CA.key  host180CA.pem
root@myhttps:/etc/apache2/2.2#

Creating the ZFS Encryption Wrapping Key

I need a key that ZFS will use as the wrapping key. This is a short one. You may have some mechanism to create a longer one.

root@myhttps:/etc/apache2/2.2# pktool genkey keystore=file \
outkey=/var/apache2/2.2/htdocs/zfs-aes-256.key keytype=aes keylen=256
root@myhttps:/etc/apache2/2.2# chmod +r /var/apache2/2.2/htdocs/zfs-aes-256.key
root@myhttps:/etc/apache2/2.2# ls -l /var/apache2/2.2/htdocs/zfs-aes-256.key
-r--r--r--   1 root     root          32 Jun  8 17:41 /var/apache2/2.2/htdocs/zfs-aes-256.key
root@myhttps:/etc/apache2/2.2#

By default the key is readable only by the user that creates it, in this case "root".
Since Apache runs as "daemon" by default, you will not be able to access the key over HTTP/HTTPS unless you make it readable by all.

Starting the Web Server

Now that I have done all my configurations, let's start it up.

root@myhttps:/etc/apache2/2.2# svcs *apache*
STATE          STIME    FMRI
disabled       17:20:17 svc:/network/http:apache22
root@myhttps:/etc/apache2/2.2# svcadm enable apache22
root@myhttps:/etc/apache2/2.2# svcs *apache*
STATE          STIME    FMRI
online         17:44:01 svc:/network/http:apache22
root@myhttps:/etc/apache2/2.2#

One final check to make sure all services are running fine.

root@myhttps:/etc/apache2/2.2# svcs -x
root@myhttps:/etc/apache2/2.2# netstat -anf inet
...
TCP: IPv4
   Local Address        Remote Address     Swind  Send-Q  Rwind  Recv-Q    State
-------------------- -------------------- ------- ------ ------- ------ -----------
127.0.0.1.5999             *.*                  0      0  128000      0 LISTEN
      *.111                *.*                  0      0  128000      0 LISTEN
      *.*                  *.*                  0      0  128000      0 IDLE
      *.111                *.*                  0      0  128000      0 LISTEN
      *.*                  *.*                  0      0  128000      0 IDLE
      *.22                 *.*                  0      0  128000      0 LISTEN
      *.22                 *.*                  0      0  128000      0 LISTEN
127.0.0.1.4999             *.*                  0      0  128000      0 LISTEN
127.0.0.1.25               *.*                  0      0  128000      0 LISTEN
127.0.0.1.587              *.*                  0      0  128000      0 LISTEN
      *.*                  *.*                  0      0  128000      0 IDLE
      *.443                *.*                  0      0  128000      0 LISTEN
      *.*                  *.*                  0      0  128000      0 IDLE
root@myhttps:/etc/apache2/2.2#

Everything looks good: the server is listening on port 443 and no longer on port 80. On to the Zone where I will do the ZFS work.

Creating an Encrypted File System using a Keystore via HTTPS

Adding Self Signed Certificate to an HTTPS Client

I need to do two steps to be able to access the https service. First, I need to load the certificate for the web server into the local CA directory. I get this certificate using the "openssl" command.

Here is the complete output. To end the command, enter "^D" (Control-D).
(It doesn't show up in the output.)root@ezfs:~# openssl s_client -connect 192.168.1.180:443 CONNECTED(00000003)depth=0 C = US, ST = NJ, L = MyHost180, O = Oracle, OU = SEweb, CN = 192.168.1.180, emailAddress = host180@steffen.steffenverify error:num=20:unable to get local issuer certificateverify return:1depth=0 C = US, ST = NJ, L = MyHost180, O = Oracle, OU = SEweb, CN = 192.168.1.180, emailAddress = host180@steffen.steffenverify error:num=21:unable to verify the first certificateverify return:1---Certificate chain 0 s:/C=US/ST=NJ/L=MyHost180/O=Oracle/OU=SEweb/CN=192.168.1.180/emailAddress=host180@steffen.steffen i:/C=US/ST=NJ/L=MyTown/O=Oracle/OU=SE/CN=192.168.1.180/emailAddress=steffen@steffen.steffen---Server certificate-----BEGIN CERTIFICATE-----MIIDljCCAn4CCQCfpX0OhjSiMTANBgkqhkiG9w0BAQsFADCBiTELMAkGA1UEBhMCVVMxCzAJBgNVBAgMAk5KMQ8wDQYDVQQHDAZNeVRvd24xDzANBgNVBAoMBk9yYWNsZTELMAkGA1UECwwCU0UxFjAUBgNVBAMMDTE5Mi4xNjguMS4xODAxJjAkBgkqhkiG9w0BCQEWF3N0ZWZmZW5Ac3RlZmZlbi5zdGVmZmVuMB4XDTE2MDYwODIxMzkwNloXDTE5MDMwNTIxMzkwNlowgY8xCzAJBgNVBAYTAlVTMQswCQYDVQQIDAJOSjESMBAGA1UEBwwJTXlIb3N0MTgwMQ8wDQYDVQQKDAZPcmFjbGUxDjAMBgNVBAsMBVNFd2ViMRYwFAYDVQQDDA0xOTIuMTY4LjEuMTgwMSYwJAYJKoZIhvcNAQkBFhdob3N0MTgwQHN0ZWZmZW4uc3RlZmZlbjCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAM4wzBQRsqz2lwH1sauTSnx6fpig40EBaFHRLCblvWgAYLgTY1ccV39X2zryjKIMu0vDnBALQOu4/IMZ+7tcZNnSBPrKhqC/YgwWmY0GINvbFG0AQ1aIm/KEqeHLzVhuEAjGc1tUvhWT1MxvooHshtR1KGzZv9gq6fnBprqOz0Es6VNWPLX3rmkryFhlE+tYHOhEgBAMEhiQ6Ait/pORMGG5XRaLkXsXNjrEHD8YXD4VbvPl8GoMQCwSZ9M3DFA18IfpDaB5ByUihhFWV2NxcBBfCqCBd6v0bdh1nyAJ5zhZGmEHYztt13WqiJ315pgyyhXnAne0SDycdRQrxpPs21ECAwEAATANBgkqhkiG9w0BAQsFAAOCAQEAY+2RTjLylWaaHO4xDYGuDW8k2XkxsH+BkRcDtRM0g1iliHgQLSxGqdsKr4fK4WWC7Vbfm0Cbl47T3ny+rNvyT6ac/VhfwI/GDIOGwV+mzoVio5QlZh601gclDv5M4j8633Wr/SCcc8ZFB6FOAfqaLDtZryfHCUbppL2AnSPY6JFQG4Cv5Uo/nTTs4vyL4JwRl/cQNLXY6GCQRMjAwrfdjj2wBczrbEK1qzu0gD4crkB/XpyJFZq32RSvWtE3nVV9GU93ErLYC1BxQHvrYYWVlIv59sIQ4DYec0b/mxs9HnjHVA4sveTg1CjUXRY+eYpPF7OlHa9v2EV4l7T2IB0ZLg==-----EN
D CERTIFICATE----- subject=/C=US/ST=NJ/L=MyHost180/O=Oracle/OU=SEweb/CN=192.168.1.180/emailAddress=host180@steffen.steffenissuer=/C=US/ST=NJ/L=MyTown/O=Oracle/OU=SE/CN=192.168.1.180/emailAddress=steffen@steffen.steffen---No client certificate CA names sent---SSL handshake has read 2055 bytes and written 463 bytes---New, TLSv1/SSLv3, Cipher is DHE-RSA-AES256-GCM-SHA384Server public key is 2048 bitSecure Renegotiation IS supportedCompression: NONEExpansion: NONESSL-Session: Protocol : TLSv1.2 Cipher : DHE-RSA-AES256-GCM-SHA384 Session-ID: 552375FF9881568181BC0DCEBBD238D913DCB55381FD9A2ADED7413B00AC9078 Session-ID-ctx: Master-Key: F8D5B3E7C4FF7B8396FAEC8FAEBA0865E8790335E1A09B9703F217125C5D3EB7220D79E24F4510C35F8E500DFFC1C06D Key-Arg : None PSK identity: None PSK identity hint: None SRP username: None TLS session ticket lifetime hint: 300 (seconds) TLS session ticket: 0000 - c1 35 38 cb eb 88 92 85-28 50 7e c5 cc 4f f8 4d .58.....(P~..O.M 0010 - 64 a7 61 7f 8f bb 09 8b-c3 b6 0b fe a4 1f 50 ce d.a...........P. 0020 - d5 b2 0c 82 97 9a 86 69-d2 76 ea d1 19 f3 40 fb .......i.v....@. 0030 - 0e 95 6b cd 9d e2 09 f5-de 52 bb 14 c7 f9 fc 6f ..k......R.....o 0040 - 1c 39 7f e3 3b 9a 9b 95-be 79 df 39 19 fc f3 6f .9..;....y.9...o 0050 - 6a 12 7a 5b b5 ea 1e 03-6f 44 01 b5 74 8b 7c 4f j.z[....oD..t.|O 0060 - 7a 61 8a d0 39 bb 7f 72-f1 99 81 57 57 2d b3 e1 za..9..r...WW-.. 0070 - 70 82 1b 87 33 35 95 15-62 05 07 46 bc 6f ab f1 p...35..b..F.o.. 0080 - c6 06 5a c3 4d 86 9d d0-db 2f 9a d4 70 97 98 9b ..Z.M..../..p... 0090 - 41 74 bb dd 03 33 7c dd-c2 20 ad bc ac c1 29 ad At...3|.. ....). 00a0 - de dd 72 8a 8b 32 74 10-8d 9b 45 38 f5 27 a3 d3 ..r..2t...E8.'.. 00b0 - e1 f6 d1 d6 0b 07 6e 08-cf 76 2c 7a 51 25 c6 b3 ......n..v,zQ%.. Start Time: 1465422516 Timeout : 300 (sec) Verify return code: 21 (unable to verify the first certificate)---DONEroot@ezfs:~# I need the text between the "BEGIN" and "END CERTIFICATE" lines, including those lines. 
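Pulling out just the certificate block can also be done with a filter instead of an editor. This is a minimal sketch, assuming the certificate appears in standard PEM form with the BEGIN and END marker lines on lines of their own; the function name extract_cert is hypothetical and my own.

```shell
#!/bin/sh
# extract_cert: read text on stdin and print only the PEM certificate
# block(s), including the BEGIN CERTIFICATE and END CERTIFICATE lines.
extract_cert() {
    sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p'
}

# Example use (address as in this post). Redirecting stdin from /dev/null
# gives s_client an immediate end-of-file, so no Control-D is needed:
#   openssl s_client -connect 192.168.1.180:443 </dev/null 2>/dev/null \
#       | extract_cert > /tmp/host180.pem
```

The range address in sed prints every line from the first marker through the second and drops everything else, which is the same result as trimming the file by hand.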
I send the output to a file, and then remove the content except the "CERTIFICATE" part.root@ezfs:~# openssl s_client -connect 192.168.1.180:443 > /tmp/host180.pem depth=0 C = US, ST = NJ, L = MyHost180, O = Oracle, OU = SEweb, CN = 192.168.1.180, emailAddress = host180@steffen.steffenverify error:num=20:unable to get local issuer certificateverify return:1depth=0 C = US, ST = NJ, L = MyHost180, O = Oracle, OU = SEweb, CN = 192.168.1.180, emailAddress = host180@steffen.steffenverify error:num=21:unable to verify the first certificateverify return:1DONEroot@ezfs:~# root@ezfs:~# vi /tmp/host180.pem root@ezfs:~# root@ezfs:~# cat /tmp/host180.pem -----BEGIN CERTIFICATE-----MIIDljCCAn4CCQCfpX0OhjSiMTANBgkqhkiG9w0BAQsFADCBiTELMAkGA1UEBhMCVVMxCzAJBgNVBAgMAk5KMQ8wDQYDVQQHDAZNeVRvd24xDzANBgNVBAoMBk9yYWNsZTELMAkGA1UECwwCU0UxFjAUBgNVBAMMDTE5Mi4xNjguMS4xODAxJjAkBgkqhkiG9w0BCQEWF3N0ZWZmZW5Ac3RlZmZlbi5zdGVmZmVuMB4XDTE2MDYwODIxMzkwNloXDTE5MDMwNTIxMzkwNlowgY8xCzAJBgNVBAYTAlVTMQswCQYDVQQIDAJOSjESMBAGA1UEBwwJTXlIb3N0MTgwMQ8wDQYDVQQKDAZPcmFjbGUxDjAMBgNVBAsMBVNFd2ViMRYwFAYDVQQDDA0xOTIuMTY4LjEuMTgwMSYwJAYJKoZIhvcNAQkBFhdob3N0MTgwQHN0ZWZmZW4uc3RlZmZlbjCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAM4wzBQRsqz2lwH1sauTSnx6fpig40EBaFHRLCblvWgAYLgTY1ccV39X2zryjKIMu0vDnBALQOu4/IMZ+7tcZNnSBPrKhqC/YgwWmY0GINvbFG0AQ1aIm/KEqeHLzVhuEAjGc1tUvhWT1MxvooHshtR1KGzZv9gq6fnBprqOz0Es6VNWPLX3rmkryFhlE+tYHOhEgBAMEhiQ6Ait/pORMGG5XRaLkXsXNjrEHD8YXD4VbvPl8GoMQCwSZ9M3DFA18IfpDaB5ByUihhFWV2NxcBBfCqCBd6v0bdh1nyAJ5zhZGmEHYztt13WqiJ315pgyyhXnAne0SDycdRQrxpPs21ECAwEAATANBgkqhkiG9w0BAQsFAAOCAQEAY+2RTjLylWaaHO4xDYGuDW8k2XkxsH+BkRcDtRM0g1iliHgQLSxGqdsKr4fK4WWC7Vbfm0Cbl47T3ny+rNvyT6ac/VhfwI/GDIOGwV+mzoVio5QlZh601gclDv5M4j8633Wr/SCcc8ZFB6FOAfqaLDtZryfHCUbppL2AnSPY6JFQG4Cv5Uo/nTTs4vyL4JwRl/cQNLXY6GCQRMjAwrfdjj2wBczrbEK1qzu0gD4crkB/XpyJFZq32RSvWtE3nVV9GU93ErLYC1BxQHvrYYWVlIv59sIQ4DYec0b/mxs9HnjHVA4sveTg1CjUXRY+eYpPF7OlHa9v2EV4l7T2IB0ZLg==-----END CERTIFICATE-----root@ezfs:~# I copy the file into the Certificate Authority 
directory.

root@ezfs:~# cp /tmp/host180.pem /etc/certs/CA/
root@ezfs:~#

Because this is a Self Signed Certificate, I also need the CA certificate file I used to sign certificates. That is on the web server.

root@ezfs:~# scp guest@192.168.1.180:/etc/apache2/2.2/CA.d/host180CA.pem /tmp
The authenticity of host '192.168.1.180 (192.168.1.180)' can't be established.
RSA key fingerprint is 1b:62:9b:5c:42:f9:44:c9:d1:81:99:c4:e3:c0:3f:0f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.1.180' (RSA) to the list of known hosts.
Password: XXX
host180CA.pem 100% |**********************************************| 1415 00:00
root@ezfs:~# cp /tmp/host180CA.pem /etc/certs/CA/
root@ezfs:~#

With both files in the directory, I refresh the CA service so it reads the files.

root@ezfs:~# svcs *cert*
STATE          STIME    FMRI
online         17:01:51 svc:/system/ca-certificates:default
root@ezfs:~# svcadm refresh ca-certificates
root@ezfs:~# svcs *cert*
STATE          STIME    FMRI
online         17:53:56 svc:/system/ca-certificates:default
root@ezfs:~#

An easy way I found to verify that this works is the "wget(1)" command. Its output is also useful in understanding when my certificates are not working.

root@ezfs:~# (cd /tmp ; wget https://192.168.1.180/zfs-aes-256.key )
--2016-06-08 17:54:10--  https://192.168.1.180/zfs-aes-256.key
Connecting to 192.168.1.180:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 32 [text/plain]
Saving to: ‘zfs-aes-256.key’

zfs-aes-256.key 100%[=================================>] 32 --.-KB/s in 0s

2016-06-08 17:54:10 (3.57 MB/s) - ‘zfs-aes-256.key’ saved [32/32]

root@ezfs:~# rm /tmp/zfs-aes-256.key
root@ezfs:~#

I delete the file right away as I only want it accessible via https.

Create the ZFS File Systems

Now to the real task at hand, creating a ZFS file system with encryption and the wrapping key accessed using https. I also create one that requires manual input to show the difference. I am using the "rpool/export" directory as my base.

root@ezfs:~# zfs list
NAME                             USED  AVAIL  REFER  MOUNTPOINT
rpool                           61.6M   156G   144K  /rpool
rpool/ROOT                      58.1M   156G   144K  legacy
rpool/ROOT/solaris-0            58.1M   156G  1.48G  /
rpool/ROOT/solaris-0/var        2.60M   156G   174M  /var
rpool/VARSHARE                     3M   156G  2.76M  /var/share
rpool/VARSHARE/pkg               296K   156G   152K  /var/share/pkg
rpool/VARSHARE/pkg/repositories  144K   156G   144K  /var/share/pkg/repositories
rpool/export                     360K   156G   152K  /export
rpool/export/home                256K   156G   152K  /export/home
rpool/export/home/guest          152K   156G   152K  /export/home/guest
root@ezfs:~# zfs create -o encryption=on \
-o keysource=passphrase,prompt rpool/export/prompt
Enter passphrase for 'rpool/export/prompt': XXX
Enter again: XXX
root@ezfs:~# zfs create -o encryption=on \
-o keysource=raw,https://192.168.1.180:443/zfs-aes-256.key rpool/export/https
root@ezfs:~#

I put some data into the two file systems to test with later.

root@ezfs:~# date > /export/https/date
root@ezfs:~# date > /export/prompt/date
root@ezfs:~# ls /export/*
/export/home:
guest

/export/https:
date

/export/prompt:
date
root@ezfs:~# more /export/*/date
::::::::::::::
/export/https/date
::::::::::::::
Wednesday, June 8, 2016 05:59:59 PM EDT
::::::::::::::
/export/prompt/date
::::::::::::::
Wednesday, June 8, 2016 06:00:07 PM EDT
root@ezfs:~# zfs list
NAME                             USED  AVAIL  REFER  MOUNTPOINT
rpool                           61.9M   156G   144K  /rpool
rpool/ROOT                      58.1M   156G   144K  legacy
rpool/ROOT/solaris-0            58.1M   156G  1.48G  /
rpool/ROOT/solaris-0/var        2.60M   156G   174M  /var
rpool/VARSHARE                     3M   156G  2.76M  /var/share
rpool/VARSHARE/pkg               296K   156G   152K  /var/share/pkg
rpool/VARSHARE/pkg/repositories  144K   156G   144K  /var/share/pkg/repositories
rpool/export                     720K   156G   168K  /export
rpool/export/home                256K   156G   152K  /export/home
rpool/export/home/guest          152K   156G   152K  /export/home/guest
rpool/export/https               172K   156G   172K  /export/https
rpool/export/prompt              172K   156G   172K  /export/prompt
root@ezfs:~# halt

[Connection to zone 'ezfs' pts/10 closed]

Validating Hands-Free Operation After a Reboot

The keys for encrypted ZFS file systems are only required when the file systems are first accessed. I am using Solaris Zones, and don't want to reboot my system. So to simulate a reboot I "unload" the keys for all the file systems in the zone. (There is only one file system with a key; however, this would unload all of them if there were more than one.)

root@global# zfs key -u -r pool1/zones/ezfs
root@global# zoneadm -z ezfs boot
root@global#

Test and Manually Mount the "prompt" File System

Once the zone boots, let's check what data is available.

root@global# zlogin ezfs
[Connected to zone 'ezfs' pts/10]
Last login: Wed Jun 8 17:05:53 2016 on pts/10
Oracle Corporation SunOS 5.11 11.3 March 2016
root@ezfs:~# ls /export/*
/export/home:
guest

/export/https:
date

/export/prompt:
root@ezfs:~# more /export/*/date
Wednesday, June 8, 2016 05:59:59 PM EDT
root@ezfs:~#

As you can see, only the "https" directory shows the "date" file. I manually mount the "prompt" file system.

root@ezfs:~# zfs mount rpool/export/prompt
Enter passphrase for 'rpool/export/prompt': XXX
root@ezfs:~# more /export/*/date
::::::::::::::
/export/https/date
::::::::::::::
Wednesday, June 8, 2016 05:59:59 PM EDT
::::::::::::::
/export/prompt/date
::::::::::::::
Wednesday, June 8, 2016 06:00:07 PM EDT
root@ezfs:~#

Now both "date" files are available.

Summary

This was a quick and simple walk-through of the steps to automatically mount an encrypted file system without using a local keysource file. Thank you and good luck with your ZFS experiences!

Steffen

Appreciations

Thanks to DarrenM for his repeated replies to my email requests for help, and to BartS for his quick reply as well.

Thank you to "The Data Center Overlords" for the high level steps that got me started on how to set up my own Certificate Authority and server certificates.

Revision History
(Other than minor typographical changes)
2016.06.08: Posted
2016.06.08: Created
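For repeated testing, the certificate steps from this post can be run without the interactive prompts by passing the Distinguished Name with the openssl "-subj" option. The following is a minimal sketch under several assumptions of mine: the function names make_test_pki and check_chain, the file names, and the subject strings are all hypothetical, the CA key is left unencrypted (unlike the -des3 key in the post), and the 30-day lifetime is only for throwaway tests.

```shell
#!/bin/sh
# make_test_pki DIR
# Create a throwaway root CA in DIR/CA.d and a server certificate in DIR
# signed by that CA. The CA and server subjects are deliberately different;
# a server certificate whose subject equals its issuer looks self-signed.
make_test_pki() {
    dir=$1
    mkdir -p "$dir/CA.d" || return 1

    # Root CA: private key plus self-signed certificate
    openssl genrsa -out "$dir/CA.d/testCA.key" 2048 2>/dev/null || return 1
    openssl req -x509 -new -nodes -key "$dir/CA.d/testCA.key" -sha256 -days 30 \
        -subj "/O=Example/OU=CA/CN=Example Test CA" \
        -out "$dir/CA.d/testCA.pem" || return 1

    # Server: private key, signing request, certificate signed by the CA
    openssl genrsa -out "$dir/server.key" 2048 2>/dev/null || return 1
    openssl req -new -key "$dir/server.key" \
        -subj "/O=Example/OU=web/CN=192.168.1.180" \
        -out "$dir/server.csr" || return 1
    openssl x509 -req -in "$dir/server.csr" -CA "$dir/CA.d/testCA.pem" \
        -CAkey "$dir/CA.d/testCA.key" -CAcreateserial \
        -out "$dir/server.crt" -days 30 -sha256 2>/dev/null
}

# check_chain DIR: succeed only if the server certificate verifies
# against the CA certificate created above.
check_chain() {
    openssl verify -CAfile "$1/CA.d/testCA.pem" "$1/server.crt" >/dev/null 2>&1
}

# Example use:
#   make_test_pki /tmp/pki.test && check_chain /tmp/pki.test && echo "chain OK"
```

Checking the chain with "openssl verify" before copying files to a client is a quick way to catch a mismatch between the CA file and the server certificate.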


Solaris

Encrypting my Home Directory on ZFS

Overview

I like to run Solaris on my work desktops because I have all the Solaris features at my fingertips. This includes the manual pages, Solaris Zones, and Solaris networking including VNICs, and I just find the Solaris GNOME desktop to be the easiest for me to use for basic email, browsing, terminal windows, and the like.

Because I might be putting some information on my desktop that I'd rather not leave behind when the disk drive leaves, I make an effort to encrypt my home directory. Because I do this relatively infrequently, I don't remember the steps, so I search for and reference Darren Moffat's blog. Unfortunately, it was written in 2011 when Solaris 11 11/11 delivered ZFS encryption, and it seems some files have changed slightly.

To make it easier for me to reference, and to add some additional features, I did some repeated testing of modifications to the PAM configuration and am posting the steps in this blog entry. I make no effort to explain PAM, as I am not that versed in it.

The Default Configuration

I am using one of my desktops to write this, and I will use a Solaris Zone to show what a fresh installation looks like. Darren's example shows how to enable encryption with the GNOME Display Manager (GDM). I will extend this to work with console or ssh login. Testing the GDM configuration does require me to log out of my desktop, and is a bit more intrusive for me to test and show. Testing and documenting console and ssh logins are easy with a Zone.

root@pamzone:~# cd /etc/pam.d
root@pamzone:/etc/pam.d# ls
cron  gdm-autologin  other   pfexec
cups  login          passwd  tpdlogin
root@pamzone:/etc/pam.d#

Here are the default files in a Solaris 11.3 default installation using the Live Media. The files I will be changing are "login" and "other".
In addition, I will be adding a gdm file that is not yet there.

Modifying the Configuration

Because I am a bit conservative, even though this is a Zone, I will make a best effort to be able to revert to the original configuration. Also, I can highlight differences.

root@pamzone:/etc/pam.d# beadm create initial
root@pamzone:/etc/pam.d# cp -p login login.orig
root@pamzone:/etc/pam.d# cp -p other other.orig
root@pamzone:/etc/pam.d#

I modify the "login" and "other" files based on the changes Darren put into the "/etc/pam.conf" file. The GDM specific entries go into "gdm".

root@pamzone:/etc/pam.d# diff login login.orig
14,16d13
< # 2016.05.04 Added for encrypting user's home directory
< # Create a new home directory if it does not exist
< auth required pam_zfs_key.so.1 create homes=rpool/export/home
root@pamzone:/etc/pam.d# diff other other.orig
14,16d13
< # 2016.05.04 Added for encrypting user's home directory
< # This allows new account without coming in on console
< auth required pam_zfs_key.so.1 create homes=rpool/export/home
49,51d45
< # 2016.05.04 Added for encrypting user's home directory
< # Update the ZFS encryption wrapping key when the user changes their password
< password requisite pam_zfs_key.so.1 homes=rpool/export/home
root@pamzone:/etc/pam.d# cat gdm
# 2016.05.04 Created based on https://blogs.oracle.com/darren/entry/user_user_home_directory_encryption
auth requisite pam_authtok_get.so.1
auth required pam_unix_cred.so.1
auth required pam_unix_auth.so.1
# 2016.05.04 Added for encrypting user's home directory
# Create a new home directory if it does not exist
auth required pam_zfs_key.so.1 create homes=rpool/export/home
# 2016.05.04 End of ZFS encrytion changes
auth required pam_unix_auth.so.1

While Darren shows putting the ZFS encryption features into "/etc/pam.conf", I am putting them into the per-service files in "/etc/pam.d/" as the /etc/pam.conf comments recommend.
It took some testing and retesting for me to get this fully working, which is why I am creating this blog.

Modifying the Configuration

The way to test this is to create a new user. Since I am doing this in a Solaris Zone I can only test text console and network logins. I will demonstrate both, and come back later to show GDM.

The first steps are to create the users and to force them to enter a new password when they first log in.

root@pamzone:~# useradd -g 10 -c "user1" -d /export/home/user1 user1
root@pamzone:~# useradd -g 10 -c "user2" -d /export/home/user2 user2
root@pamzone:~# passwd user1
New Password: xxx
Re-enter new Password: xxx
passwd: password successfully changed for user1
root@pamzone:~# passwd user2
New Password: xxx
Re-enter new Password: xxx
passwd: password successfully changed for user2
root@pamzone:~# passwd -f user1
passwd: password information changed for user1
root@pamzone:~# passwd -f user2
passwd: password information changed for user2
root@pamzone:~#

The "-f" option expires the password, which forces the user to enter a new one on their next login. Thus only the user knows the password for the wrapping key.

Testing the New Users

Now I will log into the Zone's console from the Global Zone to show the console login step.

admin@global:~$ pfexec zlogin -C pamzone
[Connected to zone 'pamzone' console]

pamzone console login: user1
Password: xxx
Choose a new password.
New Password: xxx
Re-enter new Password: xxx
login: password successfully changed for user1
Creating home directory with encryption=on.
Your login password will be used as the wrapping key.
Oracle Corporation SunOS 5.11 11.3 February 2016
-bash-4.1$ pwd
/export/home/user1
-bash-4.1$ /usr/sbin/zfs get encryption rpool/export/home/user1
NAME                     PROPERTY    VALUE  SOURCE
rpool/export/home/user1  encryption  on     local
-bash-4.1$

As you can see, a home directory is created automatically, and encryption is set to "on".

The second test is to log in remotely.
I am simulating that by going to localhost just for convenience.root@pamzone:~# ssh user2@localhost The authenticity of host 'localhost (::1)' can't be established.RSA key fingerprint is 1d:e5:ff:2d:1f:b2:db:a0:0a:ff:3b:53:db:e6:3c:68.Are you sure you want to continue connecting (yes/no)? yesWarning: Permanently added 'localhost' (RSA) to the list of known hosts.Password: xxxWarning: Your password has expired, please change it now.New Password: xxxRe-enter new Password: xxxsshd-kbdint: password successfully changed for user2Creating home directory with encryption=on.Your login password will be used as the wrapping key.Oracle Corporation SunOS 5.11 11.3 February 2016-bash-4.1$ -bash-4.1$ pwd /export/home/user2 -bash-4.1$ /usr/sbin/zfs get encryption rpool/export/home/user1 NAME PROPERTY VALUE SOURCErpool/export/home/user1 encryption on local -bash-4.1$ Again, the ZFS encryption property validates that encryption is on.Changing a PasswordIt is good to know I can have my home directory encrypted automatically when I log in the first time. What happens when it is time for me to change my password? Let's see.-bash-4.1$ passwd passwd: Changing password for user2Enter existing login password: xxxNew Password: xxxRe-enter new Password: xxxpasswd: password successfully changed for user2ZFS Key change for rpool/export/home/user2 successful -bash-4.1$ As you can see, the ZFS wrapping key is updated when I run the "passwd(1)" command.Mounting an Encrypted File System/Home Directory on RebootThe above steps created and mounted the users' home directories. Let us take a look what happens on a reboot. 
The experience is different in a Zone reboot than it is on a system reboot.root@auth:~# zfs get -r encryption rpool/export/home NAME PROPERTY VALUE SOURCErpool/export/home encryption off -rpool/export/home/guest encryption off -rpool/export/home/user1 encryption on localrpool/export/home/user2 encryption on localroot@auth:~# root@auth:~# ls /export/home/*/ /export/home/guest/:/export/home/user1/:test1/export/home/user2/:test2root@auth:~# root@auth:~# reboot [Connection to zone 'auth' pts/3 closed]root@global:~# ...root@global:~# zlogin auth [Connected to zone 'auth' pts/3]Oracle CorporationSunOS 5.1111.3February 2016root@auth:~# root@auth:~# ls /export/home/*/ /export/home/guest/:/export/home/user1/:test1/export/home/user2/:test2 root@auth:~# root@auth:~# ls /export/home/*/test* /export/home/user1/test1 /export/home/user2/test2 root@auth:~# Though the Zone was rebooted, it is not necessary to provide wrapping keys for the encrypted file systems. Now let's see what happens when the system reboots.root@auth:~# ls /export/home/*/test* /export/home/*/test*: No such file or directory root@auth:~# root@auth:~# ssh user1@localhost Password: xxxOracle Corporation SunOS 5.11 11.3 February 2016-bash-4.1$ ls test1 -bash-4.1$ exit logoutConnection to localhost closed.root@auth:~# root@auth:~# ls /export/home/*/test* /export/home/user1/test1 root@auth:~# root@auth:~# zfs mount rpool/export/home/user2 Enter passphrase for 'rpool/export/home/user2': xxxroot@auth:~# root@auth:~# ls /export/home/*/test* /export/home/user1/test1 /export/home/user2/test2 root@auth:~# Upon a system reboot it is necessary to provide the wrapping key. With the custom PAM setup, when user1 logs in, the key is provided to mount user1's home directory. A second way of providing the key is when performing a "zfs mount" operation. In the second case, the user with the privileges to run the command must know the wrapping key. 
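After a system reboot it can be useful to see at a glance which home directories are encrypted and therefore need a wrapping key before they can be mounted. The sketch below filters the "zfs get -r encryption" output format shown above; the function name encrypted_datasets is hypothetical, and the helper only lists datasets, it does not mount anything.

```shell
#!/bin/sh
# Hypothetical helper: read "zfs get -r encryption <dataset>" output on stdin
# and print the names of the datasets whose encryption property is "on".
encrypted_datasets() {
    awk '$2 == "encryption" && $3 == "on" { print $1 }'
}

# Example on a Solaris system:
#   zfs get -r encryption rpool/export/home | encrypted_datasets
```

Each dataset listed is a candidate for either a user login through the custom PAM setup or a manual "zfs mount" by someone who knows the wrapping key.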
I used the "zfs mount" approach when I accessed my system remotely after a reboot, before I added the extra option to mount on remote access.

Testing the Configuration when Logging In on a Desktop

In order to capture the desktop login experience, I need to enable remote GNOME login. I followed the steps in Setting Up Remote Desktop Access Using VNC in the Solaris 11.3 Desktop Administrator's Guide, along with tips on Calkins' Blog.

Because desktops need access to devices that are not available in a Solaris Zone, I created another user, user3, in the Global Zone on the system. (First I created a new Boot Environment and rebooted into it, so that I can discard the changes to the Global Zone.)

Once everything is set up, I log in. Because I force the user to enter a new password, I am prompted to do so. I enter it (twice) and am told the change is successful. As on the console or over a network connection, the system tells me that the home directory has been created with encryption. I can verify that by looking at the ZFS encryption property. It is on! So this shows the GNOME version of the first login and of changing the password to set the ZFS encryption wrapping key.

In Summary

Now you can encrypt your home directory and make sure the wrapping key is up to date whenever you change your password.

I will add one small item since we are changing the PAM configuration files. In Solaris 11.3, when the system is rebooted for any reason, a new feature requested by some customers reminds the user of the last login. This display disappears after ten seconds or so. To get rid of it more quickly you might have to click on OK. AlanC at The Observatory describes how to turn it off.
Here is how to do that.

root@global:~# grep nowarn /etc/pam.d/gdm
session required pam_unix_session.so.1 nowarn
root@global:~#

Hopefully this all makes someone's life a bit easier and more secure.

Steffen

Revision History
(Other than minor typographical changes)
2016.05.10: Corrected "other" to "gdm" in how to avoid last login warnings
2016.05.06: Small changes to my comments and descriptions
2016.05.05: Posted
2016.05.03: Created

Overview I like to run Solaris on my work desktops because I have all the Solaris features at my fingertips. This included the manual pages, Solaris Zones, Solaris networking including VNICs, and I...

Solaris Networking

Using Aggregations and VLANs with LDoms and Zones

Often people ask how to use link aggregations and VLANs with Oracle VM Server for SPARC (Logical Domains, or LDoms). My goal here is to give a brief description of, and the steps for, configuring a link aggregation in a Service Domain (in this case also the Control Domain) and then setting up different VLAN configurations.

I am showing this with Solaris 11.3, though the steps will work for any Solaris 11 release. Due to networking differences in Solaris 10, the principles still apply, yet the steps will be different.

My Setup

I am using a T4-1 as the system to demonstrate the networking and LDom setup, a T5120 as a remote system on the network, and a Netgear GS716T Smart Switch between the two. The GS716T can do link aggregation, but not IEEE 802.3ad LACP. Solaris supports link aggregation with or without LACP and, since Solaris 11.1, also using Data Link MultiPathing (DLMP). The functionality and steps are almost identical, except for some options when setting up the link aggregation.

I find it useful for my understanding to see not just command line input. I also like to see the output, and validation that the steps I perform actually do something. When doing network configurations, I prefer to see network traffic. This session will include all of that. The networking requires a second system, and I will show the setup of that as well. I hope this is useful for others.

The Remote System

For my network target, as it were, I am using a SPARC T5120 running Solaris 11.2. The actual release is not important for this, as I am using only basic VLAN features.

The initial network configuration is as follows.
The system has some other things on it that I am cutting from the output, as they are not relevant to this topic.

root@remote# dladm show-phys
LINK  MEDIA     STATE  SPEED  DUPLEX  DEVICE
net1  Ethernet  up     1000   full    e1000g1
net2  Ethernet  up     1000   full    e1000g2
net0  Ethernet  up     1000   full    e1000g0
net3  Ethernet  up     1000   full    e1000g3
root@remote# dladm show-link
LINK  CLASS  MTU   STATE  OVER
net1  phys   1500  up     --
net2  phys   1500  up     --
net0  phys   1500  up     --
net3  phys   9000  up     --
root@remote# dladm show-vlan
root@remote#

First I will create three VLANs that are configured on the switch: 111, 112, and 113.

root@remote# dladm create-vlan -l net3 -v 111 net3111
root@remote# dladm create-vlan -l net3 -v 112 net3112
root@remote# dladm create-vlan -l net3 -v 113 net3113
root@remote#
root@remote# dladm show-link
LINK     CLASS  MTU   STATE  OVER
net1     phys   1500  up     --
net2     phys   1500  up     --
net0     phys   1500  up     --
net3     phys   9000  up     --
net3111  vlan   9000  up     net3
net3112  vlan   9000  up     net3
net3113  vlan   9000  up     net3
root@remote#
root@remote# dladm show-vlan
LINK     VID  SVID  PVLAN-TYPE  FLAGS  OVER
net3111  111  --    --          -----  net3
net3112  112  --    --          -----  net3
net3113  113  --    --          -----  net3
root@remote#

If I had not set the data link name, in my case net3111 for the first one, Solaris would have used the old PPA (Physical Point of Attachment) convention that has been used in Solaris for a long time. The names would have been net111003, net112003, and net113003. Those names require more typing. I prefer naming that makes it easy to recognize both the data link the VLAN is on and the VLAN ID.

The next step is to put IP addresses on those VLANs.
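Before moving on, the two naming schemes described above can be sketched as small shell functions. This is only an illustration: the function names are hypothetical, and the PPA-style rule (VLAN ID times 1000 plus the instance number of the link) is inferred from the net111003 example rather than taken from dladm itself.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the two VLAN link-naming schemes.

# Descriptive name used in this post: <link><vid>, e.g. net3 + 111 -> net3111.
vlan_link_name() {
    printf '%s%s\n' "$1" "$2"
}

# PPA-style name: <driver><vid * 1000 + instance>, e.g. net3 + 111 -> net111003.
ppa_vlan_name() {
    link=$1
    vid=$2
    driver=$(printf '%s' "$link" | sed 's/[0-9]*$//')      # strip trailing digits: "net"
    instance=$(printf '%s' "$link" | sed 's/^.*[^0-9]//')  # keep trailing digits: "3"
    printf '%s%d\n' "$driver" $((vid * 1000 + instance))
}

# Example:
#   vlan_link_name net3 111   prints net3111
#   ppa_vlan_name  net3 111   prints net111003
```

The descriptive form keeps the link name and the VLAN ID visually separate, which is the property the post prefers.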
For the addresses, I use 192.168.VLAN.x as my subnet, and I set "x" to the host part of the IP address of the base system, in this case "1".

root@remote# ipadm create-ip net3111
root@remote# ipadm create-ip net3112
root@remote# ipadm create-ip net3113
root@remote#
root@remote# ipadm create-addr -a 192.168.111.1/24 net3111
net3111/v4
root@remote# ipadm create-addr -a 192.168.112.1/24 net3112
net3112/v4
root@remote# ipadm create-addr -a 192.168.113.1/24 net3113
net3113/v4
root@remote#
root@remote# ipadm show-addr
ADDROBJ     TYPE      STATE  ADDR
lo0/v4      static    ok     127.0.0.1/8
net0/v4     static    ok     172.16.1.1/22
net3111/v4  static    ok     192.168.111.1/24
net3112/v4  static    ok     192.168.112.1/24
net3113/v4  static    ok     192.168.113.1/24
lo0/v6      static    ok     ::1/128
net0/v6     addrconf  ok     fe80::214:4fff:feac:57c4/10
net0/v6     addrconf  ok     2606:b400:602:c080:214:4fff:feac:57c4/64
root@remote#

The remote system setup is complete.

Creating a Link Aggregation

On the SPARC T4-1 running Solaris 11.3, I will first create an aggregation and test it in the Control/Service Domain. I will use interfaces 1 and 3 on the system, since those use two different physical chips on the system motherboard.
In production, they likely are ports on two different NICs.root@cdom# dladm create-aggr -l net1 -l net3 aggr1root@cdom# root@cdom# dladm show-aggrLINK MODE POLICY ADDRPOLICY LACPACTIVITY LACPTIMERaggr1 trunk L4 auto off shortroot@cdom# root@cdom# dladm show-aggr -PLINK MODE POLICY ADDRPOLICY LACPACTIVITY LACPTIMERaggr1 trunk L4 auto off shortroot@cdom# dladm show-aggr -LLINK PORT AGGREGATABLE SYNC COLL DIST DEFAULTED EXPIREDaggr1 net1 no no no no no no-- net3 no no no no no noroot@cdom# dladm show-aggr -xLINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATEaggr1 -- 1000Mb full up 0:21:28:d2:17:f9 -- net1 1000Mb full up 0:21:28:d2:17:f9 attached net3 1000Mb full up 0:21:28:d2:17:fb attachedroot@cdom# I show different outputs of the dladm(1M) command here, and we'll see some differences later.The aggregation is on a private network on the Netgear switch, so a snoop will not show a lot of traffic. I will generate some traffic using ping, and I will be switching between the two systems for that.root@remote# ping 192.168.111.101 2no answer from 192.168.111.101root@remote# ping 192.168.112.101 2no answer from 192.168.112.101root@remote# ping 192.168.113.101 2no answer from 192.168.113.101root@remote# To keep output short, and make testing faster, I only sent two packets per IP address, since I know there is not going to be an answer. 
So what does this look like on the system with the aggregation?root@cdom# snoop -d aggr1Using device aggr1 (promiscuous mode)VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.1, 192.168.111.1 ?VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.1, 192.168.112.1 ?VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.101, 192.168.112.101 ?VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.101, 192.168.112.101 ?VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.101, 192.168.112.101 ?VLAN#113: 192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.1, 192.168.113.1 ?VLAN#113: 192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.101, 192.168.113.101 ?VLAN#113: 192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.101, 192.168.113.101 ?VLAN#113: 192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.101, 192.168.113.101 ?First notice that each line includes the VLAN ID. This is a new feature in Solaris 11, and may have been back ported to a late update of Solaris 10 (I will have to check and come back to that.)You can see the ARP requests for all three VLANs with the target address on each. This is why I like to have the VLAN ID and the subnet the same. I am beginning to notice this with some customers as well.My first test is to bring down one or both ports and see the changes in the aggregation and the network. 
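To make port changes easier to spot, the "dladm show-aggr -x" output shown earlier can be condensed to one line per aggregation. The following is a minimal sketch under stated assumptions: the function name aggr_summary is hypothetical, and the parsing assumes the column layout that appears in this post (a seven-column line for the aggregation itself, followed by six-column lines for its ports).

```shell
#!/bin/sh
# Hypothetical helper: read "dladm show-aggr -x" output on stdin and print,
# for each aggregation, its state and how many member ports are up.
# Assumed layout: LINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATE, where port
# lines leave the LINK column blank and therefore have only six fields.
aggr_summary() {
    awk '
        function emit() {
            if (aggr != "") printf "%s %s %d/%d\n", aggr, state, up, total
        }
        $1 == "LINK" { next }                   # skip the header line
        NF == 7 { emit(); aggr = $1; state = $5; up = 0; total = 0; next }
        NF == 6 { total++; if ($4 == "up") up++ }
        END { emit() }
    '
}

# Example on the Control Domain:
#   dladm show-aggr -x | aggr_summary
```

With both ports attached this would report the aggregation as up with 2/2 ports; the counts drop as ports are disabled on the switch.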
After turning off the port of net1, this is how things look.root@cdom# dladm show-aggr -xLINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATEaggr1 -- 1000Mb full up 0:21:28:d2:17:f9 -- net1 0Mb unknown down 0:21:28:d2:17:f9 standby net3 1000Mb full up 0:21:28:d2:17:fb attachedroot@cdom# root@cdom# snoop -d aggr1Using device aggr1 (promiscuous mode)VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?^Croot@cdom# The aggregation stays up, and traffic continues to come into the system. Port status changes are also in /var/adm/messages. I don't see anything going to the console, however. Messages are limited as the aggregation is not plumbed nor IP is using it, even when both ports are down.root@cdom# dladm show-aggr -xLINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATEaggr1 -- 0Mb unknown down 0:21:28:d2:17:f9 -- net1 0Mb unknown down 0:21:28:d2:17:f9 standby net3 0Mb unknown down 0:21:28:d2:17:fb standbyroot@cdom# root@cdom# tail /var/adm/messages...Apr 21 11:00:07 gravity mac: [ID 486395 kern.info] NOTICE: igb3 link downApr 21 11:05:40 gravity mac: [ID 486395 kern.info] NOTICE: igb1 link downApr 21 11:05:40 gravity mac: [ID 486395 kern.info] NOTICE: aggr1 link downroot@cdom# To better see network connectivity, I will create a VLAN and configure an IP address.root@cdom# dladm create-vlan -l aggr1 -v 111 aggr1111root@cdom# ipadm create-ip aggr1111root@cdom# ipadm create-addr -a 192.168.111.5/24 aggr1111aggr1111/v4root@cdom# root@cdom# ipadm show-addrADDROBJ TYPE STATE ADDRlo0/v4 static ok 127.0.0.1/8net0/v4 static ok 192.168.1.5/22net4/v4 static ok 169.254.182.77/24aggr1111/v4 static ok 192.168.111.5/24lo0/v6 static ok ::1/128net0/v6 addrconf ok fe80::221:28ff:fed2:17f8/10net0/v6 addrconf ok 2606:b400:602:c080:221:28ff:fed2:17f8/64root@cdom# The networks works as seen from the 
remote system.root@remote# ping 192.168.111.5 2192.168.111.5 is aliveroot@remote# I will first bring one port down, then both.root@cdom# dladm show-aggr -xLINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATEaggr1 -- 1000Mb full up 0:21:28:d2:17:f9 -- net1 1000Mb full up 0:21:28:d2:17:f9 attached net3 1000Mb full up 0:21:28:d2:17:fb attachedroot@cdom# root@cdom# dladm show-aggr -xLINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATEaggr1 -- 1000Mb full up 0:21:28:d2:17:f9 -- net1 0Mb unknown down 0:21:28:d2:17:f9 standby net3 1000Mb full up 0:21:28:d2:17:fb attachedroot@cdom# root@remote# ping 192.168.111.5 2192.168.111.5 is aliveroot@remote# root@cdom# dladm show-aggr -xLINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATEaggr1 -- 0Mb unknown down 0:21:28:d2:17:f9 -- net1 0Mb unknown down 0:21:28:d2:17:f9 standby net3 0Mb unknown down 0:21:28:d2:17:fb standbyroot@cdom# root@cdom# ipadm show-addrADDROBJ TYPE STATE ADDRlo0/v4 static ok 127.0.0.1/8net0/v4 static ok 192.168.1.5/22net4/v4 static ok 169.254.182.77/24aggr1111/v4 static inaccessible 192.168.111.5/24lo0/v6 static ok ::1/128net0/v6 addrconf ok fe80::221:28ff:fed2:17f8/10net0/v6 addrconf ok 2606:b400:602:c080:221:28ff:fed2:17f8/64root@cdom# root@remote# ping 192.168.111.5 2no answer from 192.168.111.5root@remote# With this I have created an aggregation, shown VLAN, and demonstrated what happens when one or both ports in the aggregation fail. The aggregation remains functional with one port failure and networking continues. The aggregation fails with both ports down, and the IP address shows to be inaccessible.Setting up the Virtual Switch in a Service DomainI use the terms Service Domain and Control Domain to refer to the specific function I am doing or working on. On this system, there is only one Service Domain, and it is also the Control Domain. 
The concepts and steps I am outlining here apply also to second or redundant Service Domains when a system is configured with more than one.This is an area where there are differences between Solaris 11 and Solaris 10 Service Domains. Oracle highly recommends that all Service Domains are running Solaris 11.I will be creating a virtual switch on top of the aggr1 data link while the existing VLANs are already there. If this were Solaris 10, I'd likely remove them and if I need Service Domain access to the aggregation, I would use the virtual switch.In Solaris 11, there is no need to set or modify the pvid and vid parameters on the virtual switch. If this were Solaris 10 and I wanted to get access to VLANs on the data link (in this case aggr1) I would need to set those.Let us get started on the virtual switch.root@cdom# ldm add-vsw net-dev=aggr1 primary-vsw1 primaryroot@cdom# root@cdom# ldm list-services...VSW NAME LDOM MAC NET-DEV ID DEVICE LINKPROP DEFAULT-VLAN-ID PVID VID MTU MODE INTER-VNET-LINK primary-vsw0 primary 00:14:4f:f9:b4:9f net0 0 switch@0 1 1 1500 on primary-vsw1 primary 00:14:4f:f8:44:87 aggr1 1 switch@1 1 1 1500 on ...root@cdom# root@cdom# ldm listNAME STATE FLAGS CONS VCPU MEMORY UTIL NORM UPTIMEprimary active -n-cv- UART 16 7680M 0.2% 0.2% 69d 17h 38mhost1 active -n---- 5000 8 4G 0.2% 0.2% 5d 21h 23mroot@cdom# There is an existing Guest Domain on the system, and I will use that one to demonstrate the networking. I will do this in steps, to cover a range of LDom networking items. First item is to create a new virtual network device (vnet) for the Guest Domain. 
I show before and after.guest@host1:~$ pfbash [1]guest@host1:~$ guest@host1:~$ PS1="guest-pf@host1$ "guest-pf@host1$guest-pf@host1$ dladm show-phys LINK MEDIA STATE SPEED DUPLEX DEVICEnet0 Ethernet up 0 unknown vnet0guest-pf@host1$ guest-pf@host1$ dladm show-linkLINK CLASS MTU STATE OVERnet0 phys 1500 up --guest-pf@host1$ root@cdom# ldm add-vnet linkprop=phys-state vnet1 primary-vsw1 host1root@cdom# guest-pf@host1$ dladm show-physLINK MEDIA STATE SPEED DUPLEX DEVICEnet0 Ethernet up 0 unknown vnet0net1 Ethernet unknown 0 unknown vnet1guest-pf@host1$ In the background, on the remote system, I have steady ping(1) running on all three subnets. A snoop shows no traffic (I won't bother to show "nothing" here (how do I add a smiley?) So we have successfully added a virtualized network interface where the underlying data link is an aggregation.[1] You may wonder what I did here. If you don't, skip to the next section.I am using Solaris' Role Based Access Control feature, where this user has been given a lot of privileges. I could just do an su(1) to root. Instead, I am running as the user in the profile shell version of bash. Every command is then checked for authorization. This is easier than running "pfexec command", for those familiar with sudo, "sudo command". The pfexec(1) command does not prompt for a password.Starting to Work with VLANsI keep the snoop running while I add a VLAN ID to the vnet, as a "vid", which means it will also show up in the Guest Domain with the VLAN tag.root@cdom# ldm set-vnet vid=111 vnet1 host1root@cdom# guest-pf@host1$ snoop -d net1Using device net1 (promiscuous mode)VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?^Cguest-pf@host1$ Not easy to show here, but almost immediately after running the set-vnet command snoop sees traffic on VLAN 111. Just as expected. 
The pings/ARPs on the other two VLANs are still not coming through, as those VLANs are not assigned to the vnet. Now I will add VLAN 112, also as a vid, and I will add VLAN 113 as a pvid. That means VLAN 113 traffic will come in with the VLAN tag removed.

root@cdom# ldm set-vnet vid=111,112 pvid=113 vnet1 host1
root@cdom#

guest-pf@host1$ snoop -d net1
Using device net1 (promiscuous mode)
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.6, 192.168.113.6 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.6, 192.168.113.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.6, 192.168.113.6 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
^C
guest-pf@host1$

As I expected, the VLAN 111 traffic continues, and now I also see the VLAN 112 traffic tagged (the VLAN#112: prefix) and the VLAN 113 traffic untagged. This is easy to see because of the third octet in each IP address.

Note that I can add or remove VLANs while the interface is running in the Guest Domain.

Testing Link Failure with LDoms

The next step is to show how a failure of one or both ports in the aggregation affects the Guest Domain.
I add an IP address in the Guest, and then turn off one, then both ports.guest-pf@host1$ dladm show-physLINK MEDIA STATE SPEED DUPLEX DEVICEnet0 Ethernet up 0 unknown vnet0net1 Ethernet down 0 unknown vnet1guest-pf@host1$ guest-pf@host1$ ipadm create-ip net1guest-pf@host1$ ipadm create-addr -a 192.168.113.6/24 net1net1/v4guest-pf@host1$ guest-pf@host1$ ipadm show-addrADDROBJ TYPE STATE ADDRlo0/v4 static ok 127.0.0.1/8net0/v4 static ok 192.168.1.6/22net1/v4 static ok 192.168.113.6/24lo0/v6 static ok ::1/128net0/v6 addrconf ok fe80::214:4fff:fef9:fc75/10net0/v6 addrconf ok 2606:b400:602:c080:214:4fff:fef9:fc75/64guest-pf@host1$ guest-pf@host1$ dladm show-physLINK MEDIA STATE SPEED DUPLEX DEVICEnet0 Ethernet up 0 unknown vnet0net1 Ethernet up 0 unknown vnet1guest-pf@host1$ guest-pf@host1$ snoop -d net1Using device net1 (promiscuous mode)VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?192.168.113.1 -> 192.168.113.6 ICMP Echo request (ID: 26169 Sequence number: 8405)192.168.113.6 -> 192.168.113.1 ICMP Echo reply (ID: 26169 Sequence number: 8405)VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?192.168.113.1 -> 192.168.113.6 ICMP Echo request (ID: 26169 Sequence number: 8406)192.168.113.6 -> 192.168.113.1 ICMP Echo reply (ID: 26169 Sequence number: 8406)VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?^Cguest-pf@host1$ guest-pf@host1$ After configuring the address, I can see ICMP echo and reply messages to that address. I chose to do this on VLAN 113, which is untagged, however, the same would work if I create a VLAN data link. 
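The tagging behavior seen in the snoop output above can be summarized as a small decision function. This is a hypothetical model of the observed behavior only, not the LDoms implementation; the function name vnet_delivery and its argument order are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical model of how a vnet with pvid/vid settings presents a frame
# that arrives from the virtual switch, based on the snoop output in this post.
#   $1 = VLAN ID of the frame on the wire
#   $2 = pvid of the vnet
#   $3 = comma-separated vid list of the vnet (may be empty)
vnet_delivery() {
    frame_vid=$1
    pvid=$2
    vids=$3
    if [ "$frame_vid" = "$pvid" ]; then
        echo untagged                      # pvid traffic arrives with the tag removed
        return
    fi
    case ",$vids," in
        *",$frame_vid,"*) echo tagged ;;   # vid traffic keeps its VLAN tag
        *) echo dropped ;;                 # VLAN is not assigned to this vnet
    esac
}

# Example, matching "ldm set-vnet vid=111,112 pvid=113 vnet1 host1":
#   vnet_delivery 111 113 111,112   prints tagged
#   vnet_delivery 113 113 111,112   prints untagged
```

This is why an IP address for VLAN 113 can be configured directly on net1, while VLANs 111 and 112 need a VLAN-aware data link in the guest.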
I will show the VLAN data link case in a later step.

I quickly tested the configuration with one port on the switch down, and in the guest it looks the same. Then I marked the second port down. This brings the aggregation down.

root@cdom# dladm show-aggr -x
LINK   PORT  SPEED  DUPLEX   STATE  ADDRESS           PORTSTATE
aggr1  --    0Mb    unknown  down   0:21:28:d2:17:f9  --
       net1  0Mb    unknown  down   0:21:28:d2:17:f9  standby
       net3  0Mb    unknown  down   0:21:28:d2:17:fb  standby
root@cdom#

So how does this look in the Guest Domain? You can see that it also sees the virtual network interface as down. How did that happen? I'll explain shortly.

guest-pf@host1$ dladm show-phys
LINK  MEDIA     STATE  SPEED  DUPLEX   DEVICE
net0  Ethernet  up     0      unknown  vnet0
net1  Ethernet  down   0      unknown  vnet1
guest-pf@host1$
guest-pf@host1$ ipadm show-addr
ADDROBJ  TYPE      STATE         ADDR
lo0/v4   static    ok            127.0.0.1/8
net0/v4  static    ok            192.168.1.6/22
net1/v4  static    inaccessible  192.168.113.6/24
lo0/v6   static    ok            ::1/128
net0/v6  addrconf  ok            fe80::214:4fff:fef9:fc75/10
net0/v6  addrconf  ok            2606:b400:602:c080:214:4fff:fef9:fc75/64
guest-pf@host1$
guest-pf@host1$ snoop -d net1
Using device net1 (promiscuous mode)
^C
guest-pf@host1$

You may have noticed that when I set up the vnet in the Service Domain, I used the option "linkprop=phys-state". This LDom option uses an out-of-band protocol to pass the link state of the underlying data link to the guest. Without it, because there is a virtual switch between the physical data link or aggregation and the virtual network interface (vnet), the vnet would not see a hardware failure. It can still communicate with other vnets on the same virtual switch.
This link state propagation was added to LDoms a number of years ago.

To demonstrate, I will turn linkprop off, and then look at the interface in the Guest Domain.

root@cdom# ldm set-vnet linkprop="" vnet1 host1
root@cdom#

guest-pf@host1$ dladm show-phys
LINK  MEDIA     STATE  SPEED  DUPLEX   DEVICE
net0  Ethernet  up     0      unknown  vnet0
net1  Ethernet  up     0      unknown  vnet1
guest-pf@host1$
guest-pf@host1$ ipadm show-addr
ADDROBJ  TYPE      STATE  ADDR
lo0/v4   static    ok     127.0.0.1/8
net0/v4  static    ok     192.168.1.6/22
net1/v4  static    ok     192.168.113.6/24
lo0/v6   static    ok     ::1/128
net0/v6  addrconf  ok     fe80::214:4fff:fef9:fc75/10
net0/v6  addrconf  ok     2606:b400:602:c080:214:4fff:fef9:fc75/64
guest-pf@host1$

The Guest thinks the link is working. A snoop was completely quiet. I'll turn linkprop back on, and then enable the ports again to put everything into a working state. Behind the scenes I see my ping showing success on the remote system, another validation that the network is working again.

Using Solaris Virtual Network Interfaces (VNICs) in LDoms

Most customers using LDoms are also using Solaris Zones. A key feature in Solaris 11 is network virtualization. It allows a user, or the Solaris Zones framework, to create individual virtual NICs (VNICs) for Zones, which makes consolidation much easier and makes the Zones behave more as if they were different systems with their own networking hardware.
Before moving on to Zones, I'd like to test this with a VNIC manually.Let's give it a try.guest-pf@host1$ dladm show-physLINK MEDIA STATE SPEED DUPLEX DEVICEnet0 Ethernet up 0 unknown vnet0net1 Ethernet up 0 unknown vnet1guest-pf@host1$ guest-pf@host1$ dladm show-phys -mLINK SLOT ADDRESS INUSE CLIENTnet0 primary 0:14:4f:f9:fc:75 yes net0 1 0:14:4f:fb:a1:78 no -- 2 0:14:4f:f8:f9:32 no -- 3 0:14:4f:f9:ab:37 no -- 4 0:14:4f:f8:1:93 no --net1 primary 0:14:4f:f8:3e:e5 yes net1guest-pf@host1$ guest-pf@host1$ dladm create-vnic -l net1 vnic11dladm: vnic creation failed: operation not supportedguest-pf@host1$ guest-pf@host1$ dladm create-vnic -l net0 vnic1guest-pf@host1$ guest-pf@host1$ dladm show-phys -mLINK SLOT ADDRESS INUSE CLIENTnet0 primary 0:14:4f:f9:fc:75 yes net0 1 0:14:4f:fb:a1:78 yes vnic1 2 0:14:4f:f8:f9:32 no -- 3 0:14:4f:f9:ab:37 no -- 4 0:14:4f:f8:1:93 no --net1 primary 0:14:4f:f8:3e:e5 yes net1guest-pf@host1$ Oops. Creating a VNIC on net1 failed. Why is that? Turns out each vnic needs its own MAC, since it will have its own IP address on it--this is definitely the case in a Zone. However, the underlying "physical" interface, in this case a vnet only has one MAC address. And while on an actual physical interface it is possible to add more MAC addresses, through some device driver mechanics, this is not possible on a vnet.This is also why I chose to show VNICs outside of Zones. If we had gone straight to Zone creating and start-up, this failure might be harder to track down.Several years ago LDoms added a new feature to assign additional MAC addresses to a vnet. The property is called "alt-mac-addrs". It allows a fixed number of MAC addresses to be assigned to the vnet. Unfortunately, this vnet property can not be set or changed when a Guest Domain is running. 
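Since a VNIC on a vnet needs one of the vnet's own MAC addresses, it is handy to check how many unused slots a link has before creating VNICs or booting Zones. The sketch below counts them from "dladm show-phys -m" output; the function name free_mac_slots is hypothetical, and the parsing assumes the layout shown above, where continuation lines leave the LINK column blank.

```shell
#!/bin/sh
# Hypothetical helper: read "dladm show-phys -m" output on stdin and print the
# number of unused MAC address slots for the link named in $1.
# Assumed layout: LINK SLOT ADDRESS INUSE CLIENT, with the LINK column left
# blank on continuation lines (so those lines have four fields, not five).
free_mac_slots() {
    awk -v want="$1" '
        $1 == "LINK" { next }                      # skip the header line
        NF == 5 { link = $1; inuse = $4 }          # first line for a link
        NF == 4 { inuse = $3 }                     # additional slot for the same link
        (NF == 4 || NF == 5) && link == want && inuse == "no" { n++ }
        END { print n + 0 }
    '
}

# Example in the Guest Domain:
#   dladm show-phys -m | free_mac_slots net1
```

A result of 0 for a vnet corresponds to the "operation not supported" failure seen above when creating a VNIC on net1.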
So I will shut the Guest down.

guest-pf@host1$ init 5
updating /platform/sun4v/boot_archive
guest-pf@host1$

root@cdom# ldm set-vnet alt-mac-addrs=auto,auto,auto,auto,auto,auto vnet1 host1
Please perform the operation while the LDom is bound or inactive
root@cdom#
root@cdom# ldm list
NAME     STATE   FLAGS   CONS  VCPU  MEMORY  UTIL  NORM  UPTIME
primary  active  -n-cv-  UART  16    7680M   1.3%  1.3%  69d 22h 15m
host1    active  -n----  5000  8     4G      0.1%  0.1%  3h 54m
root@cdom#
root@cdom# ldm list
NAME     STATE   FLAGS   CONS  VCPU  MEMORY  UTIL  NORM  UPTIME
primary  active  -n-cv-  UART  16    7680M   0.5%  0.5%  69d 22h 16m
host1    bound   ------  5000  8     4G
root@cdom#
root@cdom# ldm set-vnet alt-mac-addrs=auto,auto,auto,auto,auto,auto vnet1 host1
root@cdom#
root@cdom# ldm start host1
LDom host1 started
root@cdom#

I show the error message I got when I tried to change the vnet while the Guest Domain was still running. Once it was stopped, the operation was successful. You may notice that I list the word "auto" six times. I am adding six MAC addresses to the vnet, and I am allowing each MAC address to be generated automatically. If I need to keep MAC addresses across configurations, I can set them explicitly.

Once the Guest Domain is back up, I can see what things look like now.

guest-pf@host1$ dladm show-phys -m
LINK  SLOT     ADDRESS           INUSE  CLIENT
net0  primary  0:14:4f:f9:fc:75  yes    net0
      1        0:14:4f:fb:a1:78  yes    vnic1
      2        0:14:4f:f8:f9:32  no     --
      3        0:14:4f:f9:ab:37  no     --
      4        0:14:4f:f8:1:93   no     --
net1  primary  0:14:4f:f8:3e:e5  yes    net1
      1        0:14:4f:fa:a6:5e  no     --
      2        0:14:4f:f8:92:c0  no     --
      3        0:14:4f:f9:77:8c  no     --
      4        0:14:4f:fb:d8:33  no     --
      5        0:14:4f:f8:50:1   no     --
      6        0:14:4f:fa:bc:2d  no     --
guest-pf@host1$

Here you see the six MAC addresses on the second interface.
That is one reason I chose a number other than my typical four MACs.

This time the operation to create a VNIC on net1 should succeed.

guest-pf@host1$ dladm create-vnic -l net1 vnic11
guest-pf@host1$
guest-pf@host1$ dladm show-phys -m
LINK  SLOT     ADDRESS           INUSE  CLIENT
net0  primary  0:14:4f:f9:fc:75  yes    net0
      1        0:14:4f:fb:a1:78  yes    vnic1
      2        0:14:4f:f8:f9:32  no     --
      3        0:14:4f:f9:ab:37  no     --
      4        0:14:4f:f8:1:93   no     --
net1  primary  0:14:4f:f8:3e:e5  yes    net1
      1        0:14:4f:fa:a6:5e  yes    vnic11
      2        0:14:4f:f8:92:c0  no     --
      3        0:14:4f:f9:77:8c  no     --
      4        0:14:4f:fb:d8:33  no     --
      5        0:14:4f:f8:50:1   no     --
      6        0:14:4f:fa:bc:2d  no     --
guest-pf@host1$
guest-pf@host1$ dladm show-vnic
LINK    OVER  SPEED  MACADDRESS        MACADDRTYPE      IDS
vnic1   net0  0      0:14:4f:fb:a1:78  factory, slot 1  VID:0
vnic11  net1  0      0:14:4f:fa:a6:5e  factory, slot 1  VID:0
guest-pf@host1$

Success indeed. I will get rid of the VNIC on net0 to simplify the output.

guest-pf@host1$ dladm delete-vnic vnic1
guest-pf@host1$
guest-pf@host1$ dladm show-vnic
LINK    OVER  SPEED  MACADDRESS        MACADDRTYPE      IDS
vnic11  net1  0      0:14:4f:fa:a6:5e  factory, slot 1  VID:0
guest-pf@host1$

Before moving on to Zones, I want to show two things: creating an interface on a VLAN, and showing that a full aggregation failure also propagates to the VNIC.

There are two types of operations: one on VLANs, and one on VNICs.
When creating a VNIC I can specify a VLAN ID, so I can show both in a single operation.guest-pf@host1$ dladm create-vnic -l net1 -v 111 vnic1111guest-pf@host1$ guest-pf@host1$ dladm show-vnicLINK OVER SPEED MACADDRESS MACADDRTYPE IDSvnic11 net1 0 0:14:4f:fa:a6:5e factory, slot 1 VID:0vnic1111 net1 0 0:14:4f:f8:92:c0 factory, slot 2 VID:111guest-pf@host1$ dladm show-vlanguest-pf@host1$ guest-pf@host1$ ipadm create-ip vnic1111guest-pf@host1$ ipadm create-addr -a 192.168.111.6/24 vnic1111vnic1111/v4guest-pf@host1$ guest-pf@host1$ snoop -d net1Using device net1 (promiscuous mode)VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?VLAN#111: 192.168.111.1 -> 192.168.111.6 ICMP Echo request (ID: 26167 Sequence number: 13612)VLAN#111: 192.168.111.6 -> 192.168.111.1 ICMP Echo reply (ID: 26167 Sequence number: 13612)192.168.113.1 -> 192.168.113.6 ICMP Echo request (ID: 26169 Sequence number: 13601)192.168.113.6 -> 192.168.113.1 ICMP Echo reply (ID: 26169 Sequence number: 13601)VLAN#111: 192.168.111.6 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?VLAN#111: 192.168.111.1 -> 192.168.111.6 ICMP Echo request (ID: 26167 Sequence number: 13613)VLAN#111: 192.168.111.6 -> 192.168.111.1 ICMP Echo reply (ID: 26167 Sequence number: 13613)192.168.113.1 -> 192.168.113.6 ICMP Echo request (ID: 26169 Sequence number: 13602)192.168.113.6 -> 192.168.113.1 ICMP Echo reply (ID: 26169 Sequence number: 13602)VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?^Cguest-pf@host1$ Here I created a VNIC on top of net1 with VLAN ID 111. I can see those details with dladm(1M).And snoop now shows that pings are working on 192.168.113.6 and now 192.168.111.6. 
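With several VLANs visible on one link, the snoop output gets busy quickly. As a rough sketch (the function name snoop_vlan_counts is made up for this illustration, and it assumes the "VLAN#nnn:" prefix format shown in the capture above), the lines can be tallied per VLAN tag like this:

```shell
# Tally snoop output lines per VLAN tag.
# Assumes tagged lines start with a "VLAN#<id>:" field, as in the capture above;
# everything else (apart from the "Using device" banner) counts as untagged.
snoop_vlan_counts() {
    awk '
        $1 == "Using" { next }                       # skip the banner line
        $1 ~ /^VLAN#[0-9]+:$/ {
            tag = $1
            sub(/^VLAN#/, "", tag)                   # drop the prefix
            sub(/:$/, "", tag)                       # drop the trailing colon
            count[tag]++
            next
        }
        NF > 0 { count["untagged"]++ }
        END { for (t in count) print t, count[t] }
    ' | LC_ALL=C sort
}

# Possible usage on the Guest Domain, saving a capture to a file first:
#   snoop -d net1 > /tmp/net1.snoop     (stop with ^C)
#   snoop_vlan_counts < /tmp/net1.snoop
```

In the capture above this would report traffic on VLANs 111 and 112 plus the untagged frames, which in this setup belong to the 192.168.113.0/24 network.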
Now I will disable both interfaces on the switch.

guest-pf@host1$ dladm show-phys
LINK              MEDIA                STATE      SPEED  DUPLEX    DEVICE
net0              Ethernet             up         0      unknown   vnet0
net1              Ethernet             down       0      unknown   vnet1
guest-pf@host1$ 
guest-pf@host1$ dladm show-link
LINK                CLASS     MTU    STATE    OVER
net0                phys      1500   up       --
net1                phys      1500   up       --
vnic11              vnic      1500   up       net1
vnic1111            vnic      1500   down     net1
guest-pf@host1$ 
guest-pf@host1$ ipadm show-addr
ADDROBJ           TYPE     STATE        ADDR
lo0/v4            static   ok           127.0.0.1/8
net0/v4           static   ok           192.168.1.6/22
net1/v4           static   ok           192.168.113.6/24
vnic1111/v4       static   inaccessible 192.168.111.6/24
lo0/v6            static   ok           ::1/128
net0/v6           addrconf ok           fe80::214:4fff:fef9:fc75/10
net0/v6           addrconf ok           2606:b400:602:c080:214:4fff:fef9:fc75/64
guest-pf@host1$ 
guest-pf@host1$ snoop -d net1
Using device net1 (promiscuous mode)
^C
guest-pf@host1$ 

I am a bit stumped. The VNIC vnic1111 on net1 shows as down; the base interface, however, does not. I see that both at the data link layer with dladm and at the IP layer with ipadm. I thought this might be a bug; however, Solaris network engineering says this is expected behavior when only one VNIC is up. The VNICs can still be used to communicate with each other, even though the underlying data link is down, as would be the case with any switch where the uplink is down: hosts can still communicate.

Note: I may come back to this later and update details.

Let us move on to Zones.

Using the LDoms and Solaris Zones Network Virtualization Features Together

Now I would like to combine all the features into creating a Zone. The Link Aggregation is being handled by the Service Domain. This is really convenient, as all LDoms and Zones will benefit from the increased availability of the aggregation. And since each VNIC has its own MAC address, inbound traffic that is hashed at Layer 2 may still have its load spread across the member links in the aggregation.
Solaris' load spreading is at L4, using TCP or UDP headers, so it is already likely to spread.I will not focus on the mechanics of creating a Solaris Zone here. Others and I have done that elsewhere. However, the network details of the Zone configuration are important to highlight.guest-pf@host1$ zonecfg -z myzone info anetanet:linkname: net0lower-link: net1allowed-address not specifiedconfigure-allowed-address: truedefrouter not specifiedallowed-dhcp-cids not specifiedlink-protection: mac-nospoofmac-address: automac-prefix not specifiedmac-slot not specifiedvlan-id not specifiedpriority not specifiedrxrings not specifiedtxrings not specifiedmtu not specifiedmaxbw not specifiedbwshare not specifiedrxfanout not specifiedvsi-typeid not specifiedvsi-vers not specifiedvsi-mgrid not specifiedetsbw-lcl not specifiedcos not specifiedpkey not specifiedlinkmode not specifiedevs not specifiedvport not specifiedanet:linkname: net1lower-link: net1...vlan-id: 111...anet:linkname: net2lower-link: net1...vlan-id: 112...guest-pf@host1$ Each network section is started with "anet" for Automated Network. This feature in Solaris 11 will create a VNIC for each entry when the Zone boots, and will remove it when the Zone halts. This simplifies Zone networks and limits the privileges an administrator needs to those for Zone Configuration. The user "guest" has those privileges.The link "net0" had the defaults, and is using the net1 interface. Since "vlan-id" is not specified, it will use the untagged inteface, or VLAN 113.The other two interfaces, net1 and net2 will use VLAN IDs 111 and 112, respectively.Because I did not give guest all Zone privileges, I perform a few operations here as root. 
User guest can start and stop Zones, and also log into the Zone.guest-pf@host1$ suPassword: root@host1:~# root@host1:~# zonecfg -z myzone -f myzone.cfg UX: /usr/sbin/usermod: guest is currently logged in, some changes may not take effect until next login.root@host1:~# root@host1:~# zoneadm -z myzone install -c myzone.xml The following ZFS file system(s) have been created: rpool/zones rpool/zones/myzoneProgress being logged to /var/log/zones/zoneadm.20160421T225323Z.myzone.install Image: Preparing at /zones/myzone/root. Install Log: /system/volatile/install.1585/install_log AI Manifest: /tmp/manifest.xml.P6aOed SC Profile: /export/home/guest/myzone.xml Zonename: myzoneInstallation: Starting ... Creating IPS imageStartup linked: 1/1 done Installing packages from: solaris origin: http://172.16.1.1/DOWNLOAD PKGS FILES XFER (MB) SPEEDCompleted 279/279 48306/48306 354.4/354.4 1.6M/sPHASE ITEMSInstalling new actions 66017/66017Updating package state database Done Updating package cache 0/0 Updating image state Done Creating fast lookup database Done Updating package cache 1/1 Installation: Succeeded Note: Man pages can be obtained by installing pkg:/system/manual done. Done: Installation completed in 431.151 seconds. 
Next Steps: Boot the zone, then log into the zone console (zlogin -C) to complete the configuration process.Log saved in non-global zone as /zones/myzone/root/var/log/zones/zoneadm.20160421T225323Z.myzone.installroot@host1:~#root@host1:~# exitexitguest-pf@host1$ guest-pf@host1$ zoneadm -z myzone bootguest-pf@host1$ I save myself a few steps with a System Configuration File that sets the hostname, IP addresses, and the like, so I am not prompted for that information the first time it boots.guest-pf@host1$ zlogin myzone[Connected to zone 'myzone' pts/2]Last login: Thu Apr 21 19:15:02 2016 on pts/2Oracle CorporationSunOS 5.1111.3February 2016root@myzone:~# root@myzone:~# dladm show-physroot@myzone:~# dladm show-linkLINK CLASS MTU STATE OVERnet2 vnic 1500 up ?net1 vnic 1500 up ?net0 vnic 1500 up ?root@myzone:~# root@myzone:~# ipadm show-addrADDROBJ TYPE STATE ADDRlo0/v4 static ok 127.0.0.1/8net0/v4 static ok 192.168.113.16/24lo0/v6 static ok ::1/128net0/v6 addrconf ok fe80::214:4fff:fef8:5001/10root@myzone:~# root@myzone:~# ping 192.168.113.1192.168.113.1 is aliveroot@myzone:~# root@myzone:~# snoop -d net0Using device net0 (promiscuous mode)^Croot@myzone:~# root@myzone:~# root@myzone:~# snoop -d net1Using device net1 (promiscuous mode)192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.2, 192.168.111.2 ?192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.2, 192.168.111.2 ?^Croot@myzone:~# root@myzone:~# snoop -d net2Using device net2 (promiscuous mode)192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?^Croot@myzone:~# While the 113 VLAN on net0 is relatively quiet (all ping attempts are being met and thus no broadcasting is going on), there is traffic visible on VLANs 111 and 112. What you may note here is that the VNICs are bringing data into the Zone without the VLAN tags. 
At this time only one VLAN ID can be set for a VNIC, and thus there is no need to bring in the tag, and it actually hides some network details and complexity from the Zone.

I think the final item I want to show is the link failure as seen in the Zone.

root@myzone:~# dladm show-link
LINK                CLASS     MTU    STATE    OVER
net2                vnic      1500   down     ?
net1                vnic      1500   up       ?
net0                vnic      1500   up       ?

Again, not all VNICs are showing they are down. What does it look like in the Global Zone?

guest-pf@host1$ dladm show-link
LINK                CLASS     MTU    STATE    OVER
net0                phys      1500   up       --
net1                phys      1500   up       --
vnic11              vnic      1500   up       net1
vnic1111            vnic      1500   up       net1
myzone/net2         vnic      1500   down     net1
myzone/net1         vnic      1500   up       net1
myzone/net0         vnic      1500   up       net1
guest-pf@host1$ 

Also here, only one VNIC is showing the link is down. Here you can see another benefit of using the anet feature: each VNIC of the Zone is identified by the Zone name as a prefix.

Wrapping Things Up

So we have gone over the following items:

Creating an aggregation in Solaris 11
Creating a VLAN on an aggregation
Showing what happens when link(s) fail
Creating an LDom virtual switch in a Solaris 11 Service Domain
Adding a virtual network (vnet) interface to an LDom Guest Domain
Configuring and testing VLANs on the vnet
Demonstrating link failure propagation with an LDom vnet
Creating a Solaris 11 VNIC in a Guest Domain
Showing how Zones use VNICs and VLANs

Wow, that was a lot of territory. I thought it took a while.

I hope it is useful for you!

Regards,
Steffen

Appreciations

Thanks to Nicolas Droux for a quick reply to my question on the VNIC behavior when the link is down, and his ongoing internal answers to my deeper Solaris networking questions.

Thanks to Jeff Savit for a quick review and editorial suggestions. He and I discussed the need for this topic several times.

Revision History
(Other than minor typographical changes)
2016.04.22: Posted
2016.04.21: Created
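Since every VNIC and every Zone anet in this entry consumes one of the alternate MAC addresses on the vnet, it can help to check how many slots are still free before adding more. The following is only a sketch: the function name free_mac_slots is hypothetical, and it assumes the column layout of the dladm show-phys -m listings shown earlier in this entry.

```shell
# Count unused MAC address slots per link.
# Reads the output of "dladm show-phys -m" on standard input.
free_mac_slots() {
    awk '
        $1 == "LINK" { next }                  # skip the header row
        NF == 5 { link = $1; inuse = $4 }      # first row of a link: LINK SLOT ADDRESS INUSE CLIENT
        NF == 4 { inuse = $3 }                 # continuation row: SLOT ADDRESS INUSE CLIENT
        NF == 4 || NF == 5 {
            if (!(link in free)) free[link] = 0
            if (inuse == "no") free[link]++
        }
        END { for (l in free) print l, free[l] }
    ' | LC_ALL=C sort
}

# Possible usage in the Guest Domain:
#   dladm show-phys -m | free_mac_slots
```

If a link reports zero free slots, the vnet needs more alternate MAC addresses from the Control Domain, which, as shown earlier, requires the Guest Domain to be bound or inactive.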


Solaris

Configuring Secure NFS in Solaris 11

Overview

Recently a customer asked for suggestions on how to transfer files securely, meaning the data is encrypted while on the wire. Several options came to mind:

scp
IPsec
Secure NFS

Using scp(1) seems the most straightforward since ssh/scp/sftp is installed in almost all Solaris 11.2 Package Groups (link here.) The only downside is that scp is run from the command line or a script.

The benefit of IPsec is that all traffic can be encrypted. However, it requires either manual keying or an IKE v2 infrastructure.

Secure NFS is an extension of typical NFS setups, so it should be rather simple, if remote file access works for the customer.

After meeting with the customer, it turned out that the key application that needs secure file transfer only supports ftp(1) (and not sftp or even ftps) or NFS. Thus looking closer at Secure NFS seemed the logical path. Using IPsec to only secure NFS traffic looked to be more work than necessary.

My setup uses Solaris Zones for a number of reasons.

I can do this on a single system.
They are easy to build, manage, and delete.
I can see traffic between Zones easily, even if they are on the same system.
And the customer is already using Zones, so this will show that they can do this using Zones as well.

The steps that follow are done on a system running Solaris 11.2. For convenience, I will refer to it as Solaris or Solaris 11. Due to possible packaging differences, Solaris 11 11/11 or Solaris 11.1 might require some changes.
I have not tested these steps with either.

Building a Secure NFS configuration consists of the following steps:

NTP to ensure all nodes' clocks are in sync
DNS (optional if you have a DNS setup you can access and customize, if necessary)
KDC to install the Kerberos Key Distribution Center
Kerberos client
NFS server also as Kerberos client
KDC as NFS server to show a single node as both KDC and NFS server (optional)

To make this easier to follow, and to write, I am breaking this into several different steps.

Step O, as in Optional--NTP and DNS
Step 1--Setting Up the Kerberos KDC
Step 2--First Kerberos Client--NFS Server
Step 3--The Secure NFS Client
Step 4--Combining the KDC and NFS Server

Click on either Step O or Step 1 to get started!

When you are done, come back here for the comments below, if you wish.

Wrapping Things Up

This pretty much does it. In summary, we have done the following:

Created and configured a DNS server since I don't have one available. This might be optional for you.
Created a Kerberos Key Distribution Center (KDC) for all clients and servers to use.
Built an NFS server that is a Kerberos client, and added a share that requires Kerberos privacy through encryption.
Added an NFS client and verified that the data passed over the network is encrypted.
Combined the KDC and NFS server onto a single "node", showing that a KDC can be a client of itself.

Some things this entry does not cover and readers may try on their own include:

Redundant slave KDCs
Using an existing Kerberos service not provided by a Solaris KDC
Working with Microsoft Active Directory
Integrating this with a ZFS Storage Appliance

I hope this will be helpful to others!

References and Notes

Useful Links

Managing Kerberos and Other Authentication Services in Oracle® Solaris 11.2
Working With Oracle® Solaris 11.2 Directory and Naming Services: DNS and NIS

Issues

There is currently (I noticed this in Solaris 11.2 SRU 12, and it exists in earlier releases and SRUs) the situation that kadmin will restart too quickly on
Solaris startup and go into maintenance mode. At this time there is no fix. I created a script in /etc/rc3.d to clear the error with svcadm(1M) after about 30 seconds.

I address this with a legacy run script started by SMF when going to multiuser-server. If this were a more permanent requirement I would create an SMF service.

root@kdc1:/etc/rc3.d# ls
kadmin.sh  README     S99kadmin
root@kdc1:/etc/rc3.d#
root@kdc1:/etc/rc3.d# cat S99kadmin
#!/bin/sh
/etc/rc3.d/kadmin.sh &
root@kdc1:/etc/rc3.d#
root@kdc1:/etc/rc3.d# cat kadmin.sh
#!/bin/sh
sleep 30
svcadm clear kadmin
logger "cleared kadmin"
root@kdc1:/etc/rc3.d#
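Short of writing a full SMF service, the workaround script could be made a little more careful by clearing the service only when it really is in maintenance. This is a sketch under assumptions rather than the script used above: the function name clear_if_maintenance is hypothetical, the service is referred to by the same abbreviated name "kadmin" as in the original script, and svcs -H -o state is used to read the current state.

```shell
#!/bin/sh
# Clear an SMF service only if it is in the maintenance state.
#   $1  service name or FMRI (for example: kadmin)
#   $2  seconds to wait before checking (defaults to 30)
clear_if_maintenance() {
    fmri=$1
    delay=${2:-30}

    sleep "$delay"

    # -H suppresses the header, -o state prints only the state column
    state=$(svcs -H -o state "$fmri" 2>/dev/null)

    if [ "$state" = "maintenance" ]; then
        svcadm clear "$fmri" && logger "cleared $fmri"
    fi
}

# Possible call from the rc script, run in the background as before:
#   clear_if_maintenance kadmin 30 &
```

With this variant nothing is logged when kadmin came up cleanly, so a "cleared kadmin" entry in the system log indicates that the restart problem actually occurred on that boot.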


Solaris

Secure NFS: Step O, as in Optional--NTP and DNS

Optional Network Time Protocol and Domain Name System Setup for Kerberos

Kerberos requires in-sync system time across all systems utilizing the service. Solaris Kerberos also requires direct access to DNS, as it does not use the local name service switch to select host name resolution. Thus I start with the steps to set up NTP and DNS, should you need either or both.

NTP

Since my setup is using Solaris Zones on a single system, they share the Global Zone's clock, and thus all the Zones' times are in sync. When using Kerberos across multiple systems, it is suggested to keep clock skew at a minimum. You may be doing this already for other reasons. If not, here is a simple Network Time Protocol configuration. Your routers may be valid NTP servers.

I add several server references in /etc/inet/ntp.conf, which I base off of the provided /etc/inet/ntp.client file.

global# diff /etc/inet/ntp.conf /etc/inet/ntp.client
49,53d48
< server 0.us.pool.ntp.org iburst
< server 1.us.pool.ntp.org iburst
< server 2.us.pool.ntp.org iburst
< server 3.us.pool.ntp.org iburst
global#

Replace the "x.us.pool.ntp.org" with your NTP servers' IP addresses or hostnames.

DNS

DNS infrastructure is required for Kerberos. Solaris' Kerberos is compiled to use DNS to do hostname lookups. See Kerberos, DNS, and the Naming Service.

If you have DNS servers you can update or even just reference for the nodes you need, please use them. If you don't have that or don't want to use them, here are steps to set up your own DNS service. This will include a single DNS server.
A more highly available DNS setup is out of the scope of this entry.

Create the DNS server Solaris Zone

My Zone configuration file is as follows.

global# cat dns.cfg
create -b
set brand=solaris
set zonepath=/zones/dns
set autoboot=false
set autoshutdown=shutdown
set ip-type=exclusive
add anet
set linkname=net0
set lower-link=net1
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
set vlan-id=17
end
add anet
set linkname=net1
set lower-link=net0
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
end
add admin
set user=steffen
set auths=login,manage,config
end
global#

The Zone has two network interfaces. The first (linkname=net0) is on VLAN ID 17 and is for this Secure NFS setup. The second network interface (linkname=net1) ties into my local network, and also my local DNS server (my broadband router at home, or my office network's DNS server, which I can't get modified for my hostnames).

I also set the Zone up so that I can administer it without becoming root, though all the examples here are as root.

I configure the zone using the dns.cfg configuration file as input.

global# zonecfg -z dns -f dns.cfg
UX: /usr/sbin/usermod: steffen is currently logged in, some changes may not take effect until next login.
global#

Then to speed things up I clone the Zone from a "master" zone I created in advance. On my system a clone takes less than 20 seconds, while an install, with a local IPS repository, takes about 90 seconds.
Your times will vary based on your system, type of storage, and the network connection to the IPS repository you use.

global# zoneadm -z dns clone -c dns_profile.xml kdcmaster
The following ZFS file system(s) have been created:
    pool1/zones/dns
Progress being logged to /var/log/zones/zoneadm.20150901T012022Z.dns.clone
Log saved in non-global zone as /zones/dns/root/var/log/zones/zoneadm.20150901T012022Z.dns.clone
global#

Let's boot the Zone.

global# zoneadm -z dns boot
global#

Once the Zone is up and running, I like to create a new boot environment, so that if I have to revert the changes I made, I can just reboot into the original boot environment of the new Zone. While creating a new Zone is fast, this saves some work, and it is also convenient later on to test additional changes.

global# zlogin dns
[Connected to zone 'dns' pts/8]
Oracle Corporation      SunOS 5.11      11.2    July 2015
root@dns:~#
root@dns:~# beadm create dns
root@dns:~# beadm activate dns
root@dns:~# reboot
[Connection to zone 'dns' pts/8 closed]
global#

Install the DNS server in the Solaris Zone

The DNS server package service/network/dns/bind is not installed by default, so we have to install it.
We can verify it is not there by testing for the service.global# zlogin dns[Connected to zone 'dns' pts/8]Oracle CorporationSunOS 5.1111.2July 2015root@dns:~#root@dns:~# svcs *dns*STATE STIME FMRIdisabled 21:26:25 svc:/network/dns/multicast:defaultonline 21:26:29 svc:/network/dns/client:defaultroot@dns:~#root@dns:~# pkg install pkg:/service/network/dns/bind Packages to install: 1 Services to change: 1 Create boot environment: NoCreate backup boot environment: NoDOWNLOAD PKGS FILES XFER (MB) SPEEDCompleted 1/1 38/38 1.4/1.4 9.2M/sPHASE ITEMSInstalling new actions 74/74Updating package state database DoneUpdating package cache 0/0Updating image state DoneCreating fast lookup database DoneUpdating package cache 2/2root@dns:~#root@dns:~# svcs *dns*STATE STIME FMRIdisabled 21:26:25 svc:/network/dns/multicast:defaultdisabled 21:27:17 svc:/network/dns/server:defaultonline 21:26:29 svc:/network/dns/client:defaultroot@dns:~#Configured the DNS serverWith the DNS server package installed, it is time to create a basic DNS server configuration. I am using network 172.17.0.0/22 for some historical reasons. You can adjust to meet your own preferences or local requirements.Some preliminary work for my configuration. My Zone configuration, if you remember, has two networks. The syconfig profile configured net0 for my private network. I still need to configure net1 on my standard network. 
I will use DHCP to get an address.root@dns:~# dladm show-linkLINK CLASS MTU STATE OVERnet0 vnic 1500 up ?net1 vnic 1500 up ?root@dns:~#root@dns:~# ipadm show-addrADDROBJ TYPE STATE ADDRlo0/v4 static ok 127.0.0.1/8net0/v4 static ok 172.17.0.250/22lo0/v6 static ok ::1/128net0/v6 addrconf ok fe80::8:20ff:fe90:a16e/10root@dns:~#root@dns:~# ipadm create-ip net1root@dns:~#root@dns:~# ipadm create-addr -T dhcp net1net1/v4root@dns:~#root@dns:~# ipadm show-addrADDROBJ TYPE STATE ADDRlo0/v4 static ok 127.0.0.1/8net0/v4 static ok 172.17.0.250/22net1/v4 dhcp ok 192.168.1.112/24lo0/v6 static ok ::1/128net0/v6 addrconf ok fe80::8:20ff:fe90:a16e/10root@dns:~#It is time to create the master DNS file in /etc/named.conf. Some items of note include:My two subnets, 172.17.0.0/22 and 192.168.1.0/24I have ACLs to allow access from my two subnetsI set a forward to my local DNS server (my local router or my office network's DNS servers.)I listen on the two networks listed in the ipadm output above.This is set up for additional slave DNS servers, though I will not be showing the setup of that here.Here is my final /etc/named.conf file.root@dns:~# cat /etc/named.conf//// sample BIND configuration file// taken from http://www.madboa.com/geek/soho-bind///// Added acl per DNS setup at// https://www.digitalocean.com/community/tutorials/how-to-configure-bind-as-a-caching-or-forwarding-dns-server-on-ubuntu-14-04//acl goodclients { 172.17.0.0/22; 192.168.1.0/24; localhost;};options { // tell named where to find files mentioned below directory "/var/named"; // on a multi-homed host, you might want to tell named // to listen for queries only on certain interfaceslisten-on { 127.0.0.1; 172.17.0.250/22; 192.168.1.112/24; }; allow-query { goodclients; };forwarders { 192.168.1.1; };};// The single dot (.) is the root of all DNS namespace, so// this zone tells named where to start looking for any// name on the Internetzone "." 
IN { // a hint type means that we've got to look elsewhere // for authoritative information type hint; file "named.root";};// Where the localhost hostname is definedzone "localhost" IN { // a master type means that this server needn't look // anywhere else for information; the localhost buck // stops here. type master; file "zone.localhost"; // don't allow dynamic DNS clients to update info // about the localhost zone allow-update { none; };};// Where the 127.0.0.0 network is definedzone "0.0.127.in-addr.arpa" IN { type master; file "revp.127.0.0"; allow-update { none; };};zone "steffentw.com" IN { // this is the authoritative server for // steffentw.com info type master; file "zone.com.steffentw";also-notify { 172.17.0.251; 172.17.0.252; };};zone "0.17.172.in-addr.arpa" { // this is the authoritative server for // the 172.17.0.0/22 network type master; file "revp.172.17.0.0"; also-notify { 172.17.0.251; 172.17.0.252; };};root@dns:~#Now I have to create or update the files pointed to be /etc/named.conf with my local hostnames.root@dns:~# cd /var/namedroot@dns:/var/named# lsnamed.root revp.172.17.0.0 zone.localhostrevp.127.0.0 zone.com.steffentwroot@dns:/var/named#root@dns:/var/named# cat zone.com.steffentw;; dns zone for for steffentw.com;; 20150827Hide _nfsv4idmapdomain to test domainname(1M) response; 20150824Removed CNAME for kdc to see if this is required;$ORIGIN steffentw.com.$TTL 1M; set to 1M for testing, was 1D; any time you make a change to the domain, bump the; "serial" setting below. 
the format is easy:; YYYYMMDDI, with the I being an iterator in case you; make more than one change during any one day@IN SOA dns hostmaster (201508311 ; serial8H ; refresh4M ; retry1H ; expire1D ; minimum); dns.steffentw.com serves this domain as both the; name server (NS) and mail exchange (MX)NSdnsMX10 dns; define domain functions with CNAMEsdepot CNAME dnswww CNAME dns; for NFSv4 (2015.08.12);_nfsv4idmapdomainIN TXT"steffentw.com"; just in case someone asks for localhost.steffentw.comlocalhostA127.0.0.1;;172.17.0.0/22 Infrastructure Administration Network;host1A172.17.0.101host2A172.17.0.102host3A172.17.0.103host4A172.17.0.104host5A172.17.0.105host6A172.17.0.106host7A172.17.0.107host8A172.17.0.108host9A172.17.0.109zfs1A172.17.0.201zfs2A172.17.0.202zfs3A172.17.0.203dnsA172.17.0.250kdc1A172.17.0.251kdc2A172.17.0.252kdc3A172.17.0.253root@dns:/var/named#root@dns:/var/named# cat revp.172.17.0.0;; reverse pointers for 172.17.0.0 subnet;$ORIGIN 0.16.172.in-addr.arpa.$TTL 1D@IN SOA dns.steffentw.com. hostmaster.steffentw.com. (201508311 ; serial28800 ; refresh (8 hours)14400 ; retry (4 hours)2419200 ; expire (4 weeks)86400 ; minimum (1 day)); define the authoritative name serverNSdns.steffentw.com.;NSdns1.steffentw.com.;NSdns2.steffentw.com.;; 172.17.0.0/22 Infrastructure Administration Network;101PTRhost1.steffentw.com.102PTRhost2.steffentw.com.103PTRhost3.steffentw.com.104PTRhost4.steffentw.com.105PTRhost5.steffentw.com.106PTRhost6.steffentw.com.107PTRhost7.steffentw.com.108PTRhost8.steffentw.com.109PTRhost9.steffentw.com.;201PTRzfs1.steffentw.com.202PTRzfs2.steffentw.com.203PTRzfs3.steffentw.com.;250PTRdns.steffentw.com.251PTRkdc1.steffentw.com.252PTRkdc2.steffentw.com.253PTRkdc3.steffentw.com.root@dns:/var/named#With those files created it is time to enable the DNS server. 
Keep an eye out on the console of the Zone in case you have errors.root@dns:/var/named# svcs *dns*STATE STIME FMRIdisabled 21:26:25 svc:/network/dns/multicast:defaultdisabled 21:27:17 svc:/network/dns/server:defaultonline 21:26:29 svc:/network/dns/client:defaultroot@dns:/var/named#root@dns:/var/named# svcadm enable dns/serverroot@dns:/var/named#root@dns:/var/named# svcs *dns*STATE STIME FMRIdisabled 21:26:25 svc:/network/dns/multicast:defaultonline 21:26:29 svc:/network/dns/client:defaultonline 21:44:31 svc:/network/dns/server:defaultroot@dns:/var/named#Test the DNS serverLet us see if DNS really works.root@dns:~# getent hosts kdc1172.17.0.251kdc1.steffentw.comroot@dns:~# getent hosts host1172.17.0.101host1.steffentw.comroot@dns:~#A quick test to see if this Zone can do a DNS lookup for an external name.root@dns:~# nslookup www.oracle.comServer:172.17.0.250Address:172.17.0.250#53Non-authoritative answer:www.oracle.comcanonical name = www.oracle.com.edgekey.net.www.oracle.com.edgekey.netcanonical name = e7075.x.akamaiedge.net.Name:e7075.x.akamaiedge.netAddress: 23.66.214.140root@dns:~#root@dns:~# getent hosts www.oracle.com23.66.214.140e7075.x.akamaiedge.net www.oracle.com www.oracle.com.edgekey.netroot@dns:~#Summary and Next StepWith NTP and DNS working, the next step is to build the Key Distribution Server. Either go to KDC setup or back to the introduction.
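One detail worth double-checking in a setup like this is that the $ORIGIN in a reverse zone file matches the zone name declared in /etc/named.conf. In the listings above, named.conf declares "0.17.172.in-addr.arpa" while revp.172.17.0.0 shows $ORIGIN 0.16.172.in-addr.arpa., which may simply be a transcription slip, and the tests above only exercise forward lookups. The following sketch derives the expected reverse zone name from a network prefix and extracts the $ORIGIN from a zone file; both function names are made up for this illustration.

```shell
# Print the in-addr.arpa zone name for a dotted network prefix.
#   reverse_zone 172.17.0   prints   0.17.172.in-addr.arpa
reverse_zone() {
    echo "$1" | awk -F. '{
        out = ""
        for (i = NF; i >= 1; i--) out = out $i "."   # reverse the octets
        print out "in-addr.arpa"
    }'
}

# Print the first $ORIGIN found in a zone file read from standard input,
# without the trailing dot.
zone_origin() {
    awk '$1 == "$ORIGIN" { sub(/\.$/, "", $2); print $2; exit }'
}

# Possible check on the DNS server Zone:
#   [ "$(zone_origin < /var/named/revp.172.17.0.0)" = "$(reverse_zone 172.17.0)" ] \
#       || echo "reverse zone origin does not match"
```

A reverse lookup of one of the addresses, for example for 172.17.0.251, would be another quick way to confirm that the reverse zone loaded.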


Solaris

Secure NFS: Step 1--Setting Up the Kerberos KDC

pre { display: block; font-family: monospace; white-space: pre; margin: 1em 0; background: lightgreen;} Kerberos KDCWith DNS set up, the next service to configure is the Key Distribution Center. It will need to access DNS services.Creating the KDC ZoneThe Zone configuration is similar to the DNS server, with the interface using VLAN ID 17 in my setup.global# cat kdc1.cfgcreate -bset brand=solarisset zonepath=/zones/kdc1set autoboot=falseset autoshutdown=shutdownset ip-type=exclusiveadd anetset linkname=net0set lower-link=net1set configure-allowed-address=trueset link-protection=mac-nospoofset mac-address=randomset vlan-id=17endadd adminset user=steffenset auths=login,manage,configendglobal#Since the KDC must use DNS, lets put that into the sysconfig profile.global# more kdc1_profile.xml... <service version="1" type="service" name="network/install"> <instance enabled="true" name="default"> <property_group type="application" name="install_ipv6_interface"> <propval type="astring" name="stateful" value="yes"/> <propval type="astring" name="address_type" value="addrconf"/> <propval type="astring" name="name" value="net0/v6"/> <propval type="astring" name="stateless" value="yes"/> </property_group> <property_group type="application" name="install_ipv4_interface"> <propval type="net_address_v4" name="static_address" value="172.17.0.251 /24"/> <propval type="astring" name="name" value="net0/v4"/> <propval type="astring" name="address_type" value="static"/> </property_group> </instance> </service> <service version="1" type="service" name="network/physical"> <instance enabled="true" name="default"> <property_group type="application" name="netcfg"> <propval type="astring" name="active_ncp" value="DefaultFixed"/> </property_group> </instance> </service> <service version="1" type="service" name="system/name-service/switch"> <property_group type="application" name="config"> <propval type="astring" name="default" value="files"/> <propval type="astring" name="host" value="files 
dns"/> </property_group> <instance enabled="true" name="default"/> </service>
<service version="1" type="service" name="network/dns/client">
<property_group type="application" name="config">
<property type="net_address" name="nameserver">
<net_address_list> <value_node value="172.17.0.250"/> </net_address_list>
</property>
<property type="astring" name="search">
<astring_list> <value_node value="steffentw.com"/> </astring_list>
</property>
</property_group>
<instance enabled="true" name="default"/>
</service>
...
global#

Configure and clone the KDC Zone.

global# zonecfg -z kdc1 -f kdc1.cfg
UX: /usr/sbin/usermod: steffen is currently logged in, some changes may not take effect until next login.
global#
global#
global# zoneadm -z kdc1 clone -c kdc1_profile.xml kdcmaster
The following ZFS file system(s) have been created:
pool1/zones/kdc1
Progress being logged to /var/log/zones/zoneadm.20150901T204046Z.kdc1.clone
Log saved in non-global zone as /zones/kdc1/root/var/log/zones/zoneadm.20150901T204046Z.kdc1.clone
global#
global# zoneadm -z kdc1 boot
global#

After logging into the KDC Zone, first verify that DNS is configured properly.

global#
global# zlogin kdc1
[Connected to zone 'kdc1' pts/8]
Oracle Corporation SunOS 5.11 11.2 July 2015
root@kdc1:~#
root@kdc1:~# getent hosts host1
172.17.0.101 host1.steffentw.com
root@kdc1:~#

Installing the Kerberos Server Software

The necessary KDC package is not installed by default.

root@kdc1:~# svcs *krb5* ; svcs *kerb*
STATE STIME FMRI
STATE STIME FMRI
disabled 16:41:20 svc:/system/kerberos/install:default
root@kdc1:~#

Again I prefer to create an alternate boot environment. This time I will do it as part of the package installation.

root@kdc1:~# pkg install --be-name kdc system/security/kerberos-5
Packages to install: 1
Create boot environment: Yes
Create backup boot environment: No
DOWNLOAD PKGS FILES XFER (MB) SPEED
Completed 1/1 41/41 0.7/0.7 27.9M/s
PHASE ITEMS
Installing new actions 90/90
Updating package state database Done
Updating package cache 0/0
Updating image state Done
Creating fast lookup database Done
Updating package cache 2/2
A clone of solaris-0 exists and has been updated and activated.
On the next boot the Boot Environment kdc will be
mounted on '/'. Reboot when ready to switch to this updated BE.
Updating package cache 2/2
root@kdc1:~#

A quick check on the BE, and then boot into it.

root@kdc1:~# beadm list
BE Flags Mountpoint Space Policy Created
-- ----- ---------- ----- ------ -------
kdc R - 95.45M static 2015-09-01 16:47
solaris-0 N / 6.29M static 2015-09-01 16:40
root@kdc1:~#
root@kdc1:~# reboot
[Connection to zone 'kdc1' pts/8 closed]
global#

First let's confirm the necessary services are there.

global# zlogin kdc1
[Connected to zone 'kdc1' pts/8]
Oracle Corporation SunOS 5.11 11.2 July 2015
root@kdc1:~#
root@kdc1:~# svcs *krb5* ; svcs *kerb*
STATE STIME FMRI
disabled 16:48:22 svc:/network/security/krb5_prop:default
disabled 16:48:22 svc:/network/security/krb5kdc:default
STATE STIME FMRI
disabled 16:48:21 svc:/system/kerberos/install:default
root@kdc1:~#

Configuring the KDC

The first configuration step is to modify two files. I copy each file first, both as a backup and so I can compare the new version to the original.

root@kdc1:~# cd /etc/krb5/
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# cp -p kdc.conf kdc.conf.orig
root@kdc1:/etc/krb5# cp -p krb5.conf krb5.conf.orig
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# vi kdc.conf
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# cat kdc.conf
#
#
# Copyright (c) 2008, Oracle and/or its affiliates. All rights reserved.
#
[kdcdefaults]
kdc_ports = 88,750
[realms]
___default_realm___ = {
profile = /etc/krb5/krb5.conf
database_name = /var/krb5/principal
acl_file = /etc/krb5/kadm5.acl
kadmind_port = 749
max_life = 8h 0m 0s
max_renewable_life = 7d 0h 0m 0s
default_principal_flags = +preauth
master_key_type = des3-cbc-sha1-kd
supported_enctypes = des3-cbc-sha1-kd:normal
}
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# diff kdc.conf*
18,19d17
< master_key_type = des3-cbc-sha1-kd
< supported_enctypes = des3-cbc-sha1-kd:normal
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# vi krb5.conf
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# head -20 krb5.conf
#
#
# Copyright (c) 2007, Oracle and/or its affiliates. All rights reserved.
#
# krb5.conf template
# In order to complete this configuration file
# you will need to replace the ____ placeholders
# with appropriate values for your network and uncomment the
# appropriate entries.
#
[libdefaults]
# default_realm = ___default_realm___
default_tgs_enctypes = des3-cbc-sha1-kd
default_tkt_enctypes = des3-cbc-sha1-kd
permitted_enctypes = des3-cbc-sha1-kd
allow_weak_enctypes = false
[realms]
root@kdc1:/etc/krb5#
root@kdc1:/etc/krb5# diff krb5.conf*
14,17d13
< default_tgs_enctypes = des3-cbc-sha1-kd
< default_tkt_enctypes = des3-cbc-sha1-kd
< permitted_enctypes = des3-cbc-sha1-kd
< allow_weak_enctypes = false
19d14


Solaris

Secure NFS: Step 2--First Kerberos Client--NFS Server

Secure NFS Server

With our Kerberos KDC set up, it is time to build the NFS server. The first step is creating another Solaris Zone similar to the previous ones.

Creating an NFS Server Zone

global# cat zfs1.cfg
create -b
set brand=solaris
set zonepath=/zones/zfs1
set autoboot=false
set autoshutdown=shutdown
set ip-type=exclusive
add anet
set linkname=net0
set lower-link=net2
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
set vlan-id=17
end
add admin
set user=steffen
set auths=login,manage,config
end
global#
global# zonecfg -z zfs1 -f zfs1.cfg
UX: /usr/sbin/usermod: steffen is currently logged in, some changes may not take effect until next login.
global#
global# zoneadm -z zfs1 clone -c zfs1_profile.xml kdcmaster
The following ZFS file system(s) have been created:
pool1/zones/zfs1
Progress being logged to /var/log/zones/zoneadm.20150901T210134Z.zfs1.clone
Log saved in non-global zone as /zones/zfs1/root/var/log/zones/zoneadm.20150901T210134Z.zfs1.clone
global#
global# zoneadm -z zfs1 boot
global#

Configuring the Zone as a Kerberos Client

We follow the same steps as for the previous KDC client.

global# zlogin zfs1
[Connected to zone 'zfs1' pts/10]
Oracle Corporation SunOS 5.11 11.2 July 2015
root@zfs1:~#
root@zfs1:~# ping kdc1
kdc1 is alive
root@zfs1:~#
root@zfs1:~# cat /net/kdc1/share/krb5/kcprofile
REALM STEFFENTW.COM
KDC kdc1.steffentw.com
ADMIN kws
FILEPATH /net/kdc1.steffentw.com/share/krb5/krb5.conf
NFS 1
DNSLOOKUP none
root@zfs1:~#
root@zfs1:~# head -5 /net/kdc1.steffentw.com/share/krb5/krb5.conf
[libdefaults]
default_realm = STEFFENTW.COM
[realms]
STEFFENTW.COM = {
root@zfs1:~#
root@zfs1:~# kclient -p /net/kdc1/share/krb5/kcprofile
Starting client setup
---------------------------------------------------
Setting up /etc/krb5/krb5.conf.
Copied /net/kdc1.steffentw.com/share/krb5/krb5.conf to /system/volatile/kclient/kclient-krb5conf.MYaafI.
Obtaining TGT for kws/admin ...
Password for kws/admin@STEFFENTW.COM: enter admin password here
kinit: no ktkt_warnd warning possible
nfs/zfs1.steffentw.com entry ADDED to KDC database.
nfs/zfs1.steffentw.com entry ADDED to keytab.
host/zfs1.steffentw.com entry ADDED to KDC database.
host/zfs1.steffentw.com entry ADDED to keytab.
---------------------------------------------------
Setup COMPLETE.
root@zfs1:~#
root@zfs1:~# klist -k
Keytab name: FILE:/etc/krb5/krb5.keytab
KVNO Principal
---- --------------------------------------------------------------------------
2 nfs/zfs1.steffentw.com@STEFFENTW.COM
2 nfs/zfs1.steffentw.com@STEFFENTW.COM
2 nfs/zfs1.steffentw.com@STEFFENTW.COM
2 nfs/zfs1.steffentw.com@STEFFENTW.COM
2 host/zfs1.steffentw.com@STEFFENTW.COM
2 host/zfs1.steffentw.com@STEFFENTW.COM
2 host/zfs1.steffentw.com@STEFFENTW.COM
2 host/zfs1.steffentw.com@STEFFENTW.COM
root@zfs1:~#

Configuring the NFS Server File System

With the NFS server now a Kerberos client, create a ZFS file system that is exported as an NFS share requiring Kerberos privacy (the "krb5p" setting).

root@zfs1:~# zfs create -o mountpoint=/secure -o share.nfs=on -o share.nfs.sec=krb5p \
rpool/secure
root@zfs1:~# share
rpool_secure /secure nfs sec=krb5p,rw
root@zfs1:~#

Then create a file with some easily recognized content.

root@zfs1:~# echo "The quick brown fox jumps over the lazy dog." > /secure/fox.txt
root@zfs1:~#
root@zfs1:~# cat /secure/fox.txt
The quick brown fox jumps over the lazy dog.
root@zfs1:~#

Summary and Next Step

With the NFS server running, the next step is to create an NFS client. Either go to NFS Client Setup or back to the introduction.
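The Zone configuration files used in this series (zfs1.cfg here, host1.cfg in the next step) differ only in the zonepath. As a minimal sketch, the file can be generated from the Zone name. The function name make_zone_cfg is hypothetical; the lower-link, vlan-id, and admin user values are simply the ones shown in zfs1.cfg above and will need changing for other environments.

```shell
#!/bin/sh
# make_zone_cfg ZONENAME
# Hypothetical helper: prints a zonecfg command file for ZONENAME, using the
# same settings as the zfs1.cfg listing. Only the zonepath varies.
make_zone_cfg() {
    zone=$1
    cat <<EOF
create -b
set brand=solaris
set zonepath=/zones/$zone
set autoboot=false
set autoshutdown=shutdown
set ip-type=exclusive
add anet
set linkname=net0
set lower-link=net2
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
set vlan-id=17
end
add admin
set user=steffen
set auths=login,manage,config
end
EOF
}
```

For example, `make_zone_cfg host1 > host1.cfg` followed by `zonecfg -z host1 -f host1.cfg` would reproduce the configuration step shown above for a Zone named host1.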


Solaris

Secure NFS: Step 3--The Secure NFS Client

Secure NFS Client

We are getting close to a fully completed configuration. The next item is the client.

Build the NFS Client Zone as a KDC Client

global# cat host1.cfg
create -b
set brand=solaris
set zonepath=/zones/host1
set autoboot=false
set autoshutdown=shutdown
set ip-type=exclusive
add anet
set linkname=net0
set lower-link=net2
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
set vlan-id=17
end
add admin
set user=steffen
set auths=login,manage,config
end
global#
global# zoneadm -z host1 clone -c host1_profile.xml kdcmaster
The following ZFS file system(s) have been created:
pool1/zones/host1
Progress being logged to /var/log/zones/zoneadm.20150901T213207Z.host1.clone
Log saved in non-global zone as /zones/host1/root/var/log/zones/zoneadm.20150901T213207Z.host1.clone
global#
global# zlogin host1
[Connected to zone 'host1' pts/8]
Oracle Corporation SunOS 5.11 11.2 July 2015
root@host1:~#
root@host1:~# ping kdc1
kdc1 is alive
root@host1:~#
root@host1:~# cat /net/kdc1/share/krb5/kcprofile
REALM STEFFENTW.COM
KDC kdc1.steffentw.com
ADMIN kws
FILEPATH /net/kdc1.steffentw.com/share/krb5/krb5.conf
NFS 1
DNSLOOKUP none
root@host1:~#
root@host1:~# kclient -p /net/kdc1/share/krb5/kcprofile
Starting client setup
---------------------------------------------------
Setting up /etc/krb5/krb5.conf.
Copied /net/kdc1.steffentw.com/share/krb5/krb5.conf to /system/volatile/kclient/kclient-krb5conf.ToaOPV.
Obtaining TGT for kws/admin ...
Password for kws/admin@STEFFENTW.COM: enter admin password here
kinit: no ktkt_warnd warning possible
nfs/host1.steffentw.com entry ADDED to KDC database.
nfs/host1.steffentw.com entry ADDED to keytab.
host/host1.steffentw.com entry ADDED to KDC database.
host/host1.steffentw.com entry ADDED to keytab.
---------------------------------------------------
Setup COMPLETE.
root@host1:~#

Demonstrate the NFS Client Working

The simplest test is to navigate to the /net/<server name> location.

root@host1:~# cat /net/zfs1/secure/fox.txt
The quick brown fox jumps over the lazy dog.
root@host1:~#

However, was this really an encrypted data transfer? One way to check is with snoop(1M).

root@host1:~# snoop -d net0 -r host zfs1 &
[1] 21547
root@host1:~# Using device net0 (promiscuous mode)
root@host1:~# cat /net/zfs1/secure/fox.txt
The quick brown fox jumps over the lazy dog.
root@host1:~# 172.17.0.101 -> 172.17.0.201 TCP D=2049 S=1023 Syn Seq=1000276621 Len=0 Win=32804 Options=<mss 1460,sackOK,tstamp 129311831 0,nop,wscale 5>
172.17.0.201 -> 172.17.0.101 TCP D=1023 S=2049 Syn Ack=1000276622 Seq=576217546 Len=0 Win=32806 Options=<sackOK,tstamp 129311831 129311831,mss 1460,nop,wscale 5>
172.17.0.101 -> 172.17.0.201 TCP D=2049 S=1023 Ack=576217547 Seq=1000276622 Len=0 Win=32806 Options=<nop,nop,tstamp 129311831 129311831>
...
172.17.0.101 -> 172.17.0.201 RPC RPCSEC_GSS C NFS ver(4) proc(1) (data encrypted)
172.17.0.201 -> 172.17.0.101 TCP D=1023 S=2049 Ack=1000276950 Seq=576217547 Len=0 Win=32796 Options=<nop,nop,tstamp 129311831 129311831>
172.17.0.201 -> 172.17.0.101 RPC RPCSEC_GSS R NFS ver(4) proc(1) (data encrypted)
172.17.0.101 -> 172.17.0.201 TCP D=2049 S=1023 Ack=576217959 Seq=1000276950 Len=0 Win=32806 Options=<nop,nop,tstamp 129311832 129311832>
...
172.17.0.101 -> 172.17.0.201 RPC RPCSEC_GSS C NFS ver(4) proc(1) (data encrypted)
172.17.0.201 -> 172.17.0.101 RPC RPCSEC_GSS R NFS ver(4) proc(1) (data encrypted)
...
root@host1:~# kill %1
root@host1:~#

To see the difference, let's create a second share that does not require Kerberos.

root@zfs1:~# zfs create -o mountpoint=/clear -o share.nfs=on rpool/clear
root@zfs1:~#
root@zfs1:~# share
rpool_secure /secure nfs sec=krb5p,rw
rpool_clear /clear nfs sec=sys,rw
root@zfs1:~#
root@zfs1:~# cp /secure/fox.txt /clear/
root@zfs1:~#

And run snoop with the option to dump all the data in each Ethernet frame.
I like to use -x 0.First using encrypted mountpoint.root@host1:~# snoop -d net0 -r -x 0 host zfs1 &[1] 21591root@host1:~# Using device net0 (promiscuous mode)root@host1:~# cat /net/zfs1/secure/fox.txtThe quick brown fox jumps over the lazy dog.root@host1:~# 172.17.0.101 -> 172.17.0.201 TCP D=2049 S=48428 Syn Seq=788443968 Len=0 Win=64240 Options=<mss 1460,sackOK,tstamp 129469208 0,nop,wscale 1> 0: 0208 20e4 7813 0208 20ea 4c3d 0800 4500 .. .x... .L=..E. 16: 003c ea59 4000 4006 0000 ac11 0065 ac11 .<.Y@.@......e.. 32: 00c9 bd2c 0801 2efe b340 0000 0000 a002 ...,.....@...... 48: faf0 597f 0000 0204 05b4 0402 080a 07b7 ..Y............. 64: 8b18 0000 0000 0103 0301 ..........172.17.0.201 -> 172.17.0.101 TCP D=48428 S=2049 Syn Ack=788443969 Seq=2268877688 Len=0 Win=32806 Options=<sackOK,tstamp 129469208 129469208,mss 1460,nop,wscale 5> 0: 0208 20ea 4c3d 0208 20e4 7813 0800 4500 .. .L=.. .x...E. 16: 003c f568 4000 4006 ec02 ac11 00c9 ac11 .<.h@.@......... 32: 0065 0801 bd2c 873c 5378 2efe b341 a012 .e...,.<Sx...A.. 48: 8026 c6b9 0000 0402 080a 07b7 8b18 07b7 .&.............. 64: 8b18 0204 05b4 0103 0305 ..........172.17.0.101 -> 172.17.0.201 TCP D=2049 S=48428 Ack=2268877689 Seq=788443969 Len=0 Win=64436 Options=<nop,nop,tstamp 129469208 129469208> 0: 0208 20e4 7813 0208 20ea 4c3d 0800 4500 .. .x... .L=..E. 16: 0034 ea5a 4000 4006 0000 ac11 0065 ac11 .4.Z@.@......e.. 32: 00c9 bd2c 0801 2efe b341 873c 5379 8010 ...,.....A.<Sy.. 48: fbb4 5977 0000 0101 080a 07b7 8b18 07b7 ..Yw............ 64: 8b18 .....172.17.0.101 -> 172.17.0.201 RPC RPCSEC_GSS C NFS ver(4) proc(1) (data encrypted) 0: 0208 20e4 7813 0208 20ea 4c3d 0800 4500 .. .x... .L=..E. 16: 017c ea70 4000 4006 0000 ac11 0065 ac11 .|.p@.@......e.. 32: 00c9 03ff 0801 4667 92c6 2d1f 25fc 8018 ......Fg..-.%... 48: 8026 5abf 0000 0101 080a 07b7 8b1b 07b7 .&Z............. 64: 8b1b 8000 0144 6e7d 0f68 0000 0000 0000 .....Dn}.h...... 80: 0002 0001 86a3 0000 0004 0000 0001 0000 ................ 
96: 0006 0000 0018 0000 0001 0000 0000 0000 ................ 112: 0002 0000 0003 0000 0004 1e00 0000 0000 ................ 128: 0006 0000 001c 0404 04ff ffff ffff 0000 ................ 144: 0000 15d8 2a96 8cb9 33d6 91df d5de 4ee1 ....*...3.....N. 160: d51a 0000 00e4 0504 06ff 0000 0000 0000 ................ 176: 0000 15d8 2a97 61c4 fa98 3b63 14d0 c5cb ....*.a...;c.... 192: 59ee 8848 1638 12bc 486e d73a 8b1e d704 Y..H.8..Hn.:.... 208: 74e2 65e6 e036 6847 32e8 d2c8 a100 655b t.e..6hG2.....e[ 224: df06 73df 78d2 af8a 7850 193c a0bc 2147 ..s.x...xP.<..!G 240: 6073 7dcf 3038 cfbb 95d4 5f35 489c 65eb `s}.08...._5H.e. 256: 1e54 3572 60c8 9b1e 78c8 f47a ac25 e8be .T5r`...x..z.%.. 272: ddd5 c104 8067 cf6a ca03 1327 c14d e5dd .....g.j...'.M.. 288: 0f06 2dac bac9 d689 7536 e391 0e3f 14dd ..-.....u6...?.. 304: 2f7b 33d1 231e 3b7b 0de5 5ee2 c28f cb54 /{3.#.;{..^....T 320: a2e0 2456 1ffa ddf0 c37f 42bf 252b 1667 ..$V......B.%+.g 336: 02c2 1fe3 b19d 0d7b 94a2 4e50 748b 5935 .......{..NPt.Y5 352: 890b 746c deb2 5744 97a4 4c07 83e4 5377 ..tl..WD..L...Sw 368: 4ca4 75e4 8081 f196 6f01 63fd 4e56 bee9 L.u.....o.c.NV.. 384: 5510 c21a 6b6a 2d63 c326 U...kj-c.&172.17.0.201 -> 172.17.0.101 RPC RPCSEC_GSS R NFS ver(4) proc(1) (data encrypted) 0: 0208 20ea 4c3d 0208 20e4 7813 0800 4500 .. .L=.. .x...E. 16: 01d0 f57e 4000 4006 ea58 ac11 00c9 ac11 ...~@.@..X...... 32: 0065 0801 03ff 2d1f 25fc 4667 940e 8018 .e....-.%.Fg.... 48: 8026 8344 0000 0101 080a 07b7 8b1b 07b7 .&.D............ 64: 8b1b 8000 0198 6e7d 0f68 0000 0001 0000 ......n}.h...... 80: 0000 0000 0006 0000 001c 0404 05ff ffff ................ 96: ffff 0000 0000 22a9 1433 c781 6e9e 8ed8 ......"..3..n... 112: e6cc aa86 e4d9 0000 0000 0000 0160 0504 .............`.. 128: 07ff 0000 0000 0000 0000 22a9 1434 68c0 .........."..4h. 144: e008 d7e8 cca4 af88 da90 2b45 dc13 57b9 ..........+E..W. 160: 3a0a e3f8 5a98 fddb 5039 62bc 1858 ecd5 :...Z...P9b..X.. 176: 0f5c fcd6 a150 7bf0 0782 d337 8cf6 8de1 .\...P{....7.... 
192: 5e81 481f b921 9054 d74a 0160 e9a4 0522 ^.H..!.T.J.`..." 208: 8d85 f55d 9576 f819 6515 c010 8d22 d0a4 ...].v..e....".. 224: e685 0b00 ebd9 cb9b 4079 dcd1 1195 5690 ........@y....V. 240: 9d07 846b a8e0 f022 c33d 7412 5065 3bc5 ...k...".=t.Pe;. 256: 0be5 7f98 9cb5 f5cb 8452 aa0a dfa7 cfb3 .........R...... 272: e9eb a607 03a8 59c9 dc62 903c b289 dd13 ......Y..b.<.... 288: b20f 612d 1603 c335 2705 61ce af13 b792 ..a-...5'.a..... 304: 442e 5a19 59fb d867 377e 34f3 b43d f8e3 D.Z.Y..g7~4..=.. 320: ff0a 2937 d04c 1b22 0213 5227 57f1 ba26 ..)7.L."..R'W..& 336: 44e0 5e52 2f79 41d9 a494 cee6 bd76 f8e0 D.^R/yA......v.. 352: ecd1 4b98 0e91 7b09 321e 97b1 26ef 3cdc ..K...{.2...&.<. 368: 7211 7ae3 b71c 3bb0 c1b0 2e91 93e2 2b37 r.z...;.......+7 384: a1de 76ca f736 70c4 4987 b39f 71e9 736f ..v..6p.I...q.so 400: fc6e 433e 5f2f f283 06b6 cf1b 96f8 b447 .nC>_/.........G 416: af39 1d95 6fe7 4173 e554 2d77 c9b8 df88 .9..o.As.T-w.... 432: 48d2 843e 67cb 54a2 93c8 8bad b24c 1e40 H..>g.T......L.@ 448: 64aa 7f75 5fec a0c6 4d58 de19 ec68 25d3 d..u_...MX...h%. 464: af93 6f26 e12f 180b f0c0 87b6 7df6 ..o&./......}....172.17.0.101 -> 172.17.0.201 NFS R CB_NULL 0: 0208 20e4 7813 0208 20ea 4c3d 0800 4500 .. .x... .L=..E. 16: 0050 ea7c 4000 4006 0000 ac11 0065 ac11 .P.|@.@......e.. 32: 00c9 b385 ed12 c833 5144 9614 5a3c 8018 .......3QD..Z<.. 48: 8026 5993 0000 0101 080a 07b7 8b1d 07b7 .&Y............. 64: 8b1a 8000 0018 627d 0f68 0000 0001 0000 ......b}.h...... 80: 0000 0000 0000 0000 0000 0000 0000 ..............172.17.0.201 -> 172.17.0.101 TCP D=45957 S=60690 Ack=3358806368 Seq=2517916220 Len=0 Win=32806 Options=<nop,nop,tstamp 129469213 129469213> 0: 0208 20ea 4c3d 0208 20e4 7813 0800 4500 .. .L=.. .x...E. 16: 0034 f58a 4000 4006 ebe8 ac11 00c9 ac11 .4..@.@......... 32: 0065 ed12 b385 9614 5a3c c833 5160 8010 .e......Z<.3Q`.. 48: 8026 cd1f 0000 0101 080a 07b7 8b1d 07b7 .&.............. 
64: 8b1d ..172.17.0.101 -> 172.17.0.201 TCP D=2049 S=1023 Ack=757019588 Seq=1181196406 Len=0 Win=32806 Options=<nop,nop,tstamp 129469216 129469211> 0: 0208 20e4 7813 0208 20ea 4c3d 0800 4500 .. .x... .L=..E. 16: 0034 ea7d 4000 4006 0000 ac11 0065 ac11 .4.}@.@......e.. 32: 00c9 03ff 0801 4667 a076 2d1f 33c4 8010 ......Fg.v-.3... 48: 8026 5977 0000 0101 080a 07b7 8b20 07b7 .&Yw......... .. 64: 8b1b ..root@host1:~#And now using the clear text mount point.root@host1:~# snoop -d net0 -r -x 0 host zfs1 &[1] 21593root@host1:~# Using device net0 (promiscuous mode)root@host1:~# cat /net/zfs1/clear/fox.txtThe quick brown fox jumps over the lazy dog....172.17.0.201 -> 172.17.0.101 NFS R 4 (read ) NFS4_OK PUTFH NFS4_OK READ NFS4_OK (45 bytes) EOF 0: 0208 20ea 4c3d 0208 20e4 7813 0800 4500 .. .L=.. .x...E. 16: 00b0 f594 4000 4006 eb62 ac11 00c9 ac11 ....@.@..b...... 32: 0065 0801 03ff 2d1f 3ba8 4667 a8d2 8018 .e....-.;.Fg.... 48: 8026 f4c5 0000 0101 080a 07b7 9377 07b7 .&...........w.. 64: 9377 8000 0078 917d 0f68 0000 0001 0000 .w...x.}.h...... 80: 0000 0000 0000 0000 0000 0000 0000 0000 ................ 96: 0000 0000 000c 7265 6164 2020 2020 2020 ......read 112: 2020 0000 0002 0000 0016 0000 0000 0000 .............. 128: 0019 0000 0000 0000 0001 0000 002d 5468 .............-Th 144: 6520 7175 6963 6b20 6272 6f77 6e20 666f e quick brown fo 160: 7820 6a75 6d70 7320 6f76 6572 2074 6865 x jumps over the 176: 206c 617a 7920 646f 672e 0a00 0000 lazy dog........172.17.0.101 -> 172.17.0.201 TCP D=2049 S=1023 Ack=757021992 Seq=1181198770 Len=0 Win=32806 Options=<nop,nop,tstamp 129471358 129471351> 0: 0208 20e4 7813 0208 20ea 4c3d 0800 4500 .. .x... .L=..E. 16: 0034 ea89 4000 4006 0000 ac11 0065 ac11 .4..@.@......e.. 32: 00c9 03ff 0801 4667 a9b2 2d1f 3d28 8010 ......Fg..-.=(.. 48: 8026 5977 0000 0101 080a 07b7 937e 07b7 .&Yw.........~.. 
64: 9377 .w
root@host1:~#

In both cases, because I let the automounter time out and a new mount is initiated each time, there are so many packets that it is hard to know which is doing what. However, when reading the file on /clear, the "quick brown fox" text is clearly visible. Your own tests and snoop output should make this difference very clear.

By default, the mounts use NFS version 4 (NFSv4). You can request version 3 when mounting. The results will be the same.

Additional NFS Client Configuration Options

root@host1:~# mount -o vers=3 zfs1:/secure /mnt
root@host1:~#

And as a reminder, you can force mounts to use version 3 on either a client or a server using the sharectl(1M) command.

root@host1:~# sharectl get -p client_versmax nfs
client_versmax=4
root@host1:~#
root@host1:~# sharectl set -p client_versmax=3 nfs
root@host1:~# sharectl get -p client_versmax nfs
client_versmax=3
root@host1:~#

Summary and Next Step

This completes the Secure NFS setup. One option is to co-locate the KDC and NFS server. Either go to Combining KDC and NFS Server or back to the introduction.
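With this many packets in a capture, counting can be easier than reading. The sketch below assumes the snoop summary output has been saved as text, and it relies only on the line shapes visible in the captures above: encrypted RPCs end in "(data encrypted)", and clear-text NFS traffic shows "NFS" as the fourth field. The function name snoop_nfs_summary is hypothetical.

```shell
#!/bin/sh
# snoop_nfs_summary
# Hypothetical helper: reads snoop summary lines on stdin and reports how many
# packets were RPCSEC_GSS-encrypted and how many NFS packets were decoded in
# the clear. Hex dump rows from "-x 0" are skipped.
snoop_nfs_summary() {
    awk '
        $2 != "->"            { next }          # not a packet summary line
        /\(data encrypted\)/  { enc++; next }   # privacy-protected RPC
        $4 == "NFS"           { clear++ }       # NFS call or reply in clear text
        END { printf "encrypted=%d clear=%d\n", enc, clear }
    '
}
```

Usage might look like `snoop_nfs_summary < /tmp/snoop.out`, where /tmp/snoop.out is an assumed file holding the redirected snoop output. Note that even the capture of the krb5p share above contains one clear NFS line (the CB_NULL callback), so a small non-zero clear count does not by itself mean file data was exposed.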


Solaris

Secure NFS: Step 4--Combining the KDC and NFS Server

Combining the KDC and NFS Server

When I asked my customer about their availability requirements, they stated that they only need a few NFS clients with encrypted traffic. They would like to keep the setup simple, and therefore combine the KDC and NFS server. They are using Oracle Solaris Cluster for availability, and by putting both services in a single Solaris Zone, they can meet their availability requirements with Oracle Solaris Cluster managing the Solaris Zone startup and failover.

So I looked into whether this is a good idea, and I was informed that this is fully supported and tested. The way to do this is to make the KDC a client of itself.

Making the KDC a Kerberos Client

root@kdc1:~# kclient -p /net/kdc1/share/krb5/kcprofile
Starting client setup
---------------------------------------------------
Setting up /etc/krb5/krb5.conf.
Copied /net/kdc1.steffentw.com/share/krb5/krb5.conf to /system/volatile/kclient/kclient-krb5conf.mmayyQ.
Obtaining TGT for kws/admin ...
Password for kws/admin@STEFFENTW.COM:
kinit: no ktkt_warnd warning possible
nfs/kdc1.steffentw.com entry ADDED to KDC database.
nfs/kdc1.steffentw.com entry ADDED to keytab.
host/kdc1.steffentw.com entry already exists in KDC database.
host/kdc1.steffentw.com entry already present in keytab.
host/kdc1.steffentw.com entry ADDED to keytab.
---------------------------------------------------
Setup COMPLETE.
root@kdc1:~#
root@kdc1:~# klist -k
Keytab name: FILE:/etc/krb5/krb5.keytab
KVNO Principal
---- --------------------------------------------------------------------------
3 host/kdc1.steffentw.com@STEFFENTW.COM
3 host/kdc1.steffentw.com@STEFFENTW.COM
3 host/kdc1.steffentw.com@STEFFENTW.COM
3 host/kdc1.steffentw.com@STEFFENTW.COM
2 nfs/kdc1.steffentw.com@STEFFENTW.COM
2 nfs/kdc1.steffentw.com@STEFFENTW.COM
2 nfs/kdc1.steffentw.com@STEFFENTW.COM
2 nfs/kdc1.steffentw.com@STEFFENTW.COM
root@kdc1:~#

Creating Secured NFS Share

Then create a new mount point and put some data into it.

root@kdc1:~# zfs create -o mountpoint=/secure -o share.nfs=on -o share.nfs.sec=krb5p rpool/secure
root@kdc1:~#
root@kdc1:~# share
rpool_share /share nfs sec=sys,rw
rpool_secure /secure nfs sec=krb5p,rw
root@kdc1:~#
root@kdc1:~# cp /net/zfs1/secure/fox.txt /secure/
root@kdc1:~#
root@kdc1:~# cat /secure/fox.txt
The quick brown fox jumps over the lazy dog.
root@kdc1:~#

Back on the client, read the file on the KDC, with snoop running to show that the data is encrypted. And since the maximum client version was set to version 3, the snoop output shows that as well.

root@host1:~# snoop -d net0 -r host kdc1 &
[1] 21825
root@host1:~# Using device net0 (promiscuous mode)
root@host1:~# cat /net/kdc1/secure/fox.txt
The quick brown fox jumps over the lazy dog.
root@host1:~# 172.17.0.101 -> 172.17.0.251 TCP D=2049 S=1022 Syn Seq=597683294 Len=0 Win=32804 Options=<mss 1460,sackOK,tstamp 129789256 0,nop,wscale 5>
172.17.0.251 -> 172.17.0.101 TCP D=1022 S=2049 Syn Ack=597683295 Seq=1916087307 Len=0 Win=32806 Options=<sackOK,tstamp 129789256 129789256,mss 1460,nop,wscale 5>
172.17.0.101 -> 172.17.0.251 TCP D=2049 S=1022 Ack=1916087308 Seq=597683295 Len=0 Win=32806 Options=<nop,nop,tstamp 129789256 129789256>
172.17.0.101 -> 172.17.0.251 RPC RPCSEC_GSS C NFS ver(3) proc(1) (data encrypted)
172.17.0.251 -> 172.17.0.101 TCP D=1022 S=2049 Ack=597683495 Seq=1916087308 Len=0 Win=32806 Options=<nop,nop,tstamp 129789257 129789257>
172.17.0.251 -> 172.17.0.101 RPC RPCSEC_GSS R NFS ver(3) proc(1) (data encrypted)
172.17.0.101 -> 172.17.0.251 TCP D=2049 S=1022 Ack=1916087520 Seq=597683495 Len=0 Win=32806 Options=<nop,nop,tstamp 129789259 129789259>
172.17.0.101 -> 172.17.0.251 RPC RPCSEC_GSS C NFS ver(3) proc(4) (data encrypted)
172.17.0.251 -> 172.17.0.101 TCP D=1022 S=2049 Ack=597683699 Seq=1916087520 Len=0 Win=32806 Options=<nop,nop,tstamp 129789259 129789259>
172.17.0.251 -> 172.17.0.101 RPC RPCSEC_GSS R NFS ver(3) proc(4) (data encrypted)
172.17.0.101 -> 172.17.0.251 TCP D=2049 S=1022 Ack=1916087740 Seq=597683699 Len=0 Win=32806 Options=<nop,nop,tstamp 129789259 129789259>
172.17.0.101 -> 172.17.0.251 RPC RPCSEC_GSS C NFS ver(3) proc(1) (data encrypted)
172.17.0.251 -> 172.17.0.101 RPC RPCSEC_GSS R NFS ver(3) proc(1) (data encrypted)
172.17.0.101 -> 172.17.0.251 TCP D=2049 S=1022 Ack=1916087952 Seq=597683899 Len=0 Win=32806 Options=<nop,nop,tstamp 129789266 129789259>
root@host1:~#

Summary and Next Step

That is everything, I hope. Here you can quickly go back to the introduction.


Solaris Networking

Solaris 11 Express Network Tunables

Overview

For years I, and many others, have been tuning TCP, UDP, IP, and other aspects of the Solaris network stack with ndd(1M). The ndd command is documented; however, most of the tunables were really private interface implementations, subject to change, and in many cases undocumented. Also, ndd does not show the default values, nor the possible values or ranges.

That is changing with Solaris 11 Express. A new command, ipadm(1M), allows persistent and temporary (with the -t option) setting of key tunable values. This is a major improvement over ndd, where it is customary to create an /etc/rc2.d/S69ndd or similar script to set the parameter on every reboot. Another benefit is that ipadm shows the default value and the values that the property can be set to.

The ipadm command has many features to configure the IP settings of interfaces. This blog entry focuses on how ipadm replaces ndd. Note that ipadm only supports the IP, TCP, UDP, SCTP, and ICMP protocols. Other protocols such as ipsecah and keysock still require the use of ndd.

Review of ndd

To get a list of all tunables for a specific protocol, an ndd -get operation is performed with "?" as the argument. For example, this is a way of listing all the TCP parameters.

root@Solaris11Express# ndd -get /dev/tcp \?
tcp_time_wait_interval (read and write)
tcp_conn_req_max_q (read and write)
tcp_conn_req_max_q0 (read and write)
tcp_conn_req_min (read and write)
...
tcp_dev_flow_ctl (read and write)
tcp_reass_timeout (read and write)
tcp_extra_priv_ports_add (write only)
tcp_extra_priv_ports_del (write only)
tcp_extra_priv_ports (read only)
tcp_1948_phrase (write only)
tcp_listener_limit_conf (read only)
tcp_listener_limit_conf_add (write only)
tcp_listener_limit_conf_del (write only)

To get the current value of a specific parameter, list the parameter as the argument for the driver, in this case /dev/tcp.

root@Solaris11Express# ndd -get /dev/tcp tcp_conn_req_max_q
128

And to set a parameter, follow it with a value.

root@Solaris11Express# ndd -set /dev/tcp tcp_conn_req_max_q 256
root@Solaris11Express# ndd -get /dev/tcp tcp_conn_req_max_q
256

And for my own benefit, I set it back to the original.

root@Solaris11Express# ndd -set /dev/tcp tcp_conn_req_max_q 128
root@Solaris11Express# ndd -get /dev/tcp tcp_conn_req_max_q
128

Using the ipadm *-prop Options

The ipadm(1M) manual page lists three sub-commands to manage TCP/IP protocol properties.

ipadm set-prop [-t] -p prop=value[,...] protocol
ipadm reset-prop [-t] -p prop protocol
ipadm show-prop [[-c] -o field[,...]] [-p prop[,...]] [protocol]

To list all the properties for all the protocols as currently supported, I run ipadm with the show-prop sub-command.

root@Solaris11Express# ipadm show-prop
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
ipv4 forwarding rw off -- off on,off
ipv4 ttl rw 255 -- 255 1-255
ipv6 forwarding rw off -- off on,off
ipv6 hoplimit rw 255 -- 255 1-255
ipv6 hostmodel rw weak -- weak strong, src-priority, weak
ipv4 hostmodel rw weak -- weak strong, src-priority, weak
icmp recv_maxbuf rw 8192 -- 8192 4096-65536
icmp send_maxbuf rw 8192 -- 8192 4096-65536
tcp ecn rw passive -- passive never,passive, active
tcp extra_priv_ports rw 2049,4045 -- 2049,4045 1-65535
tcp largest_anon_port rw 65535 -- 65535 1024-65535
tcp recv_maxbuf rw 128000 -- 128000 2048-1073741824
tcp sack rw active -- active never,passive, active
tcp send_maxbuf rw 49152 -- 49152 4096-1073741824
tcp smallest_anon_port rw 32768 -- 32768 1024-65535
tcp smallest_nonpriv_port rw 1024 -- 1024 1024-32768
udp extra_priv_ports rw 2049,4045 -- 2049,4045 1-65535
udp largest_anon_port rw 65535 -- 65535 1024-65535
udp recv_maxbuf rw 57344 -- 57344 128-1073741824
udp send_maxbuf rw 57344 -- 57344 1024-1073741824
udp smallest_anon_port rw 32768 -- 32768 1024-65535
udp smallest_nonpriv_port rw 1024 -- 1024 1024-32768
sctp extra_priv_ports rw 2049,4045 -- 2049,4045 1-65535
sctp largest_anon_port rw 65535 -- 65535 1024-65535
sctp recv_maxbuf rw 102400 -- 102400 8192-1073741824
sctp send_maxbuf rw 102400 -- 102400 8192-1073741824
sctp smallest_anon_port rw 32768 -- 32768 1024-65535
sctp smallest_nonpriv_port rw 1024 -- 1024 1024-32768

The first column lists the protocols. Of note is that there are separate IPv4 and IPv6 listings. Per the specification, there is no ttl for IPv6, which is why ttl appears only as an IPv4 property. IPv6 calls it the hoplimit, which is more indicative of how the value is actually used.

Including a protocol as an argument lists only the properties for that protocol.

root@Solaris11Express# ipadm show-prop tcp
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
tcp ecn rw passive -- passive never,passive, active
tcp extra_priv_ports rw 2049,4045 -- 2049,4045 1-65535
tcp largest_anon_port rw 65535 -- 65535 1024-65535
tcp recv_maxbuf rw 128000 -- 128000 2048-1073741824
tcp sack rw active -- active never,passive, active
tcp send_maxbuf rw 49152 -- 49152 4096-1073741824
tcp smallest_anon_port rw 32768 -- 32768 1024-65535
tcp smallest_nonpriv_port rw 1024 -- 1024 1024-32768

We see the current value, whether we can set it, its default value, and the possible values or range of values. Self documenting. I like it!

To get a specific property, the -p option specifies which one to list.

root@Solaris11Express# ipadm show-prop -p send_maxbuf tcp
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
tcp send_maxbuf rw 49152 -- 49152 4096-1073741824

Now to set a property to a specific value, use the format property=value.

root@Solaris11Express# ipadm set-prop -p send_maxbuf=4096 tcp
root@Solaris11Express# ipadm show-prop -p send_maxbuf tcp
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
tcp send_maxbuf rw 4096 4096 49152 4096-1073741824

The value of 4096 in the PERSISTENT column indicates this setting will be retained even after a reboot. To set the property only until the next reboot, use the -t option to set it temporarily.

root@Solaris11Express# ipadm set-prop -t -p send_maxbuf=4096 tcp
root@Solaris11Express# ipadm show-prop -p send_maxbuf tcp
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
tcp send_maxbuf rw 4096 -- 49152 4096-1073741824

While it is certainly possible to set a property explicitly to the value that happens to be its default, I like having the option to reset it to its default. This is done with a reset.
The PERSISTENT column has reverted back to its original --.

root@Solaris11Express# ipadm reset-prop -p send_maxbuf tcp
root@Solaris11Express# ipadm show-prop -p send_maxbuf tcp
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
tcp send_maxbuf rw 49152 -- 49152 4096-1073741824

What About All Those Other ndd Configuration Parameters?

The output of the show-prop operation above is very small compared to what those who use ndd are used to for even just one of the protocols. So what about all the other ndd parameters?

There are two options:
- Continue to use ndd
- Use a special parameter conversion of the ndd parameter with ipadm

The first is business as usual. The second involves converting the protocol's ndd parameter into one that works with ipadm. The steps that have worked for me are as follows.

- For any parameter, drop the /dev/ prefix from the driver name and use the remainder as the protocol argument to ipadm. So /dev/tcp becomes tcp.
- Drop the leading protocol name from the beginning of the parameter, if there is one, keeping the underscore. So tcp_local_dack_interval becomes _local_dack_interval.
- If there is no leading protocol name, prepend the property with an underscore (_). For example, arp_probe_interval becomes _arp_probe_interval.
- For the IP protocol, if there are IPv4 and IPv6 ndd values, indicate the ipadm protocol as ipv4 and ipv6, respectively. With ndd, the lack of a 6 means IPv4.
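The renaming steps above can be sketched as a small shell function. This is only an illustration of the rules as I read them, not part of ipadm; the function name ndd_to_ipadm is hypothetical, and treating every ip_ parameter as ipv4 (even when no ip6_ counterpart exists) is an assumption.

```shell
#!/bin/sh
# ndd_to_ipadm DRIVER PARAM
# Hypothetical helper: prints "PROTOCOL PROPERTY" for use with ipadm,
# following the renaming steps listed above.
ndd_to_ipadm() {
    driver=$1
    param=$2
    proto=${driver#/dev/}              # /dev/tcp -> tcp

    case "$proto:$param" in
        ip:ip6_*)                      # IPv6 variant of an IP parameter
            proto=ipv6
            prop=_${param#ip6_}
            ;;
        ip:ip_*)                       # no "6" in the name means IPv4 (assumed for all ip_ names)
            proto=ipv4
            prop=_${param#ip_}
            ;;
        *)
            case "$param" in
                "${proto}"_*)          # leading protocol name: drop it, keep the underscore
                    prop=_${param#"${proto}"_}
                    ;;
                *)                     # no leading protocol name: prepend an underscore
                    prop=_$param
                    ;;
            esac
            ;;
    esac

    printf '%s %s\n' "$proto" "$prop"
}
```

One way to use it is `set -- $(ndd_to_ipadm /dev/tcp tcp_local_dack_interval)` followed by `ipadm show-prop -p "$2" "$1"`. If ipadm reports an unknown property, the parameter simply has no private equivalent and ndd is still needed.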
Examples of each are as follows.

Dropping the leading protocol name and specifying it for the protocol argument.

root@Solaris11Express# ndd -get /dev/tcp tcp_local_dack_interval
50
root@Solaris11Express# ipadm show-prop -p _local_dack_interval tcp
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
tcp _local_dack_interval rw 50 -- 50 10-500

Getting a parameter that does not start with the protocol.

root@Solaris11Express# ndd -get /dev/ip arp_probe_interval
1500
root@Solaris11Express# ipadm show-prop -p _arp_probe_interval ip
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
ip _arp_probe_interval rw 1500 -- 1500 10-20000

Distinguishing between IPv4 and IPv6 parameters.

root@Solaris11Express# ndd -get /dev/ip ip_strict_dst_multihoming
0
root@Solaris11Express# ndd -get /dev/ip ip6_strict_dst_multihoming
0
root@Solaris11Express# ipadm show-prop -p _strict_dst_multihoming ipv4
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
ipv4 _strict_dst_multihoming rw 0 -- 0 0-1
root@Solaris11Express# ipadm show-prop -p _strict_dst_multihoming ipv6
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
ipv6 _strict_dst_multihoming rw 0 -- 0 0-1

And when there is an error, all the fields have ? in them.

root@Solaris11Express# ipadm show-prop -p _strict_dst_multihoming ip
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
ipadm: warning: cannot get property '_strict_dst_multihoming' for 'ip'
Unknown property
ip _strict_dst_multihoming ? ? ? ? ?

As more properties are added to ipadm to be managed there directly, the ndd work-around will become less necessary.
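Because show-prop prints both the CURRENT and DEFAULT columns, it is easy to spot which tunables have been changed on a system. The following is a minimal sketch that filters the human-readable output with awk. It assumes the column order shown in the listings above, and the function name non_default_props is hypothetical.

```shell
#!/bin/sh
# non_default_props
# Hypothetical helper: reads "ipadm show-prop" output on stdin and prints the
# properties whose CURRENT value (field 4) differs from DEFAULT (field 6).
# The POSSIBLE column may contain spaces, so only fields 1 through 6 are used.
non_default_props() {
    awk '$1 != "PROTO" && NF >= 6 && $4 != $6 {
        print $1, $2, "current=" $4, "default=" $6
    }'
}
```

On a live system this would be run as `ipadm show-prop | non_default_props`. Rows for unknown properties show ? in every column, so they compare equal and are not reported.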


Solaris

ZFS zpool and file system version numbers and features

Often enough I have had to check the version of a ZFS pool or file system. Sometimes, I am curious where a specific feature was delivered. So I imagine this could be useful for others. (Updated 21 Feb 2012 for Solaris 10 8/11 and Solaris 11.)

One note is that ZFS versions are backward compatible, which means that a kernel with a newer version can import an older version. The reverse is not true. So it is important to know the oldest kernel version you might want to attach a pool to, and to make sure you don't upgrade your pool or file system to something newer. This table may help with that as well.

Note: This table is sorted by pool version, then file system version. The availability dates of the releases are not chronological, as a feature delivered in a version of Solaris 11 may be delivered in a later Solaris 10 update.

Delivered in: Solaris 11 11/11
zpool version: 33, zfs version: 5
Features:
- Encryption
- Label support for Trusted Extensions

Delivered in: Solaris 11 Express 2010.11
zpool version: 31, zfs version: 5
Features:
- deduplication
- diff for snapshots
- read-only pool import
- pool import with missing log device

Delivered in: Solaris 10 8/11
zpool version: 29, zfs version: 5
Features:
- ZFS installation with Flash Archives (not really a ZFS feature)
- ZFS send will include file system properties
- ZFS diff
- Pool import with missing log device
- Pool import as read-only
- Synchronous writes
- ACL improvements
- Improvements in pool messages

Delivered in: Solaris 10 9/10
zpool version: 22, zfs version: 4
Features:
- triple parity RAID-Z (raidz3)
- logbias property
- pool recovery
- mirror splitting
- device replacement enhancements
- ZFS system process

Delivered in: Solaris 10 10/09
zpool version: 10, zfs version: 3
Features:
- ZFS with flash installation
- user and group quotas
- ZFS cache devices (L2ARC)
- set ZFS properties at file system creation
- primarycache and secondarycache properties
- log device recovery

Delivered in: Solaris 10 5/09
zpool version: 10, zfs version: 3
Features:
- zone clone creates ZFS clone

Delivered in: Solaris 10 10/08
zpool version: 10, zfs version: 3
Features:
- separate ZIL log devices
- ZFS boot/root file system
- zone on ZFS
- recursive snapshot renaming
- snapshot rollback improvements
- snapshot send improvements
- gzip compression
- multiple user data copies
- quotas and reservations can exclude snapshots/clones
- failure mode options
- ZFS upgrade option
- delegated administration
Comments: In Solaris 10 10/08 and later, zpool and zfs have the version option. It shows the version of the pool or file system, even if it is an older ZFS pool.

Delivered in: Solaris 10 5/08
zpool version: 4, zfs version: 1
Comments: Pool version determined using zdb(1M) on Solaris 10 5/08

Delivered in: Solaris 10 8/07
zpool version: 4, zfs version: 1
Features:
- iSCSI support
- zpool history
- ability to set properties when creating file system
Comments: Pool version determined using zdb(1M) on Solaris 10 8/07

Delivered in: Solaris 10 11/06
zpool version: 3, zfs version: 1
Features:
- recursive snapshots
- double parity RAID-Z (raidz2)
- clone promotion
Comments: Pool version determined using zdb(1M) on Solaris 10 11/06

Delivered in: Solaris 10 6/06
zpool version: 2, zfs version: 1
Features:
- pool upgrade
- restore of destroyed pool
- integration into Solaris FMA
- file system monitoring (fsstat)
Comments: Initial release of ZFS in Solaris 10. Pool version determined using zdb(1M) on Solaris 10 6/06

The details of all the ZFS features introduced in the Solaris 10 updates are listed in Chapter 1 of the ZFS Administration Guide and for Solaris 11 Express in its ZFS Administration Guide.

Hope this helps!

Steffen
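The backward-compatibility rule can be turned into a quick check before upgrading a pool. The sketch below is only an illustration: max_zpool_version and can_import are hypothetical helper names, and the version numbers are copied from the table above rather than queried from a live system.

```shell
# Sketch only: hypothetical helpers built from the table in this post.
# max_zpool_version RELEASE prints the zpool version listed for that release.
max_zpool_version() {
    case $1 in
        "Solaris 11 11/11")           echo 33 ;;
        "Solaris 11 Express 2010.11") echo 31 ;;
        "Solaris 10 8/11")            echo 29 ;;
        "Solaris 10 9/10")            echo 22 ;;
        "Solaris 10 10/09"|"Solaris 10 5/09"|"Solaris 10 10/08") echo 10 ;;
        "Solaris 10 5/08"|"Solaris 10 8/07") echo 4 ;;
        "Solaris 10 11/06")           echo 3 ;;
        "Solaris 10 6/06")            echo 2 ;;
        *) return 1 ;;                # release not in the table
    esac
}

# can_import RELEASE POOL_VERSION succeeds when the release's kernel is at
# least as new as the pool, i.e. the pool version does not exceed the
# version listed for that release.
can_import() {
    max=$(max_zpool_version "$1") || return 2
    [ "$2" -le "$max" ]
}

if can_import "Solaris 10 9/10" 29; then
    echo "importable"
else
    echo "pool is too new for this release"
fi
```

On a running system the actual pool version would come from a command such as zpool get version on the pool in question; the lookup table here only stands in for that.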


Solaris Networking

Why Are Packets Going Out The Wrong Interface--Preserving For Historical Reasons

I had previously referenced James Carlson's blog. Because the information is useful, and James is no longer with the company to update or preserve it, I am copying his posting here. Thanks again, James, for all the information regarding networking, and specifically Solaris networking, over the years!!

Steffen

Dated: Thursday Apr 30, 2009

The Problem

A common complaint for Solaris users runs something like this:

I have a Solaris system with two Ethernet interfaces connected to different subnets. Sometimes, I see an IP packet come in on one interface, but the packet goes back out a different one. This behavior is bad for my network, because I have firewalls that check the packet sources, and they drop these misdirected packets. Why does Solaris do this? And how can I fix it? I've tried disabling routing, but that doesn't seem to help.

Problems like this when reported are usually closed out as "will not fix," as for example CR 4085133.

The Why

The underlying problem here is at least partly a misunderstanding of how TCP/IP works. When a system transmits a packet, it must locate the "best" interface over which to send it. By default, the algorithm for doing that is as described in RFC 1122 section 3.3.1. Note in particular section 3.3.1.1. This requires the system to look at local interfaces first -- all of them -- to try to match the destination address. And once we find the interface by the destination address, we're done.

That alone is enough to make things not work as expected. If you send a packet to the local address on ce0 from some other system, but that other system is best reachable through bge0, then we'll send the reply via bge0. It doesn't go back out through ce0, even if the original request came in that way.

When considering a non-interface route (whether only the "default routes" of RFC 1122 or the more flexible CIDR routes of RFC 1812), the system will look up the route by destination IP address alone, and then use the route to obtain the output interface. This often causes the same sort of confusion when a "default route" ends up causing packets to go to the default router that the administrator thinks don't belong there.

I actually consider this a design feature of TCP/IP, and not a flaw. It's part of the robustness that IP's datagram routing system offers: every node in the network -- hosts and routers alike -- independently determines the best way to send each distinct datagram based solely on the destination IP address. This allows for "healing" of broken networks, as the failure of one interface or router means that you can potentially still use a different (perhaps less preferred) one to send your message.

There are some related bits of confusion in this area. For example, some programmers think that binding to a particular IP address means that the interface with that address is "bound" and all packets will go out that way. That's not correct. The system still uses the destination address to pick the output path for each individual IP packet, even if your socket is bound to an address on some particular interface. And, as long as you don't set the ip_strict_dst_multihoming ndd flag (it's not set by default), binding to an address doesn't mean that packets will only arrive on that corresponding interface. They can arrive on any interface in the system, as long as the IP address matches the one bound.

The Solutions

There are many ways to fix this issue, and the right answer for a given situation likely depends on the details of that situation. The main issue here is the kernel's forwarding table, so putting the right things into the forwarding table is one of the first tasks.

A common problem is that the administrator has set up a "default router," but that specified router cannot correctly forward to all possible IP destinations. Some packets the system sends end up getting misdirected or lost as a result. The solution is not having that router as a "default router," and instead using more specific routes (perhaps running a listen-only routing protocol to simplify the administrative burden).

Some systems have a "route by source address" feature. Solaris isn't one of those, though there is an RFE open on it (see CR 4777670). A better answer, in my opinion, would be to do something similar to what's suggested in CR 4173841. That would be, when we have multiple matching routes, to prefer a route that gives us an output interface in the same subnet as the source address.

It's a simple tweak, and would at least fix things for the folks who have problems with default route selection. It would not fix the problems people with interfaces on separate subnets have, though.

Applications that care about interface selection can use IP_BOUND_IF or IP_PKTINFO to select the specific interface desired. See the ip(7P) man page on your system for details.

If all else fails, you can use IP Filter's fastroute/to keyword on an output interface to put packets right where you want them. You should be aware that when you do this, you're circumventing IP's routing features, which means that if there's an interface or path failure, you may cause connections to fail that didn't need to fail.


Solaris

Getting GDM to work on text Solaris 11 Express 2010.11 installs

One of the features of Solaris 11 Express is to install into a ZFS pool, which allows updates to be easily managed using ZFS snapshots and clones. The LiveCD install, however, does not offer the option to save space for another ZFS pool. I prefer to have a separate pool for data, even on my single-disk laptop. The only way to do that, as far as I can tell, is to install using the text installer. One side effect of the text installer is that it does not install everything necessary to run a GUI desktop, which is very handy on a laptop.

Thanks to some replies to an internal question I posted, there is a relatively easy way to add the necessary packages to allow GDM and related tools to work. I have used these steps several times, and this writeup describes them.

The initial text-based install put 494 packages on the system (the count of 495 below includes the header line).

    Solaris 11 Express 2010.11# pkg list | wc -l
    495
    Solaris 11 Express 2010.11# pkg list | head
    NAME (PUBLISHER)                            VERSION           STATE      UFOXI
    SUNWcs                                      0.5.11-0.151.0.1  installed  -----
    SUNWcsd                                     0.5.11-0.151.0.1  installed  -----
    archiver/gnu-tar                            1.23-0.151.0.1    installed  -----
    compress/bzip2                              1.0.6-0.151.0.1   installed  -----
    compress/gzip                               1.3.5-0.151.0.1   installed  -----
    compress/p7zip                              4.55-0.151.0.1    installed  -----
    compress/unzip                              5.53.7-0.151.0.1  installed  -----
    compress/zip                                2.32-0.151.0.1    installed  -----
    consolidation/SunVTS/SunVTS-incorporation   0.5.11-0.151.0.1  installed  -----

To add the required packages to the system, the slim_install package has to be added. This adds an additional 390 packages to the system.

    Solaris 11 Express 2010.11# pkg install slim_install
                   Packages to install:   390
               Create boot environment:    No
                  Services to restart:    10
    DOWNLOAD                    PKGS       FILES    XFER (MB)
    Completed                390/390 42204/42204  410.5/410.5

    PHASE                                        ACTIONS
    Install Phase                            67952/67952

    PHASE                                          ITEMS
    Package State Update Phase                   390/390
    Image State Update Phase                         2/2

After this, I did a reboot, just to make sure. Then I uninstalled the slim_install package, which removed only that one. The other 389 packages must have been dependencies of slim_install.

    Solaris 11 Express 2010.11# pkg uninstall slim_install
                    Packages to remove:     1
               Create boot environment:    No

    PHASE                                        ACTIONS
    Removal Phase                                828/828

    PHASE                                          ITEMS
    Package State Update Phase                       1/1
    Package Cache Update Phase                       1/1
    Image State Update Phase                         2/2

Once I enable GDM, the screen shows some activity and shortly I have the familiar GUI login prompt.

    Solaris 11 Express 2010.11# svcs gdm
    STATE          STIME    FMRI
    disabled       12:26:40 svc:/application/graphical-login/gdm:default
    Solaris 11 Express 2010.11# svcadm enable gdm
    Solaris 11 Express 2010.11# svcs gdm
    STATE          STIME    FMRI
    online         12:38:11 svc:/application/graphical-login/gdm:default

I hope this helps others. I certainly know where to look when I have to do this again!

Steffen

[Updated 2010.11.23]

First, I'd like to acknowledge Keith Mitchell, who provided me with the suggestion to do the install and uninstall of the slim_install package.

Second, in the process of checking in with Keith, he suggested taking care when doing the above operations while logged in on the console. If you leave yourself logged in at the console when GDM starts, there is a small possibility of certain devices not being configured properly when logging into GNOME, due to how logindevperm works. Suggestions include:

    svcadm enable gdm && exit

or

    svcadm enable gdm; exit

I did this remotely, at least the most recent time, to capture the output for this blog. I did not notice any effects when I had done this the first time on a different system; however, I might have rebooted at that point anyway.

Thanks again to Keith for his tips!


Solaris

New privilege added to the 'basic' Least Privilege set

Oracle Solaris 10 9/10 (update 9) has added another privilege to the basic set of privileges, the set that all unprivileged (non-root) users have by default.

With Least Privileges, a non-root process by default has the ability to get process information, create and delete files, fork and exec, and now separately open TCP or UDP end points. The ppriv(1) command prints the list of privileges.

    Solaris 10 9/10# ppriv -l basic
    file_link_any
    proc_exec
    proc_fork
    proc_info
    proc_session
    net_access

A verbose listing includes basic descriptions, which are also described in privileges(5).

    Solaris 10 9/10# ppriv -lv basic
    file_link_any
            Allows a process to create hardlinks to files owned by a uid
            different from the process' effective uid.
    proc_exec
            Allows a process to call execve().
    proc_fork
            Allows a process to call fork1()/forkall()/vfork()
    proc_info
            Allows a process to examine the status of processes other
            than those it can send signals to. Processes which cannot
            be examined cannot be seen in /proc and appear not to exist.
    proc_session
            Allows a process to send signals or trace processes outside its
            session.
    net_access
            Allows a process to open a TCP or UDP network endpoint.

With the addition of the net_access privilege, it is now possible to prevent a process from creating sockets and network end points, isolating the process from the network. By default, processes have this privilege, so the only action to take is to remove it.

To demonstrate this I am using the ppriv command to limit the privileges of a command and using the debug flag to see what is happening.

Even as an unprivileged user I can see if a specific IP address is in use with the ping command. So let's see what happens when I don't have the net_access privilege. I am doing this as a basic user.

    Solaris 10 9/10$ ppriv -D -s I-net_access -e /usr/sbin/ping 172.16.1.1
    ping[14942]: missing privilege "net_access" (euid = 1001, syscall = 5) for "devpolicy" needed at spec_open+0xd0
    ping[14942]: missing privilege "net_access" (euid = 1001, syscall = 5) for "devpolicy" needed at spec_open+0xd0
    ping[14942]: missing privilege "net_access" (euid = 1001, syscall = 5) for "devpolicy" needed at spec_open+0xd0
    /usr/sbin/ping: unknown host 172.16.1.1

Since I am forking a process with the -e option, I limit the I (inheritable) privilege set by removing net_access from it. The debug output shows that it's net_access that is missing, and it happens three times.

To see how it would look with the privilege, I run the same command with the basic set inherited.

    Solaris 10 9/10$ ppriv -D -s I=basic -e /usr/sbin/ping 172.16.1.1
    172.16.1.1 is alive

Everything worked, and there is no debug output.

It's a good idea to use predefined sets such as basic, so that changes in the set don't affect scripts in the future.

Steffen


Solaris Networking

TCP Fusion and improved loopback traffic

In the past, when two processes were communicating using TCP on the same system, a lot of the TCP and IP protocol processing was performed just as it was for traffic to and from another system. A significant amount of CPU is spent in the protocol layers on both the sending and receiving side to ensure successful, complete, in-order, non-duplicated delivery that is rerouted around network failures. For data that never leaves the system, there is considerable performance benefit in providing a short circuit.

In Solaris 10 6/06 a feature called TCP Fusion was delivered, which removed all the stack processing when both ends of the TCP connection are in the same system, and now with IP Instances, in the same IP Instance (between the global zone and all shared IP zones, or within an exclusive zone). There are some exceptions to this, including when using IPsec, IPQoS, raw-socket, kernel SSL, non-simple TCP/IP conditions, or when the two end points are on different squeues. A fused connection will revert to unfused if an IP Filter rule will drop a packet. However, TCP fusion is done in the general case.

So why do I bring this up? With TCP fusion enabled (which it is by default in Solaris 10 6/06 and later, and in OpenSolaris), when a TCP connection is created between processes on a system, the necessary things are set up to transfer data from the sender to the receiver without sending it down and back up the stack. The typical flow control of filling a send buffer (defaults to 48K or the value of tcp_xmit_hiwat, unless changed via a socket operation) still applies. With TCP Fusion on, there is a second check, which is the number of writes to the socket without a read. The reason for the counter is to allow the receiver to get CPU cycles, since the sender and receiver are on the same system and may be sharing one or more CPUs. The default value of this counter is eight (8), as determined by tcp_fusion_rcv_unread_min. The value per TCP connection is calculated as

    MAX(sndbuf >> 14, tcp_fusion_rcv_unread_min);

Some details of the reasoning and implementation are in Change Request 4821256.

When doing large writes, or when the receiver is actively reading, the buffer flow control dominates. However, when doing smaller writes, it is easy for the sender to end up in a condition where the number of consecutive writes without a read is exceeded, and the writer blocks, or if using non-blocking I/O, will get an EAGAIN error.

The latter was the case at a customer of mine. An ISV application was reporting EAGAIN errors on a new installation, something that hadn't been seen before. More importantly, the ISV was also not seeing it elsewhere or in their test environment.

After some investigation using DTrace, including reproduction on a slightly different system configuration, it became clear that the sending application was getting the error after a burst of writes. The application has both local and remote (on other systems) receivers, and the EAGAIN errors were only happening on the local connection.

I also saw that the application was repeatedly doing a pair of writes, one of 12 bytes and the second of 696 bytes. Thus it would be easy to hit the consecutive write counter before the write buffer is ever filled.

To test this I suggested the customer change tcp_fusion_rcv_unread_min on their running system using mdb(1). I suggested they increase the counter by a factor of four (4), just to be safe.

    # echo "tcp_fusion_rcv_unread_min/W 32" | mdb -kw
    tcp_fusion_rcv_unread_min:      0x8             =       0x20

Here is how you check what the current value is.

    # echo "tcp_fusion_rcv_unread_min/D" | mdb -k
    tcp_fusion_rcv_unread_min:
    tcp_fusion_rcv_unread_min:      32

After running several hours of tests, the EAGAIN error did not return.

Since then I have suggested they set tcp_fusion_rcv_unread_min to 0, to turn the check off completely. This will allow the buffer size and total outstanding write data volume to determine whether the sender is blocked, as it is for remote connections. Since the mdb change is only good until the next reboot, I suggested the customer change the setting in /etc/system.

    * Set TCP fusion to allow unlimited outstanding writes up to the TCP send buffer set by default or the application.
    * The default value is 8.
    set ip:tcp_fusion_rcv_unread_min=0

To turn TCP Fusion off altogether, something I have not tested, the variable do_tcp_fusion can be set from its default 1 to 0.

I hope this helps someone who might be trying to understand why errors, or maybe less than expected throughput, are being seen on local connections.

And I would like to note that in OpenSolaris only the do_tcp_fusion setting is available. With the delivery of CR 6826274, the consecutive write counting has been removed. The TCP Fusion code has also been moved into its own file.

Thanks to Jim Eggers, Jim Fiori, Jim Mauro, Anders Parsson, and Neil Putnam for their help as I was tracking all this stuff down!

Steffen

PS. After publishing, I wrote this DTrace script to show what the per connection outstanding write counter tcp_fuse_rcv_unread_hiwater is set to.

    # more tcp-fuse.d
    #!/usr/sbin/dtrace -qs

    fbt:ip:tcp_fuse_maxpsz_set:entry
    {
            self->tcp = (tcp_t *) arg0;
    }

    fbt:ip:tcp_fuse_maxpsz_set:return
    /self->tcp > 0/
    {
            this->peer = (tcp_t *) self->tcp->tcp_loopback_peer;
            this->hiwat = this->peer->tcp_fuse_rcv_unread_hiwater;

            printf("pid: %d tcp_fuse_rcv_unread_hiwater: %d \n", pid, this->hiwat);

            self->tcp = 0;
            this->peer = 0;
            this->hiwat = 0;
    }
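To make the numbers in this post concrete, here is a small worked example of the quoted formula in shell arithmetic. It is a sketch under stated assumptions: fusion_unread_limit is a hypothetical helper name, it models only the MAX(sndbuf >> 14, tcp_fusion_rcv_unread_min) expression quoted above, and it does not model how the kernel treats a setting of 0.

```shell
# Sketch only: hypothetical helper modelling the formula quoted in the post.
# fusion_unread_limit SNDBUF MIN prints MAX(SNDBUF >> 14, MIN).
fusion_unread_limit() {
    shifted=$(( $1 >> 14 ))
    if [ "$shifted" -gt "$2" ]; then
        echo "$shifted"
    else
        echo "$2"
    fi
}

# Default 48K send buffer with the default minimum of 8:
# 49152 >> 14 is 3, so the minimum of 8 wins.
limit=$(fusion_unread_limit 49152 8)
echo "unread write limit: $limit"

# The application wrote pairs of 12 and 696 bytes. Bytes queued when the
# limit is reached, assuming the receiver has not read anything yet:
pairs=$(( limit / 2 ))
echo "bytes written before blocking: $(( pairs * (12 + 696) ))"
```

With the defaults, the writer reaches the limit after 8 writes, or 2832 bytes, long before the 49152-byte send buffer would fill. That fits the behaviour seen at the customer: small paired writes hit the write counter, not the buffer limit.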


Solaris Networking

Solaris 10 Zones and Networking -- Common Considerations

As often happens, a customer question resulted in this write-up. The customer had to quickly consider how they deploy a large number of zones on an M8000. They would be configuring up to twelve separate links for the different networks, and double that for IPMP. I wrote up the following. Thanks to Penny Cotten, Jim Eggers, Gordon Lythgoe, Peter Memishian, and Erik Nordmark for the feedback as I was preparing this. Also, you may see some of this in future documentation.

Definitions

Datalink: An interface at Layer 2 of the OSI protocol stack, which is represented in a system as a STREAMS DLPI (v2) interface. Such an interface can be plumbed under protocol stacks such as TCP/IP. In the context of Solaris 10 Zones, datalinks are physical interfaces (e.g. e1000g0, bge1), aggregations (aggr3), or VLAN-tagged interfaces (e1000g111000 (VLAN tag 111 on e1000g0), bge111001, aggr111003). A datalink may also be referred to as a physical interface, such as when referring to a Network Interface Card (NIC). The datalink is the 'physical' property configured with the zone configuration tool zonecfg(1M).

Non-global Zone: A non-global zone is any zone, whether native or branded, that is configured, installed, and managed using the zonecfg(1M) and zoneadm(1M) commands in Solaris 10. A branded zone may be either Solaris 8 or Solaris 9.

Zone network configuration: shared versus exclusive IP Instances

Since Solaris 10 8/07, zone configurations can be either in the default shared IP Instance or exclusive IP Instance configuration.

When configured as shared, zone networking includes the following characteristics.

- All datalink and IP, TCP, UDP, SCTP, IPsec, etc. configuration is done in the global zone.
- All zones share the network configuration settings, including datalink, IP, TCP, UDP, etc. This includes ndd(1M) settings.
- All IP addresses, netmasks, and routes are set by the global zone and cannot be altered in a non-global zone.
- Non-global zones cannot utilize DHCP (neither client nor server). There is a work-around that may allow a zone to be a DHCP server.
- By default a privileged user in a non-global zone cannot put a datalink into promiscuous mode, and thus cannot run things like snoop(1M). Changing this requires adding the priv_net_raw privilege to the zone from the global zone, and also requires identifying which interface(s) to allow promiscuous mode on via the 'match' zonecfg parameter. Warning: This allows the non-global zone to send arbitrary packets on those interfaces.
- IPMP configuration is managed in the global zone and applies to all zones using the datalinks in the IPMP group. All non-global zones configured with one datalink can or must use all datalinks in the IPMP group. Non-global zones can use multiple IPMP groups. The zone must be configured with only one datalink from each IPMP group.
- Only default routes apply to the non-global zones, as determined by the IP address(es) assigned to the zone. Non-default static routes are not supported to direct traffic leaving a non-global zone.
- Multiple zones can share a datalink.

When configured as exclusive, zone networking includes the following characteristics.

- All network configuration can be done within the non-global zone (and can also be done indirectly from the global zone, via zlogin(1) or by editing the files in the non-global zone's root file system).
- IP and above configurations cannot be seen directly within the global zone (e.g. running ifconfig(1M) in the global zone will not show the details of a non-global zone).
- The non-global zone's interface(s) can be configured via DHCP, and the zone can be a DHCP server.
- A privileged user in the non-global zone can fully manipulate IP address, netmask, routes, ndd variables, logical interfaces, ARP cache, IPsec policy and keys, IP Filter, etc.
- A privileged user in the non-global zone can put the assigned interface(s) into promiscuous mode (e.g. can run snoop).
- The non-global zone can have unique IPsec properties.
- IPMP must be managed within the non-global zone.
- A datalink can only be used by a single running zone at any one time.
- Commands such as snoop(1M) and dladm(1M) can be used on datalinks in use by running zones.

It is possible to mix shared and exclusive IP zones on a system. All shared zones will be sharing the configuration and run time data (routes, ARP, IPsec) of the global zone. Each exclusive zone will have its own configuration and run time data, which cannot be shared with the global zone or any other exclusive zones.

IP Multipathing (IPMP)

By default, all IPMP configurations are managed in the global zone and affect all non-global zones whose network configuration includes even one datalink (the net->physical property in zonecfg(1M)) in the IPMP group. A zone configured with datalinks that are part of IPMP groups must configure each IP address on only one of the datalinks in the IPMP group. It is not necessary to configure an IP address on each datalink in the group. The global zone's IPMP infrastructure will manage the fail-over and fail-back of datalinks on behalf of all the shared IP non-global zones.

For exclusive IP zones, the IPMP configuration for a zone must be managed from within the non-global zone, either via the configuration files or zlogin(1).

The choice to use probe-based failure detection or link-based failure detection can be made on a per-IPMP group basis, and does not affect whether the zone can be configured as shared or exclusive IP Instance. Care must be taken when selecting test IP addresses, since they will be configured in the global zone and thus may affect routing for either the global or the non-global zones.

Routing and Zones

The normal case for shared-IP zones is that they use the same datalinks and the same IP subnet prefixes as the global zone. In that case the routing in the shared-IP zones is the same as in the global zone. The global zone can use static or dynamic routing to populate its routing table, which will be used by all the shared-IP zones.

In some cases different zones need different IP routing. The best approach to accomplish this is to make those zones exclusive-IP zones. If this is not possible, then one can use some limited support for routing differentiation across shared-IP zones. This limited support only handles static default routes, and only works reliably when the shared-IP zones use disjoint IP subnets.

All routing is managed by the zone that owns the IP Instance. The global zone owns the 'default' IP Instance that all shared IP zones use. Any exclusive IP zone manages the routes for just that zone. Different routing policies, routing daemons, and configurations can be used in each IP Instance.

For shared IP zones, only default static routes are supported. If multiple default routes apply to a non-global zone, care must be taken that all the default routes are able to reach all the destinations that the zone needs to reach. A round robin policy is used when multiple default routes are available and a new route needs to be determined.

The zonecfg(1M) 'defrouter' property can be used to define a default router for a specific shared IP zone. When a zone is started and the parameter is set, a default route on the interface configured for that zone will be created if it does not already exist. As of Solaris 10 10/09, when a zone stops, the default route is not deleted.

Default routes on the same datalink and IP subnet are shared across non-global zones. If a non-global zone is on the same datalink and subnet as the global zone, default route(s) configured for one zone will apply for all other zones on that datalink and IP subnet.

Inter-zone network traffic isolation

There are several ways to restrict network traffic between non-global shared IP zones.

The /dev/ip ndd(1M) parameter 'ip_restrict_interzone_loopback', managed from the global zone, will force traffic out of the system on a datalink if the source and destination zones do not share a datalink. The default configuration for this is to allow inter-zone networking using internal loopback of IP datagrams, with the value of this parameter set to '0'. When the value is set to '1', traffic to an IP address in another zone in the shared IP Instance that is not on the same datalink will be put onto the external network. Whether the destination is reached will depend on the full network configuration of the system and the external network. This applies whether the source and destination IP address are on the same or different IP subnets. This parameter applies to all IP Instances active on the system, including exclusive IP Instance zones. In the case of exclusive IP zones, this will apply only if the zone has more than one datalink configured with IP addresses.

For two zones on the same system to communicate with 'ip_restrict_interzone_loopback' set to '1', the following conditions must be met.

- There is a network path to the destination. If on the same subnet, the switch(es) must allow the connection. If on different subnets, routes must be in place for packets to pass reliably between the two zones.
- The destination address is not on the same datalink (as this would break the datalink rules).
- The destination is not on a datalink in an IPMP group that the sending datalink is also in.

The 'ip_restrict_interzone_loopback' parameter is available in Solaris 10 8/07 and later.

A route(1M) action to prevent traffic between two IP addresses is available. Using the '-reject' flag will generate an ICMP unreachable when this route is attempted. The '-blackhole' flag will silently discard datagrams.

The IP Filter action 'intercept_loopback' will filter traffic between sockets on a system, including traffic between zones and loopback traffic within a zone. Using this action prevents traffic between shared IP zones. It does not force traffic out of the system using a datalink. More information is in the ipf.conf(4) or ipf(4) manual page.

Aggregations

Solaris 10 1/06 and later support IEEE 802.3ad link aggregations using the dladm(1M) datalink administration command. Combining two or more datalinks into an aggregation effectively reduces the number of datalinks available. Thus it is important to consider the trade-offs between aggregations and IPMP when requiring either network availability or increased network bandwidth. Full traffic patterns must be understood as part of the decision making process.

For the 'ce' NIC, Sun Trunking 1.3.1 is available for Solaris 10.

Some considerations when making a decision between link aggregation and IPMP are the following.

- Link aggregation requires support and configuration of aggregations on both ends of the link, i.e. both the system and the switch.
- Most switches only support link aggregation within a switch, not spanning two or more switches.
- Traffic between a single pair of IP addresses will typically only utilize one link in either an aggregation or IPMP group.
- Link aggregation only provides availability between the switch ports and the system. IPMP using probe-based failure detection can redirect traffic around internal switch problems or network issues behind the switches.
- Multiple hashing policies are available, and they can be set differently for inbound and outbound traffic.
- IPMP probe-based failure detection requires test addresses for each datalink in the IPMP group, which are in addition to the application or data address(es).
- IPMP link-based failure detection will cause a fail-over or fail-back based on link state only. Solaris 10 supports IPMP configured in only link-based mode. If IPMP is configured in probe-based failure detection, link failure will also cause fail-over, and a link restore will cause a fail-back.
- A physical interface can be in only one aggregation. VLANs can be configured over an aggregation.
- A datalink can be in only one IPMP group.
- An IPMP group can use aggregations as the underlying datalinks.

Note, this is for Solaris 10. OpenSolaris has differences. Maybe something for another day.

I hope this is helpful!

Steffen


Solaris Networking

My thoughts on configuring zones with shared IP instances and the 'defrouter' parameter

An occasional call or email I receive has questions about routing issues when using Solaris Zones in the (default) shared IP Instance configuration. Everything works well when the non-global zones are on the same IP subnet (let's say 172.16.1.0/24) as the global zone. Routing gets a little tricky when the non-global zones are on a different subnet.

My general recommendation is to isolate. This means:
- Separate subnets for the global zone (administration, backup) and the non-global zones (applications, data).
- Separate data-links for the global and non-global zones.
  - The non-global zones can share a data-link.
  - Non-global zones on different IP subnets use different data-links.

Using separate data-links is not always possible. I was concerned whether sharing a data-link would actually work. So I did some testing, and exchanged some emails because of a comment I made regarding PSARC/2008/057 and the automatic removal of a default route when the zone is halted.

Turns out I have been very restrictive in suggesting that the global and non-global zones not share a data-link. While I think that is a good administrative policy, to separate administrative and application traffic, it is not a requirement. It is OK to have the global zone and one or more non-global zones share the same data-link. However, if the non-global zones are to have different default routes, they must be on subnets that the global zone is not on.

My test case running Solaris 10 10/09 has the global zone on the 129.154.53.0/24 network and the non-global zone on the 172.16.27.0/24 network.
global# ifconfig -a
...
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.132 netmask ffffff00 broadcast 129.154.53.255
        ether 0:14:4f:ac:57:c4
e1000g0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        zone shared1
        inet 172.16.27.27 netmask ffffff00 broadcast 172.16.27.255
global# zonecfg -z shared1 info net
net:
        address: 172.16.27.27/24
        physical: e1000g0
        defrouter: 172.16.27.16

The routing tables as seen from both are:

global# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              129.154.53.215       UG        1        123
default              172.16.27.16         UG        1          7 e1000g0
129.154.53.0         129.154.53.132       U         1         50 e1000g0
224.0.0.0            129.154.53.132       U         1          0 e1000g0
127.0.0.1            127.0.0.1            UH        3         80 lo0

shared1# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              172.16.27.16         UG        1          7 e1000g0
172.16.27.0          172.16.27.27         U         1          3 e1000g0:1
224.0.0.0            172.16.27.27         U         1          0 e1000g0:1
127.0.0.1            127.0.0.1            UH        4         78 lo0:1

While the global zone shows both routes, only the default applying to its subnet will be used. And for traffic leaving the non-global zone, only its default will be used.

You may notice that the Interface for the global zone's default router is blank. That is because I have set the default route via /etc/defaultrouter. I noticed that if it is determined via the route discovery daemon, it will be listed as being on e1000g0! This does not affect the behavior, however it may be visually confusing, which is probably why I initially leaned towards saying to not share the data-link.

There are multiple ways to determine which route might be used, including ping(1M) and traceroute(1M). I like the output of the route get command.

global# route get 172.16.29.1
   route to: 172.16.29.1
destination: default
       mask: default
    gateway: 129.154.53.1
  interface: e1000g0
      flags: <UP,GATEWAY,DONE,STATIC>
 recvpipe  sendpipe  ssthresh    rtt,ms rttvar,ms  hopcount      mtu     expire
       0         0         0         0         0         0      1500         0

shared1# route get 172.16.28.1
   route to: 172.16.28.1
destination: default
       mask: default
    gateway: 172.16.27.16
  interface: e1000g0:1
      flags: <UP,GATEWAY,DONE,STATIC>
 recvpipe  sendpipe  ssthresh    rtt,ms rttvar,ms  hopcount      mtu     expire
       0         0         0         0         0         0      1500         0

This quickly shows which interfaces and IP addresses are being used. If there are multiple default routes, repeated invocations of this will show a rotation in the selection of the default routes.

Thanks to Erik Nordmark and Penny Cotten for their insights on this topic!

Steffen Weiberle
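
The rule of thumb from this test (a zone only uses a default route whose gateway is on one of its own subnets) can be sketched in a few lines of Python. This is only a simplified model of the behavior described above, not how the Solaris IP stack is implemented; the function name usable_default_gateways is made up for illustration, and the addresses are the ones from my test system.

```python
import ipaddress


def usable_default_gateways(zone_addresses, default_gateways):
    """Return the default gateways that are on-link for at least one zone address.

    Simplified model (an assumption, not the kernel algorithm): a default
    route is usable by a zone only if its gateway sits on a subnet where
    the zone has an IP address.
    """
    networks = [ipaddress.ip_interface(addr).network for addr in zone_addresses]
    return [
        gw for gw in default_gateways
        if any(ipaddress.ip_address(gw) in net for net in networks)
    ]


# Both default routes are visible in the global zone's routing table.
defaults = ["129.154.53.215", "172.16.27.16"]

print(usable_default_gateways(["129.154.53.132/24"], defaults))  # global zone
print(usable_default_gateways(["172.16.27.27/24"], defaults))    # zone shared1
```

With the global zone and shared1 on different subnets, each ends up with exactly one usable default route, which is why sharing the data-link causes no trouble in this setup.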


Solaris Networking

VLANs and Aggregations

Every once in a while I see the question asking whether it is possible to use IEEE 802.1q VLANs together with IEEE 802.3ad Link Aggregation. I frequently have to check myself. So in order to better remind me, and share with others, here is a quick demonstration of how to get the two working together.

My test system is running build 05 of the upcoming Solaris 10 10/09 (update 8). The system has four bge interfaces, and I will use numbers 1 and 2. (This should work just as well with previous updates of Solaris 10, and with Sun Trunking in Solaris 9, except for the zones parts. I am using zones just to isolate my traffic generation and easily get it to use a specific data link.)

Starting out, things look like this.

global# dladm show-dev
bge0            link: up        speed: 1000  Mbps       duplex: full
bge1            link: unknown   speed: 0     Mbps       duplex: unknown
bge2            link: unknown   speed: 0     Mbps       duplex: unknown
bge3            link: unknown   speed: 0     Mbps       duplex: unknown
global# dladm show-link
bge0            type: non-vlan  mtu: 1500       device: bge0
bge1            type: non-vlan  mtu: 1500       device: bge1
bge2            type: non-vlan  mtu: 1500       device: bge2
bge3            type: non-vlan  mtu: 1500       device: bge3
global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
        ether 0:3:ba:e3:42:8b

I have my switch set up to aggregate ports 1 and 2, and here is how I do it with Solaris 10.

global# dladm create-aggr -d bge1 -d bge2 1
global# dladm show-link
bge0            type: non-vlan  mtu: 1500       device: bge0
bge1            type: non-vlan  mtu: 1500       device: bge1
bge2            type: non-vlan  mtu: 1500       device: bge2
bge3            type: non-vlan  mtu: 1500       device: bge3
aggr1           type: non-vlan  mtu: 1500       aggregation: key 1

VLAN tagged interfaces are used by accessing the underlying data link with the VLAN tag placed in front of the data link's instance number, where the instance number is padded to three digits. For bge1 and VLAN 111 that would be bge111001. For aggr1 it would be aggr111001.
For this setup I am using zones zone111 and zone112 configured as exclusive IP Instances. The zone configuration looks like this.

global# zonecfg -z zone111 info
zonename: zone111
zonepath: /zones/zone111
brand: native
autoboot: false
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: exclusive
inherit-pkg-dir:
        dir: /lib
inherit-pkg-dir:
        dir: /platform
inherit-pkg-dir:
        dir: /sbin
inherit-pkg-dir:
        dir: /usr
net:
        address not specified
        physical: aggr111001
        defrouter not specified

Once configured, installed, and booted, the network configuration of zone111 is:

global# zlogin zone111 ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
aggr111001: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
        inet 172.16.111.141 netmask ffffff00 broadcast 172.16.111.255
        ether 0:3:ba:e3:42:8c

Turns out that configuring this was easy compared to showing that the link aggregation was really working. While the full list of links known when the zones are running includes the aggregation and the VLANs on the aggregation, tools such as netstat or nicstat would not include them. As it turns out they only report on interfaces that are plumbed up in that IP Instance. It will not be possible to plumb either bge1 or bge2 since they are members of the aggregation.

global# dladm show-link
bge0            type: non-vlan  mtu: 1500       device: bge0
bge1            type: non-vlan  mtu: 1500       device: bge1
bge2            type: non-vlan  mtu: 1500       device: bge2
bge3            type: non-vlan  mtu: 1500       device: bge3
aggr1           type: non-vlan  mtu: 1500       aggregation: key 1
aggr111001      type: vlan 111  mtu: 1500       aggregation: key 1
aggr112001      type: vlan 112  mtu: 1500       aggregation: key 1
global# netstat -i
Name  Mtu  Net/Dest      Address        Ipkts  Ierrs Opkts  Oerrs Collis Queue
lo0   8232 loopback      localhost      98     0     98     0     0      0
bge0  1500 pinebarren    pinebarren     43101  0     7181   0     0      0

So I ended up using kstat(1M) to get the values of the number of outbound packets. I am interested in outbound as that is what Solaris can affect regarding distributing traffic across links in an aggregation--the switch determines that for inbound traffic. This example shows data on instance 2 of the bge interface for kstat value opackets.

global# kstat -m bge -i 2 -s opackets
module: bge                             instance: 2
name:   mac                             class:    net
        opackets                        2542

With kstat I can see that for different connections either bge1 or bge2 has packets going out on it. A good test for me was scp to a remote system. Neither ping nor traceroute caused the necessary hashing to use both links in the aggregation.

Steffen
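
Since I keep having to look up the VLAN naming, here is the arithmetic as a small Python sketch. The function name vlan_link_name is just something I made up for illustration; the rule it encodes is the one used above, where the new instance number is the VLAN ID times 1000 plus the instance number of the underlying link.

```python
def vlan_link_name(driver, instance, vlan_id):
    """Build the Solaris 10 style VLAN data link name, e.g. bge111001.

    The instance number of the VLAN link is vlan_id * 1000 + instance,
    so the underlying instance must fit in three digits.
    """
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    if not 0 <= instance <= 999:
        raise ValueError("instance must be between 0 and 999")
    return f"{driver}{vlan_id * 1000 + instance}"


print(vlan_link_name("bge", 1, 111))   # VLAN 111 on bge1
print(vlan_link_name("aggr", 1, 111))  # VLAN 111 on aggr1
print(vlan_link_name("aggr", 1, 112))  # VLAN 112 on aggr1
```

The three calls produce the names used in this entry: bge111001, aggr111001, and aggr112001.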


Solaris Networking

ssh and friends scp, sftp say "hello crypto!"

Solaris includes the SunSSH toolset (ssh, scp, and sftp) in Solaris 9 and later. Solaris 10 comes with the Solaris Cryptographic Framework that provides an easy mechanism for applications that use PKCS #11, OpenSSL, Java Security Extensions, or the NSS interface to take advantage of cryptographic hardware or software on the system.

Separately, the UltraSPARC® T2 processor in the T-series (CMT) has built-in cryptographic processors (one per core, or typically eight per socket) that accelerate secure one-way hashes, public key session establishment, and private key bulk data transfers. The latter is useful for long standing connections and for larger data operations, such as a file transfer.

Prior to Solaris 10 5/09, an scp or sftp file transfer operation had the encryption and decryption done by the CPU. While usually this is not a big deal, as most CPUs do private key crypto reasonably fast, on the CMT systems these operations are relatively slow. Now with SunSSH With OpenSSL PKCS#11 Engine Support in 5/09, the SunSSH server and client will use the cryptographic framework when an UltraSPARC® T2 processor n2cp cryptographic unit is available.

To demonstrate this, I used a T5120 with Logical Domains (LDoms) 1.1 configured running Solaris 10 5/09. Using LDoms helps, as I can assign or remove crypto units on a per-LDom basis. (Since the crypto units are not supported yet with dynamic reconfiguration, a reboot of the LDom instance is required. However, in general, I don't see making that kind of change very often.)

I did all the work in the 'primary' control and service LDom, where I have direct access to the network devices, and can see the LDom configuration.
I am listing parts of it here, although this is about Solaris, SunSSH, and the crypto hardware.

medford# ldm list-bindings primary
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active     -n-cv-  SP      16    8G       0.1%  22h 16m
MAC
    00:14:4f:ac:57:c4
HOSTID
    0x84ac57c4
VCPU
    VID    PID    UTIL STRAND
    0      0      0.6%   100%
    1      1      1.9%   100%
    2      2      0.0%   100%
    3      3      0.0%   100%
    4      4      0.0%   100%
    5      5      0.1%   100%
    6      6      0.0%   100%
    7      7      0.0%   100%
    8      8      0.7%   100%
    9      9      0.1%   100%
    10     10     0.0%   100%
    11     11     0.0%   100%
    12     12     0.0%   100%
    13     13     0.0%   100%
    14     14     0.0%   100%
    15     15     0.0%   100%
MAU
    ID     CPUSET
    0      (0, 1, 2, 3, 4, 5, 6, 7)
    1      (8, 9, 10, 11, 12, 13, 14, 15)
MEMORY
    RA               PA               SIZE
    0x8000000        0x8000000        8G

The 'system' has 16 CPUs (hardware strands), two MAUs (those are the crypto units), and 8 GB of memory. I am using e1000g0 for the network and the remote system is a V210 running Solaris Express Community Edition snv_113 SPARC (OK, I am a little behind). The network is 1 GbE.

The command I run is

source# /usr/bin/time scp -i /.ssh/destination /large-file destination:/tmp
source# du -h /large-file
 1.3G   /large-file

My results with the crypto units were

real     1:13.6
user       32.2
sys        34.5

while without the crypto units

real     2:28.2
user     2:10.9
sys        26.8

The transfer took one half the time and considerably less CPU processing with the crypto units in place (I have two although I think it is using only one since this is a single transfer).

So, SunSSH benefits from the built-in cryptographic hardware in the UltraSPARC® T2 processor!

Steffen
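
To put numbers on "one half the time", here is a small Python sketch that converts the /usr/bin/time values above to seconds and compares the two runs. The helper names to_seconds and cpu_seconds are made up for this illustration; the timings are the ones I measured.

```python
def to_seconds(value):
    """Convert a /usr/bin/time value such as '1:13.6' or '32.2' to seconds."""
    seconds = 0.0
    for part in value.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def cpu_seconds(run):
    """Total CPU time of a run: user plus sys."""
    return to_seconds(run["user"]) + to_seconds(run["sys"])


# Timings copied from the two scp runs above.
with_crypto = {"real": "1:13.6", "user": "32.2", "sys": "34.5"}
without_crypto = {"real": "2:28.2", "user": "2:10.9", "sys": "26.8"}

speedup = to_seconds(without_crypto["real"]) / to_seconds(with_crypto["real"])
cpu_ratio = cpu_seconds(without_crypto) / cpu_seconds(with_crypto)

print(f"wall-clock speedup: {speedup:.2f}x")
print(f"CPU time ratio:     {cpu_ratio:.2f}x")
```

The wall-clock time comes out at about a factor of two, and the CPU time (user plus sys) at somewhat more than that, mostly because the user time drops so much once the crypto unit does the work.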


Sun

Sun Shared Shell - A Cool Diagnostic Tool

[Updated 2010.10.12 with new URL] As part of helping a customer out recently on an escalation, the SSE on the case suggested using Sun Shared Shell, a tool that allows you to see and optionally control a remote system. It supports SSH and Telnet. This tool was instrumental in increasing my understanding of what was going on with the customer's system, and removed the need to wait for output via emails or just trying to understand things over the phone. The owner of the session, usually the customer, has the option of allowing you to enter commands (without hitting 'Return'), or even allowing the 'Return' as well. It also has logging and chatting capabilities. When first logging in, it allows you to be the owner of the shell and share that with other participants, or to view someone else's shell session. Once logged in, you have a terminal window, the people present on the connection, and a chat window. The icon before the name/email address shows whether you have view, type, or full control (the keyboard will also have a down-arrow with it). Oh, and I forgot about the feature to scribble on the screen. I used that to diagram out an idea I had to solve a zone networking issue, and it helped the others understand what I was proposing a lot quicker! In the spirit of 'asking for what you want instead of complaining about what you don't have', I submitted a few suggestions, and the owner(s) quickly responded with clarifications. I see this as a great tool to help future cases where a shared view of operations will improve understanding or service delivery! Thanks to those who created and maintain it!Steffen


Solaris Networking

What happened to my packets? -- or -- Dual default routes and shared IP zones

I recently received a call from someone who has helped me out a lot on some performance issues (thanks, Jim Fiori), and I was glad to be able to return even a small part of those favors! He had been contacted to help a customer who was ready to deploy a web application, and they were experiencing intermittent lack of connection to the web site. Interestingly, they were also using zones, a bunch of them (OK, a handful)--and so right up my alley.

The customer was running a multi-tiered web application on an x4600 (so Solaris on x86 as well!), with the web server, web router, and application tiers in different zones. They were using shared IP Instances, so all the network configuration was being done in the global zone.

Initially, we had to modify some configuration parameters, especially regarding default routes. Since the system was installed with Solaris 10 5/08 and had more recent patches, we could use the defrouter feature introduced in 10/08 to make setting up routes for the non-global zones a little easier. This was needed because the global zone was using only one NIC, and it was not going to be on the networks that the non-global zones were on.

What made the configuration a little unique was that the web server needs a default router to the Internet, while the application server needs a route to other systems behind a different router. Individually, everything is fine. However, the web1 zone also needs to be on the network that the application and web router are on, so it ends up having two interfaces.

Let's look at web1 when only it is running.
web1# ifconfig -a4
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 172.16.1.41 netmask ffffff00 broadcast 172.16.1.255
bge2:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.51.41 netmask ffffff00 broadcast 192.168.51.255
web1# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              172.16.1.1           UG        1          0 bge1
172.16.1.0           172.16.1.41          U         1          0 bge1:1
192.168.51.0         192.168.51.41        U         1          0 bge2:1
224.0.0.0            172.16.1.41          U         1          0 bge1:1
127.0.0.1            127.0.0.1            UH        5         34 lo0:1

The zone is on two interfaces, bge1 and bge2, and has a default route that uses bge1. However, when zone app1 is running, there is a second default route, on bge2. The same is true if app2 or odr are running. Note that these three zones are only on bge2.

app1# ifconfig -a4
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge2:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.51.43 netmask ffffff00 broadcast 192.168.51.255
app1# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              192.168.51.1         UG        1          0 bge2
192.168.51.0         192.168.51.43        U         1          0 bge2:1
224.0.0.0            192.168.51.43        U         1          0 bge2:1
127.0.0.1            127.0.0.1            UH        3         51 lo0:1

In the meantime, this is what happens in web1.

web1# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              192.168.51.1         UG        1          0 bge2
default              172.16.1.1           UG        1          0 bge1
172.16.1.0           172.16.1.41          U         1          0 bge1:1
192.168.51.0         192.168.51.41        U         1          0 bge2:4
224.0.0.0            172.16.1.41          U         1          0 bge1:1
127.0.0.1            127.0.0.1            UH        6        132 lo0:4

With any of the other zones running, web1 now has two default routes. And it only happens in web1, as it is the only zone with both its public facing data link (bge1) and a shared data link (bge2).

Traffic to any system on either the 192.168.51.0 or 172.16.1.0 network will have no issues. Every time IP needs to determine a new path for a system not on either of those two networks, it will pick a route, and it will round-robin between the two default routes. Thus approximately half the time, connections will fail to establish, or possibly existing connections will not work if they have been idle for a while.

This is how IP is supposed to work, so there is technically nothing wrong. It is a feature of zones and a shared IP Instance. [2009.06.23: For background on why IP works this way, see James' blog]. The only problem is that this is not what the customer wants!

One option would be to force all traffic between the web and application tier out the bge1 interface, putting it on the wire. This may not be desirable for security reasons, and introduces latencies since traffic now goes on the wire.

Another option would be to use exclusive IP Instances for the web servers. For each web zone, and this example only has one, it would require two additional data links (NICs). That would add up. Also, this configuration is targeted to be used with Solaris Cluster's scalable services, and those must be in shared IP Instance zones.

Hummm....as I like to say.
We didn't know about the shared IP Instance restriction of Solaris Cluster, and as the customer was considering how they were going to add additional NICs to all the systems, something slowly developed in my mind. How about creating a shared, dummy network between the web and application tier? They had one spare NIC, and with shared IP it does not even need to be connected to a switch port, since IP will loop all traffic back anyway! The more I thought about it, the more I liked it, and I could not see anything wrong with it. At least not technically as I understood Solaris. Operationally, for the customer, it might be a little awkward. Here is what I was thinking of... With this configuration the web1 zone has a default router only to the Internet and it can reach odr, and if necessary, app1 and app2, directly via the new network. And app1 and app2 only have a single default route to get to the Intranet. The nice thing is that bge3 does not even need to be up. That is visible with ifconfig output, where bge3 is not showing a RUNNING flag, which indicates the port is not connected (or in my case has been disabled on the switch). 
global# ifconfig -a4
...
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8c
bge2: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8d
bge3: flags=1000802<BROADCAST,MULTICAST,IPv4> mtu 1500 index 5
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8e
...

And within web1 there is now only one default route.

web1# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              172.16.1.1           UG        1         17 bge1
172.16.1.0           172.16.1.41          U         1          2 bge1:1
192.168.52.0         192.168.52.41        U         1          2 bge3:1
224.0.0.0            172.16.1.41          U         1          0 bge1:1
127.0.0.1            127.0.0.1            UH        4        120 lo0:1

In the customer's case, multiple systems were being used, so the private networks were connected together so that a web zone on one system could access an odr zone on another. I am showing the simple, single system case since it is so convenient.

If I were using Solaris Express Community Edition (SX-CE) or OpenSolaris 2009.06 Developer Builds, with the Crossbow bits and virtual NICs (VNICs) available, I wouldn't even have needed to use that physical interface. Both are available here.

I hope this trick might help others out in the future.

Steffen
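
The "approximately half the time" failure is easy to see with a toy model. The Python sketch below is only an illustration of the round-robin behavior described in this entry, not of the actual IP implementation: it assumes a strict alternation between the default routes for each new destination, and that only the 172.16.1.1 router can reach the Internet. The function name simulate_new_destinations is hypothetical.

```python
from itertools import cycle


def simulate_new_destinations(default_gateways, working_gateway, attempts):
    """Count how many new off-link destinations get the gateway that works.

    Assumption: IP rotates strictly through the default routes, one per
    new destination, which is a simplification of the real behavior.
    """
    chooser = cycle(default_gateways)
    return sum(1 for _ in range(attempts) if next(chooser) == working_gateway)


# web1 with both default routes present (the failing configuration).
before = simulate_new_destinations(["192.168.51.1", "172.16.1.1"], "172.16.1.1", 10)

# web1 after moving the web/application traffic to the dummy bge3 network.
after = simulate_new_destinations(["172.16.1.1"], "172.16.1.1", 10)

print(f"two default routes: {before} of 10 destinations reachable")
print(f"one default route:  {after} of 10 destinations reachable")
```

With two default routes only every second new destination ends up on the router that can reach it; with the single default route left in web1, all of them do.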


Solaris Networking

Using IPMP with link based failure detection

Solaris has had a feature to increase network availability called IP Multipathing (IPMP). Initially it required a test address on every data link in an IPMP group, where the test addresses were used as the source IP address to probe network elements for path availability. One of the benefits of probe-based failure detection is that it can extend beyond the directly connected link(s), and verify paths through the attached switch(es) to what typically is a router or other redundant element to provide available services.

Having one IP address (whether a public or a private, non routable) per data link and also the separate address(es) for the application(s) turns out to be a lot of addresses to allocate and administer. And since the default of five probes spaced two seconds apart meant a failure would take at least ten (10) seconds to be detected, something more was needed.

So in the Solaris 9 timeframe the ability to also do link based failure detection was delivered. It requires specific NICs whose driver has the ability to notify the system that a link has failed. The Introduction to IPMP in the Solaris 10 Systems Administrators Guide on IP Services lists the NICs that support link state notification.

Solaris 10 supports configuring IPMP with only link based failure detection.

global# more /etc/hostname.bge[12]
::::::::::::::
/etc/hostname.bge1
::::::::::::::
10.1.14.140/26 group ipmp1 up
::::::::::::::
/etc/hostname.bge2
::::::::::::::
group ipmp1 standby up

On system boot, there will be an indication on the console that since no test addresses are defined, probe-based failure detection is disabled.
Apr 10 10:57:20 in.mpathd[168]: No test address configured on interface bge2; disabling probe-based failure detection on it
Apr 10 10:57:20 in.mpathd[168]: No test address configured on interface bge1; disabling probe-based failure detection on it

Looking at the interfaces configured,

global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 10.1.14.140 netmask ffffffc0 broadcast 10.1.14.191
        groupname ipmp1
        ether 0:3:ba:e3:42:8c
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
bge2: flags=69000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,STANDBY,INACTIVE> mtu 0 index 4
        inet 0.0.0.0 netmask 0
        groupname ipmp1
        ether 0:3:ba:e3:42:8d

you will notice that two of the three interfaces have no address (0.0.0.0). Also, the data address is on a physical interface on bge1. At the same time bge2 has the 0.0.0.0 address.

On the failure of bge1,

Apr 10 14:34:53 global bge: NOTICE: bge1: link down
Apr 10 14:34:53 global in.mpathd[168]: The link has gone down on bge1
Apr 10 14:34:53 global in.mpathd[168]: NIC failure detected on bge1 of group ipmp1
Apr 10 14:34:53 global in.mpathd[168]: Successfully failed over from NIC bge1 to NIC bge2
global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
        ether 0:3:ba:e3:42:8b
bge1: flags=19000802<BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 3
        inet 0.0.0.0 netmask 0
        groupname ipmp1
        ether 0:3:ba:e3:42:8c
bge2: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500 index 4
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp1
        ether 0:3:ba:e3:42:8d
bge2:1: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500 index 4
        inet 10.1.14.140 netmask ffffffc0 broadcast 10.1.14.191

the data address is migrated onto bge2:1. I find this a little confusing. However, I don't know any way around it on Solaris 10. The IPMP Re-architecture makes this a lot easier!

Using Probe-based IPMP with non-global zones

Configuring a shared IP Instance non-global zone and utilizing IPMP managed in the global zone is very easy. The IPMP configuration is very simple.
Interface bge1 is active, and bge2 is in stand-by mode.

global# more /etc/hostname.bge[12]
::::::::::::::
/etc/hostname.bge1
::::::::::::::
group ipmp1 up
::::::::::::::
/etc/hostname.bge2
::::::::::::::
group ipmp1 standby up

My zone configuration is:

global# zonecfg -z zone1 info
zonename: zone1
zonepath: /zones/zone1
brand: native
autoboot: false
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: shared
inherit-pkg-dir:
        dir: /lib
inherit-pkg-dir:
        dir: /platform
inherit-pkg-dir:
        dir: /sbin
inherit-pkg-dir:
        dir: /usr
net:
        address: 10.1.14.141/26
        physical: bge1

Prior to booting, the network configuration is:

global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone zone1
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp1
        ether 0:3:ba:e3:42:8c
bge2: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500 index 4
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp1
        ether 0:3:ba:e3:42:8d

After booting, the network looks like this:

global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone zone1
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp1
        ether 0:3:ba:e3:42:8c
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone zone1
        inet 10.1.14.141 netmask ffffffc0 broadcast 10.1.14.191
bge2: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500 index 4
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp1
        ether 0:3:ba:e3:42:8d

So a simple case for the use of IPMP, without the need for test addresses! Other IPMP configurations, such as more than two data links, or active-active, are also supported with link based failure detection. The more links involved, the more test addresses are saved with link based failure detection.

Since writing this entry I was involved in a customer configuration where this is saving several hundred IP addresses and their management (such as avoiding duplicate addresses). That customer is willing to forgo the benefit of probes testing past the local switch port.

Steffen
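
To get a feel for how quickly the test addresses add up, here is a back-of-the-envelope calculation in Python. The helper names and the fleet size (100 systems, each with a two-link IPMP group and one data address) are made-up numbers for illustration, not the customer's actual configuration; the five probes at two second intervals are the defaults mentioned at the top of this entry.

```python
def ipmp_addresses(data_addresses, links, probe_based):
    """IP addresses one IPMP group needs: data addresses plus one test
    address per data link when probe-based failure detection is used."""
    test_addresses = links if probe_based else 0
    return data_addresses + test_addresses


def probe_detection_seconds(probes=5, interval_seconds=2):
    """Minimum time for probe-based detection to declare a failure."""
    return probes * interval_seconds


systems = 100  # hypothetical number of systems
probe = systems * ipmp_addresses(data_addresses=1, links=2, probe_based=True)
link = systems * ipmp_addresses(data_addresses=1, links=2, probe_based=False)

print(f"probe-based: {probe} addresses, link-based: {link} addresses")
print(f"addresses saved: {probe - link}")
print(f"probe-based detection takes at least {probe_detection_seconds()} seconds")
```

Even with these modest assumptions, link based failure detection saves two addresses per system, which is how a larger deployment gets to several hundred addresses that no longer need to be allocated and tracked.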


Solaris Networking

IPMP Re-architecture is delivered

In the process of working on some zones and IPMP testing, I ran into a little difficulty. After probing for some insight, I was reminded by Peter Memishian that the IPMP Re-Architecture (part of Project Clearview) bits were going to be in Nevada/SXCE build 107, and that I could BFU the latest bits onto an existing Nevada install. Well!!! [For Peter's own perspective of this, see his recent blog.]

Since I was already playing with build 105 because the Crossbow features are now integrated, I decided to apply the IPMP bits to a 105 installation.

[Note: The IPMP Re-architecture is expected to be in Solaris Express Community Edition (SX-CE) build 107 or so (due to be out early Feb 2009), and thus in OpenSolaris 2009.spring (I don't know what its final name will be). Early access to IPS packages for OpenSolaris 2008.11 should appear in the bi-weekly developer repository shortly after SX-CE has the feature included. There is no intention to back port the re-architecture to Solaris 10.]

I am impressed! The bits worked right away, and once I got used to the slightly different way of monitoring IPMP, I really liked what I saw.

Being accustomed to using IPMP on Solaris 10 and with Crossbow beta testing previous Nevada bits, I used the long-standing (Solaris 10 and prior) IPMP configuration style I am used to. For my testing, I am using link failure testing only, so no probe addresses are configured. [For examples of the new configuration format, see the section Using the New IPMP Configuration Style below. 
(15 Feb 2009)]global# cat /etc/hostname.bge1group sharedglobal# cat /etc/hostname.bge2group sharedglobal# cat /etc/hostname.bge3group shared standbyIn my test case bge1 and bge2 are active interfaces, and bge3 is a standby interface.global# ifconfig -a4bge0: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2 inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255 ether 0:3:ba:e3:42:8bbge1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 3 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared ether 0:3:ba:e3:42:8cbge2: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 4 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared ether 0:3:ba:e3:42:8dbge3: flags=261000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY,INACTIVE,CoS> mtu 1500 index 5 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared ether 0:3:ba:e3:42:8eipmp0: flags=8201000842<BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 6 inet 0.0.0.0 netmask 0 groupname sharedYou will notice that all three interfaces are up and part of group shared. What is different from the old IPMP is that automatically another interface was created, with the flag IPMP. This is the interface that will be used for all the data IP addresses. Because I used the old format for the /etc/hostname.\* files, the backward compatibility of the new IPMP automatically created the ipmp0 interface and assigned it a name. If I wish to have control over that name, I must configure IPMP slightly differently. More on that later. The new command ipmpstat(1M) is also introduced to get enhanced information regarding the IPMP configuration. My test is really about using zones and IPMP, so here is what things look like when I bring up three zones that are also configured the traditional way, with network definitions using the bge interfaces. 
[Using the new format, I would replace bge with either ipmp0 (keep in mind that 0 (zero) is set dynamically) or shared. For more details on the new format, go to Using the New IPMP Configuration Style below. (15 Feb 2009)]global# for i in 1 2 3 \^Jdo\^J zonecfg -z shared${i} info net \^Jdonenet: address: 10.1.14.141/26 physical: bge1 defrouter: 10.1.14.129net: address: 10.1.14.142/26 physical: bge1 defrouter: 10.1.14.129net: address: 10.1.14.143/26 physical: bge2 defrouter: 10.1.14.129After booting the zones, note that the zones' IP addresses are on logical interfaces on ipmp0, not the previous way of being logical interfaces on bge.global# ifconfig -a4lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 inet 127.0.0.1 netmask ff000000lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 zone shared1 inet 127.0.0.1 netmask ff000000lo0:2: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 zone shared2 inet 127.0.0.1 netmask ff000000lo0:3: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 zone shared3 inet 127.0.0.1 netmask ff000000bge0: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2 inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255 ether 0:3:ba:e3:42:8bbge1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 3 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared ether 0:3:ba:e3:42:8cbge2: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 4 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared ether 0:3:ba:e3:42:8dbge3: flags=261000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY,INACTIVE,CoS> mtu 1500 index 5 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared ether 0:3:ba:e3:42:8eipmp0: flags=8201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 6 zone shared1 inet 10.1.14.141 netmask ffffffc0 
broadcast 10.1.14.191 groupname sharedipmp0:1: flags=8201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 6 zone shared2 inet 10.1.14.142 netmask ffffffc0 broadcast 10.1.14.191ipmp0:2: flags=8201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 6 zone shared3 inet 10.1.14.143 netmask ffffffc0 broadcast 10.1.14.191For address information, here are the pre and post boot ipmpstat outputs.global# ipmpstat -aADDRESS STATE GROUP INBOUND OUTBOUND0.0.0.0 down ipmp0 -- --global# ipmpstat -aADDRESS STATE GROUP INBOUND OUTBOUND10.1.14.143 up ipmp0 bge1 bge2 bge110.1.14.142 up ipmp0 bge2 bge2 bge110.1.14.141 up ipmp0 bge1 bge2 bge1What's really neat is that it shows which interface(s) are used for outbound traffic. A different interface will be selected for each new remote IP address. That is the level of outbound load spreading at this time.global# ipmpstat -gGROUP GROUPNAME STATE FDT INTERFACESipmp0 shared ok -- bge2 bge1 (bge3)There is no group difference before or after.global# ipmpstat -gGROUP GROUPNAME STATE FDT INTERFACESipmp0 shared ok -- bge2 bge1 (bge3)The FDT column lists the probe-based failure detection time, and is empty since that is disabled in this setup. bge3 is listed third and in parenthesis since that interface is not being used for data traffic at this time.global# ipmpstat -iINTERFACE ACTIVE GROUP FLAGS LINK PROBE STATEbge3 no ipmp0 is----- up disabled okbge2 yes ipmp0 ------- up disabled okbge1 yes ipmp0 --mb--- up disabled okAlso, there are no differences for interface status. In both cases bge1 is used from multicast and broadcast traffic, and bge3 is inactive and in standby mode.global# ipmpstat -iINTERFACE ACTIVE GROUP FLAGS LINK PROBE STATEbge3 no ipmp0 is----- up disabled okbge2 yes ipmp0 ------- up disabled okbge1 yes ipmp0 --mb--- up disabled okThe probe and target output is uninteresting in this setup as I don't have probe based failure detection on. 
I am including them for completeness.

global# ipmpstat -p
ipmpstat: probe-based failure detection is disabled
global# ipmpstat -t
INTERFACE   MODE      TESTADDR   TARGETS
bge3        disabled  --         --
bge2        disabled  --         --
bge1        disabled  --         --

So let's see what happens on a link 'failure' as I turn off the switch port going to bge1. On the console, the indication is a link failure.

Jan 15 14:49:07 global in.mpathd[210]: The link has gone down on bge1
Jan 15 14:49:07 global in.mpathd[210]: IP interface failure detected on bge1 of group shared

The various ipmpstat outputs reflect the failure of bge1 and the failover to bge3, which had been in standby mode, and to bge2. I had expected both IP addresses to end up on bge3. Instead, IPMP determines how to best spread the IPs across the available interfaces. The address output shows that .143 is now on bge3 and .141 is now on bge2.

global# ipmpstat -a
ADDRESS      STATE  GROUP  INBOUND  OUTBOUND
10.1.14.143  up     ipmp0  bge3     bge3 bge2
10.1.14.142  up     ipmp0  bge2     bge3 bge2
10.1.14.141  up     ipmp0  bge2     bge3 bge2

The group status has changed, with bge1 now shown in brackets as it is in failed mode.

global# ipmpstat -g
GROUP  GROUPNAME  STATE     FDT  INTERFACES
ipmp0  shared     degraded  --   bge3 bge2 [bge1]

The interface status makes it clear that bge1 is down. Broadcast and multicast are now handled by bge2.

global# ipmpstat -i
INTERFACE  ACTIVE  GROUP  FLAGS    LINK  PROBE     STATE
bge3       yes     ipmp0  -s-----  up    disabled  ok
bge2       yes     ipmp0  --mb---  up    disabled  ok
bge1       no      ipmp0  -------  down  disabled  failed

As expected, the only difference in the ifconfig output is for bge1, showing that it is in failed state. The zones continue to be shown using the ipmp0 interface. This took me a little bit of getting used to. Before, ifconfig was sufficient to fully see what the state is. Now, I must use ipmpstat as well.
global# ifconfig -a4...bge1: flags=211000803<UP,BROADCAST,MULTICAST,IPv4,FAILED,CoS> mtu 1500 index 3 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared ether 0:3:ba:e3:42:8cbge2: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 4 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared ether 0:3:ba:e3:42:8dbge3: flags=221000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY,CoS> mtu 1500 index 5 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared ether 0:3:ba:e3:42:8eipmp0: flags=8201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 6 zone shared1 inet 10.1.14.141 netmask ffffffc0 broadcast 10.1.14.191 groupname sharedipmp0:1: flags=8201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 6 zone shared2 inet 10.1.14.142 netmask ffffffc0 broadcast 10.1.14.191ipmp0:2: flags=8201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 6 zone shared3 inet 10.1.14.143 netmask ffffffc0 broadcast 10.1.14.191"Repairing" the interface, things return to normal.Jan 15 15:13:03 global in.mpathd[210]: The link has come up on bge1Jan 15 15:13:03 global in.mpathd[210]: IP interface repair detected on bge1 of group sharedNote here only one IP address ended up getting moved back to bge1.global# ipmpstat -aADDRESS STATE GROUP INBOUND OUTBOUND10.1.14.143 up ipmp0 bge1 bge2 bge110.1.14.142 up ipmp0 bge2 bge2 bge110.1.14.141 up ipmp0 bge2 bge2 bge1Interface bge3 is back in standby mode.global# ipmpstat -gGROUP GROUPNAME STATE FDT INTERFACESipmp0 shared ok -- bge2 bge1 (bge3)All three interfaces are up, only two are active, and broadcast and multicast stayed on bge2 (no need to change that now).global# ipmpstat -iINTERFACE ACTIVE GROUP FLAGS LINK PROBE STATEbge3 no ipmp0 is----- up disabled okbge2 yes ipmp0 --mb--- up disabled okbge1 yes ipmp0 ------- up disabled okAs a further example of rebalancing of the IP address, here is what happens with four IP addresses spread 
across two interfaces.

global# ipmpstat -a
ADDRESS      STATE  GROUP  INBOUND  OUTBOUND
10.1.14.144  up     ipmp0  bge2     bge2 bge1
10.1.14.143  up     ipmp0  bge1     bge2 bge1
10.1.14.142  up     ipmp0  bge2     bge2 bge1
10.1.14.141  up     ipmp0  bge1     bge2 bge1

Jan 15 16:19:09 global in.mpathd[210]: The link has gone down on bge1
Jan 15 16:19:09 global in.mpathd[210]: IP interface failure detected on bge1 of group shared

global# ipmpstat -a
ADDRESS      STATE  GROUP  INBOUND  OUTBOUND
10.1.14.144  up     ipmp0  bge2     bge3 bge2
10.1.14.143  up     ipmp0  bge3     bge3 bge2
10.1.14.142  up     ipmp0  bge2     bge3 bge2
10.1.14.141  up     ipmp0  bge3     bge3 bge2

Jan 15 18:11:35 global in.mpathd[210]: The link has come up on bge1
Jan 15 18:11:35 global in.mpathd[210]: IP interface repair detected on bge1 of group shared

global# ipmpstat -a
ADDRESS      STATE  GROUP  INBOUND  OUTBOUND
10.1.14.144  up     ipmp0  bge2     bge2 bge1
10.1.14.143  up     ipmp0  bge1     bge2 bge1
10.1.14.142  up     ipmp0  bge2     bge2 bge1
10.1.14.141  up     ipmp0  bge1     bge2 bge1

In each case the IP addresses are spread evenly across the two active interfaces.

Using the New IPMP Configuration Style

In the previous examples, I used the old style of configuring IPMP with the /etc/hostname.xyzN files. Those files should work on all older versions of Solaris as well as with the re-architecture bits. This section briefly covers the new format.

The new format introduces a hostname.ipmp-group configuration file. The group name must follow the same format as any other data link name: ASCII characters followed by a number. I will use the same group name as above; however, I have to add a number to the end, so the group name will be shared0. If you don't have the trailing number, the old style of IPMP setup will be used.

I create a file to define the IPMP group. Note that it contains only the keyword ipmp.

global# cat /etc/hostname.shared0
ipmp

The other files for the NICs reference the IPMP group name.
global# cat /etc/hostname.bge1group shared0 upglobal# cat /etc/hostname.bge2group shared0 upglobal# cat /etc/hostname.bge3group shared0 standby upOne note that may not be obvious. I am not using the keyword -failover as I am not using test addresses. Thus the interfaces are also not listed as deprecated in the ifconfig output. global# ifconfig -a4lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 inet 127.0.0.1 netmask ff000000shared0: flags=8201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 2 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared0bge0: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 3 inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255 ether 0:3:ba:e3:42:8bbge1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 4 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared0 ether 0:3:ba:e3:42:8cbge2: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 5 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared0 ether 0:3:ba:e3:42:8dbge3: flags=261000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY,INACTIVE,CoS> mtu 1500 index 6 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared0 ether 0:3:ba:e3:42:8eAfter booting the zones, which are still configured to use bge1 or bge2, things look like this.global# ifconfig -a4lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 inet 127.0.0.1 netmask ff000000lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 zone shared1 inet 127.0.0.1 netmask ff000000lo0:2: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 zone shared2 inet 127.0.0.1 netmask ff000000lo0:3: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 zone shared3 inet 127.0.0.1 netmask ff000000shared0: 
flags=8201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 2 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared0shared0:1: flags=8201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 2 zone shared1 inet 10.1.14.141 netmask ffffffc0 broadcast 10.1.14.191shared0:2: flags=8201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 2 zone shared2 inet 10.1.14.142 netmask ffffffc0 broadcast 10.1.14.191shared0:3: flags=8201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 2 zone shared3 inet 10.1.14.143 netmask ffffffc0 broadcast 10.1.14.191bge0: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 3 inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255 ether 0:3:ba:e3:42:8bbge1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 4 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared0 ether 0:3:ba:e3:42:8cbge2: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 5 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared0 ether 0:3:ba:e3:42:8dbge3: flags=261000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY,INACTIVE,CoS> mtu 1500 index 6 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared0 ether 0:3:ba:e3:42:8eglobal# ipmpstat -aADDRESS STATE GROUP INBOUND OUTBOUND10.1.14.143 up shared0 bge1 bge2 bge110.1.14.142 up shared0 bge2 bge2 bge110.1.14.141 up shared0 bge1 bge2 bge10.0.0.0 up shared0 -- --global# ipmpstat -gGROUP GROUPNAME STATE FDT INTERFACESshared0 shared0 ok -- bge2 bge1 (bge3)global# ipmpstat -iINTERFACE ACTIVE GROUP FLAGS LINK PROBE STATEbge3 no shared0 is----- up disabled okbge2 yes shared0 ------- up disabled okbge1 yes shared0 --mb--- up disabled okThings are the same as before, except that the I now have specified the IPMP group name (shared0 instead of the previous ipmp0). 
I find this very useful as the name can help identify the purpose, and when debugging, IPMP group names that use context-appropriate text should be very helpful.

I find the integration, or rather the backward compatibility, great. Not only will the old or existing IPMP setup work, the existing zonecfg network setup works as well. This means the same configuration files will work pre- and post-re-architecture!

Let's take a look at how things look within a zone.

shared1# ifconfig -a4
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
shared0:1: flags=8201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 2
        inet 10.1.14.141 netmask ffffffc0 broadcast 10.1.14.191
shared1# netstat -rnf inet

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              10.1.14.129          UG        1          2 shared0
10.1.14.128          10.1.14.141          U         1          0 shared0:1
127.0.0.1            127.0.0.1            UH        1         33 lo0:1

The zone's network is on the link shared0 using a logical IP, and everything else looks as it has always looked. This output is actually while bge1 is down. IPMP hides all the details in the non-global zone.

Using Probe-based Failover

The configurations so far have been with link-based failure detection. IPMP also has the ability to do probe-based failure detection, where ICMP packets are sent to other nodes on the network. This allows for failure detection well beyond what link-based detection can do, including the whole switch, and items past it up to and including routers. In order to use probe-based failure detection, test addresses are required on the physical NICs. For my configuration, I use test addresses on a completely different subnet, and my router is another system running Solaris 10. The router happens to be a zone with two NICs, configured as an exclusive IP Instance.
I am using a completely different subnet as I want to isolate the global zone from the non-global zones, and the setup is also using the defrouter zonecfg option, and I don't want to interfere with that setup. The IPMP setup is as follows. I have added test addresses on the 172.16.10.0/24 subnet, and the interfaces are set to not fail over. global# cat /etc/hostname.shared0ipmpglobal# cat /etc/hostname.bge1172.16.10.141/24 group shared0 -failover upglobal# cat /etc/hostname.bge2172.16.10.142/24 group shared0 -failover upglobal# cat /etc/hostname.bge3172.16.10.143/24 group shared0 -failover standby upThis is the state of the system before bringing up any zones.global# ifconfig -a4lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 inet 127.0.0.1 netmask ff000000shared0: flags=8201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS,IPMP> mtu 1500 index 2 inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255 groupname shared0bge0: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 3 inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255 ether 0:3:ba:e3:42:8bbge1: flags=209040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,CoS> mtu 1500 index 4 inet 172.16.10.141 netmask ffffff00 broadcast 172.16.10.255 groupname shared0 ether 0:3:ba:e3:42:8cbge2: flags=209040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,CoS> mtu 1500 index 5 inet 172.16.10.142 netmask ffffff00 broadcast 172.16.10.255 groupname shared0 ether 0:3:ba:e3:42:8dbge3: flags=269040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,STANDBY,INACTIVE,CoS> mtu 1500 index 6 inet 172.16.10.143 netmask ffffff00 broadcast 172.16.10.255 groupname shared0 ether 0:3:ba:e3:42:8eThe ipmpstat output is different now.global# ipmpstat -aADDRESS STATE GROUP INBOUND OUTBOUND0.0.0.0 up shared0 -- --global# ipmpstat -gGROUP GROUPNAME STATE FDT INTERFACESshared0 shared0 ok 10.00s bge2 bge1 (bge3)global# ipmpstat -iINTERFACE ACTIVE GROUP 
FLAGS LINK PROBE STATEbge3 no shared0 is----- up ok okbge2 yes shared0 ------- up ok okbge1 yes shared0 --mb--- up ok okThe Failure Detection Time is now set. And the probe information option lists an ongoing update of the probe results.global# ipmpstat -pTIME INTERFACE PROBE NETRTT RTT RTTAVG TARGET0.14s bge3 426 0.48ms 0.56ms 0.68ms 172.16.10.160.24s bge2 426 0.50ms 0.98ms 0.74ms 172.16.10.160.26s bge1 424 0.42ms 0.71ms 1.72ms 172.16.10.161.38s bge1 425 0.42ms 0.50ms 1.57ms 172.16.10.161.79s bge2 427 0.54ms 0.86ms 0.76ms 172.16.10.161.93s bge3 427 0.45ms 0.53ms 0.66ms 172.16.10.162.79s bge1 426 0.38ms 0.56ms 1.44ms 172.16.10.162.85s bge2 428 0.34ms 0.41ms 0.71ms 172.16.10.163.15s bge3 428 0.44ms 4.55ms 1.14ms 172.16.10.16\^CThe target information option shows the current probe targets.global# ipmpstat -tINTERFACE MODE TESTADDR TARGETSbge3 multicast 172.16.10.143 172.16.10.16bge2 multicast 172.16.10.142 172.16.10.16bge1 multicast 172.16.10.141 172.16.10.16Once the zones are up and running and bge1 is down, the status output changes accordingly.global# ipmpstat -aADDRESS STATE GROUP INBOUND OUTBOUND10.1.14.143 up shared0 bge2 bge3 bge210.1.14.142 up shared0 bge3 bge3 bge210.1.14.141 up shared0 bge2 bge3 bge20.0.0.0 up shared0 -- --global# ipmpstat -gGROUP GROUPNAME STATE FDT INTERFACESshared0 shared0 degraded 10.00s bge3 bge2 [bge1]global# ipmpstat -iINTERFACE ACTIVE GROUP FLAGS LINK PROBE STATEbge3 yes shared0 -s----- up ok okbge2 yes shared0 --mb--- up ok okbge1 no shared0 ------- down failed failedglobal# ipmpstat -pTIME INTERFACE PROBE NETRTT RTT RTTAVG TARGET0.46s bge2 839 0.43ms 0.98ms 1.17ms 172.16.10.161.15s bge3 840 0.32ms 0.37ms 0.65ms 172.16.10.161.48s bge2 840 0.37ms 0.45ms 1.08ms 172.16.10.162.56s bge3 841 0.45ms 0.54ms 0.63ms 172.16.10.163.17s bge2 841 0.40ms 0.51ms 1.01ms 172.16.10.163.93s bge3 842 0.40ms 0.47ms 0.61ms 172.16.10.164.61s bge2 842 0.63ms 0.75ms 0.98ms 172.16.10.165.17s bge3 843 0.38ms 0.46ms 0.59ms 172.16.10.165.72s bge2 843 0.36ms 
0.44ms 0.91ms 172.16.10.16\^Cglobal# ipmpstat -tINTERFACE MODE TESTADDR TARGETSbge3 multicast 172.16.10.143 172.16.10.16bge2 multicast 172.16.10.142 172.16.10.16bge1 multicast 172.16.10.141 172.16.10.16Without showing the details here, the non-global zones continue to function. Bringing all three interfaces down, things look like this. Jan 19 13:51:22 global in.mpathd[61]: The link has gone down on bge2Jan 19 13:51:22 global in.mpathd[61]: IP interface failure detected on bge2 of group shared0Jan 19 13:52:04 global in.mpathd[61]: The link has gone down on bge3Jan 19 13:52:04 global in.mpathd[61]: All IP interfaces in group shared0 are now unusableglobal# ipmpstat -aADDRESS STATE GROUP INBOUND OUTBOUND10.1.14.143 up shared0 -- --10.1.14.142 up shared0 -- --10.1.14.141 up shared0 -- --0.0.0.0 up shared0 -- --global# ipmpstat -gGROUP GROUPNAME STATE FDT INTERFACESshared0 shared0 failed 10.00s [bge3 bge2 bge1]global# ipmpstat -iINTERFACE ACTIVE GROUP FLAGS LINK PROBE STATEbge3 no shared0 -s----- down failed failedbge2 no shared0 ------- down failed failedbge1 no shared0 ------- down failed failedglobal# ipmpstat -p\^Cglobal# ipmpstat -tINTERFACE MODE TESTADDR TARGETSbge3 multicast 172.16.10.143 --bge2 multicast 172.16.10.142 --bge1 multicast 172.16.10.141 --The whole IPMP group shared0 is down, all appropriate ipmpstat output reflects that, and no probes are listed nor probe RTT time reports are updated. An additional scenario might be to have two separate paths, and have something other than a link failure force the failover.
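Since ipmpstat is now needed alongside ifconfig to see the full state, the checks done by eye above could be scripted. What follows is only a minimal sketch, not part of the IPMP bits: the function names are hypothetical, and the field positions assume the default column layout shown in the ipmpstat -i, -g and -a outputs in this entry. The helpers read ipmpstat output on standard input, so they can also be tried against saved output.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative only). Each one reads the
# human-readable output of an ipmpstat subcommand on stdin and assumes the
# default columns shown in this entry.

# `ipmpstat -i`: INTERFACE ACTIVE GROUP FLAGS LINK PROBE STATE
# Print every interface whose STATE (7th field) is not "ok".
ipmp_bad_interfaces() {
    awk 'NR > 1 && $7 != "ok" {
        print $1 " (group " $3 "): link=" $5 " state=" $7
    }'
}

# `ipmpstat -g`: GROUP GROUPNAME STATE FDT INTERFACES
# Print every group whose STATE (3rd field) is not "ok".
ipmp_bad_groups() {
    awk 'NR > 1 && $3 != "ok" { print $1 ": " $3 }'
}

# `ipmpstat -a`: ADDRESS STATE GROUP INBOUND OUTBOUND
# Count data addresses per inbound interface (4th field), skipping
# addresses that currently have no inbound interface ("--").
ipmp_inbound_spread() {
    awk 'NR > 1 && $4 != "--" { n[$4]++ }
         END { for (i in n) print i, n[i] }' | sort
}
```

On a live system the helpers would be used as `ipmpstat -i | ipmp_bad_interfaces`, `ipmpstat -g | ipmp_bad_groups` and `ipmpstat -a | ipmp_inbound_spread`. Fed the link-failure output shown earlier, the first one prints `bge1 (group ipmp0): link=down state=failed`.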


Solaris Networking

Using zonecfg defrouter with shared-IP zones

[Update to IPMP testing 2009.01.20] [Minor update 2009.01.14]

When running Solaris Zones in a shared-IP configuration, all network configuration is determined by how the zone is configured using zonecfg(1M) or by what the global zone's IP determines things should be (such as routes). This has caused some trouble in situations where zones are on different subnets, and especially if the global zone is not on the subnet(s) the non-global zones are on. While exclusive IP Instances were delivered to help address these cases, using exclusive IP Instances requires a data link per zone, and if running a large number of zones there may not be enough data links available.

With Solaris 10 10/08 (Update 6), an additional network configuration parameter is available for shared-IP zones. This is the optional default router (defrouter) parameter. Using the defrouter parameter, it is possible to set which router to use for traffic leaving the zone. In the global zone, default router entries are created the first time the zone is booted. Note that the entries are not deleted when the zone is halted.

The defrouter property looks like this for a zone with it configured.

global# zonecfg -z shared1 info net
net:
        address: 10.1.14.141/26
        physical: bge1
        defrouter: 10.1.14.129

And it looks like this if it is not set.

global# zonecfg -z shared1 info net
net:
        address: 10.1.14.141/26
        physical: bge1
        defrouter not specified

So I have run a variety of configurations, and some things I observed are as follows. (Most of the configurations used a different interface for the global zone (bge0) than for the non-global zones (bge1 and bge2). IPMP is not being used in these configurations. A comment on that at the end.) The [#] markers indicate examples in the outputs that follow.

A default route entry is created for the NIC [1] on which the zone is configured when the zone is booted. [2]
Entries are not deleted when a zone is halted. They persist until manually removed [3] or until the global zone is rebooted.
It is possible to have the same default router configured for multiple zones. [4]
It is possible to have the same default router listed on multiple interfaces. * [5]
It is possible to have multiple default routers on the same interface, even on different IP subnets. [6]
The interface used for outbound traffic is the one the zone is assigned to. [7]
It is sufficient to plumb the interface for the non-global zones in the global zone (thus it has 0.0.0.0 as its IP address in the global zone). [8]
The physical interface can be down in the global zone. [9]
If only one interface is used, and different subnets for the global and non-global zones are configured, routing works when setting defrouter [10] and does not work if it is not set.

The most interesting thing I noticed was that although two non-global zones may be on the same IP subnet, if they are configured on different interfaces, the traffic leaves the system on the interface that the zone is configured to be on. This is typically not the case when using shared IP and also having an IP address for the subnet in the global zone.

* Note: Having two interfaces on the same IP subnet without configuring IP Multipathing (IPMP) may not be a supported configuration. I am looking for documentation that states this one way or another. [2009.01.14]

Examples

1.
Single Zone, Single Interface--The Basics Create a single non-global zone.global# netstat -rnRouting Table: IPv4 Destination Gateway Flags Ref Use Interface-------------------- -------------------- ----- ----- ---------- ---------default 139.164.63.215 UG 1 2 bge0139.164.63.0 139.164.63.125 U 1 1 bge0224.0.0.0 139.164.63.125 U 1 0 bge0127.0.0.1 127.0.0.1 UH 1 42 lo0global# zonecfg -z shared1 info netnet: address: 10.1.14.141/26 physical: bge1 defrouter: 10.1.14.129global# zoneadm -z shared1 boot [2]global# netstat -rnRouting Table: IPv4 Destination Gateway Flags Ref Use Interface-------------------- -------------------- ----- ----- ---------- ---------default 139.164.63.215 UG 1 2 bge0default 10.1.14.129 UG 1 0 bge1 [1]139.164.63.0 139.164.63.125 U 1 1 bge0224.0.0.0 139.164.63.125 U 1 0 bge0127.0.0.1 127.0.0.1 UH 1 42 lo0global# zoneadm -z shared1 haltglobal# zoneadm list -v ID NAME STATUS PATH BRAND IP 0 global running / native sharedglobal# netstat -rnRouting Table: IPv4 Destination Gateway Flags Ref Use Interface-------------------- -------------------- ----- ----- ---------- ---------default 10.1.14.129 UG 1 0 bge1default 139.164.63.215 UG 1 1 bge0139.164.63.0 139.164.63.125 U 1 1 bge0224.0.0.0 139.164.63.125 U 1 0 bge0127.0.0.1 127.0.0.1 UH 1 42 lo0global# route delete default 10.1.14.129 [3]delete net default: gateway 10.1.14.129global# netstat -rnRouting Table: IPv4 Destination Gateway Flags Ref Use Interface-------------------- -------------------- ----- ----- ---------- ---------default 139.164.63.215 UG 1 1 bge0139.164.63.0 139.164.63.125 U 1 1 bge0224.0.0.0 139.164.63.125 U 1 0 bge0127.0.0.1 127.0.0.1 UH 1 42 lo02. Multiple Interfaces, Same Default Router Three zones, where two use bge1 and the third uses bge2. 
All use the same default router.global# netstat -rnRouting Table: IPv4 Destination Gateway Flags Ref Use Interface-------------------- -------------------- ----- ----- ---------- ---------default 139.164.63.215 UG 1 1 bge0139.164.63.0 139.164.63.125 U 1 1 bge0224.0.0.0 139.164.63.125 U 1 0 bge0127.0.0.1 127.0.0.1 UH 1 42 lo0global# zonecfg -z shared1 info netnet: address: 10.1.14.141/26 physical: bge1 defrouter: 10.1.14.129 [4]global# zonecfg -z shared2 info netnet: address: 10.1.14.142/26 physical: bge1 defrouter: 10.1.14.129 [4]global# zonecfg -z shared3 info netnet: address: 10.1.14.143/26 physical: bge2 defrouter: 10.1.14.129 [5]global# zoneadm -z shared1 bootglobal# zoneadm -z shared2 bootglobal# zoneadm -z shared3 bootglobal# netstat -rnRouting Table: IPv4 Destination Gateway Flags Ref Use Interface-------------------- -------------------- ----- ----- ---------- ---------default 10.1.14.129 UG 1 0 bge1 [4]default 139.164.63.215 UG 1 1 bge0default 10.1.14.129 UG 1 2 bge2 [5]139.164.63.0 139.164.63.125 U 1 1 bge0224.0.0.0 139.164.63.125 U 1 0 bge0127.0.0.1 127.0.0.1 UH 1 42 lo0global# zoneadm list -v ID NAME STATUS PATH BRAND IP 0 global running / native shared 3 shared1 running /zones/shared1 native shared 4 shared2 running /zones/shared2 native shared 5 shared3 running /zones/shared3 native sharedglobal# ifconfig -a4lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 inet 127.0.0.1 netmask ff000000lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 zone shared1 inet 127.0.0.1 netmask ff000000lo0:2: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 zone shared2 inet 127.0.0.1 netmask ff000000lo0:3: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 zone shared3 inet 127.0.0.1 netmask ff000000bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2 inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255 ether 
0:3:ba:e3:42:8b
bge1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8c
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone shared1
        inet 10.1.14.141 netmask ffffffc0 broadcast 10.1.14.191
bge1:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone shared2
        inet 10.1.14.142 netmask ffffffc0 broadcast 10.1.14.191
bge2: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8d
bge2:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        zone shared3
        inet 10.1.14.143 netmask ffffffc0 broadcast 10.1.14.191

3. Multiple Subnets

Add another zone, this one using bge2 but on a different subnet.

global# zonecfg -z shared4 info net
net:
        address: 192.168.16.144/24
        physical: bge2
        defrouter: 192.168.16.129
global# zoneadm -z shared4 boot
global# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              10.1.14.129          UG        1          0 bge1
default              10.1.14.129          UG        1          4 bge2
default              139.164.63.215       UG        1          3 bge0
default              192.168.16.129       UG        1          0 bge2 [6]
139.164.63.0         139.164.63.125       U         1          4 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1

4. Interface Usage

Issue some pings from within the non-global zones and see which network interfaces are used. From the global zone, I use zlogin to ping a remote system on the same network as the global zone (139.164.63.0) from each zone, and then see which interfaces carry the traffic. [7]

global# zlogin shared1 ping 139.164.63.38
139.164.63.38 is alive
global# zlogin shared2 ping 139.164.63.38
139.164.63.38 is alive
global# zlogin shared3 ping 139.164.63.38
139.164.63.38 is alive
global# zlogin shared4 ping 139.164.63.38
139.164.63.38 is alive

This shows the pings originating from shared1 and shared2 going out on bge1.

global1# snoop -d bge1 icmp
Using device /dev/bge1 (promiscuous mode)
 10.1.14.141 -> 139.164.63.38 ICMP Echo request (ID: 4677 Sequence number: 0)
139.164.63.38 -> 10.1.14.141  ICMP Echo reply (ID: 4677 Sequence number: 0)
 10.1.14.142 -> 139.164.63.38 ICMP Echo request (ID: 4681 Sequence number: 0)
139.164.63.38 -> 10.1.14.142  ICMP Echo reply (ID: 4681 Sequence number: 0)

And this shows the pings originating from shared3 and shared4 going out on bge2.

global2# snoop -d bge2 icmp
Using device /dev/bge2 (promiscuous mode)
 10.1.14.143 -> 139.164.63.38 ICMP Echo request (ID: 4685 Sequence number: 0)
139.164.63.38 -> 10.1.14.143  ICMP Echo reply (ID: 4685 Sequence number: 0)
192.168.16.144 -> 139.164.63.38 ICMP Echo request (ID: 4689 Sequence number: 0)
139.164.63.38 -> 192.168.16.144 ICMP Echo reply (ID: 4689 Sequence number: 0)

Just to confirm where each zone is configured, here is the ifconfig output.

global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared1
        inet 127.0.0.1 netmask ff000000
lo0:2: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared2
        inet 127.0.0.1 netmask ff000000
lo0:3: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared3
        inet 127.0.0.1 netmask ff000000
lo0:4: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared4
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3 [9]
        inet 0.0.0.0 netmask 0 [8]
        ether 0:3:ba:e3:42:8c
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone shared1
        inet 10.1.14.141 netmask ffffffc0 broadcast 10.1.14.191
bge1:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone shared2
        inet 10.1.14.142 netmask ffffffc0 broadcast 10.1.14.191
bge2: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask 0 [8]
        ether 0:3:ba:e3:42:8d
bge2:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        zone shared3
        inet 10.1.14.143 netmask ffffffc0 broadcast 10.1.14.191
bge2:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        zone shared4
        inet 192.168.16.144 netmask ffffff00 broadcast 192.168.16.255

5. Using a Single Interface

This example uses only bge0, with different subnets for the global and non-global zones. [10]

Before booting the zone:

global# netstat -nr
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          2 bge0
139.164.63.0         139.164.63.125       U         1          2 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0
global# zonecfg -z shared17 info net
net:
        address: 192.168.17.147/24
        physical: bge0
        defrouter: 192.168.17.16
global# zoneadm -z shared17 boot

Once the zone is booted, netstat shows both default routes, and a ping from the zone works.

global# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          2 bge0
default              192.168.17.16        UG        1          0 bge0
139.164.63.0         139.164.63.125       U         1          2 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0
global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared17
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255
        ether 0:3:ba:e3:42:8b
bge0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        zone shared17
        inet 192.168.17.147 netmask ffffff00 broadcast 192.168.17.255
global# zlogin shared17 ping 139.164.63.38
139.164.63.38 is alive

IP Multipathing (IPMP)

I did some testing with IPMP, using examples similar to those above. At this time the combination of IPMP and the defrouter configuration does not work. I have filed bug 6792116 to have this looked at.

[Updated 2009.01.20] After some additional testing, especially with test addresses and probe-based failure detection, I have seen IPMP work well only when zones are configured such that at least one zone is on each NIC in an IPMP group, including a standby NIC. For example, if you have two NICs, bge1 and bge2, at least one zone must be configured on bge1 and at least one on bge2. This is the case even when one of the NICs is in failed mode when the system or zone(s) boot. It turns out that the default route is added when the zone boots, and there is no later check for default route requirements as a zone is moved from one NIC to another by IPMP failover or failback. Thus, I would recommend not using defrouter and IPMP together until the combination is confirmed to work. If this is important for your deployments, please add a service record to change request 6792116 and work with your service provider to have this addressed. Please also note that this works well with the IPMP Re-architecture coming soon to OpenSolaris.
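Returning to the shared-IP examples above, the net resources shown by zonecfg info could be produced from a small command file. The following is a minimal sketch, assuming a release whose zonecfg supports the defrouter property; gen_shared_net is a hypothetical helper name, and the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# gen_shared_net is a hypothetical helper, not a Solaris command.
# It prints a zonecfg command file for one shared-IP net resource;
# nothing is applied to the system.
gen_shared_net() {
    addr=$1
    nic=$2
    router=$3
    printf 'add net\n'
    printf 'set address=%s\n' "$addr"
    printf 'set physical=%s\n' "$nic"
    printf 'set defrouter=%s\n' "$router"
    printf 'end\n'
}

# Values taken from the shared4 example above.
gen_shared_net 192.168.16.144/24 bge2 192.168.16.129
```

The output could be saved to a file and applied with zonecfg -z shared4 -f followed by the file name. Keeping the generation step separate makes it easy to check the address, NIC, and default router for each zone before anything changes.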

[Update to IPMP testing 2009.01.20] [Minor update 2009.01.14] When running Solaris Zones in a shared-IP configuration, all network configurations are determined by how the zone is configured using...

Solaris Networking

Crossbow is delivered--Traveling VNICs and more

With Solaris Express Community Edition build 105, the initial implementation of Network Virtualization and Resource Control, known as Project Crossbow, is delivered into the main networking code base and available in the distributed images. No need to install additional software! The multi-year effort has reached a major milestone.

The feature I have been waiting for the most is virtual NICs (VNICs). They allow me to create multiple data links using a single physical network interface, such as on my laptop. Each data link can be assigned to a different zone, and with exclusive IP Instance zones, each zone can have separate IP management and characteristics. The most useful one for me is to have one zone working on the native local network, and another zone with IPsec enabled, for a VPN connection.

Previously, I have demonstrated how to do this with two NICs and with one NIC and VNICs. I also have an example of how to achieve this with VLANs. Now that Crossbow is integrated, things are much simpler!

Some Specifics

The first thing I did was create a VNIC. Note that the dladm(1M) commands have changed slightly, both in general and for VNICs. To see what physical NICs are available, use dladm show-phys (the option used to be show-dev). On my laptop it looks like this.

global# dladm show-phys
LINK         MEDIA         STATE      SPEED  DUPLEX    DEVICE
ath0         WiFi          down       0      unknown   ath0
bge0         Ethernet      up         1000   full      bge0

Data links are the entities that can be assigned to a zone, so let's see those.

global# dladm show-link
LINK         CLASS    MTU    STATE    OVER
ath0         phys     1500   down     --
bge0         phys     1500   up       --

Now I create a VNIC.

global# dladm create-vnic -l bge0 vpn0
global# dladm show-link
LINK         CLASS    MTU    STATE    OVER
ath0         phys     1500   down     --
bge0         phys     1500   up       --
vpn0         vnic     1500   up       bge0

I used the basic create-vnic format, specifying only the link over which to create the VNIC. I let Solaris determine the MAC address, and I did not assign any other properties to the VNIC. The name for a data link must start with characters and end with a number. Thus I chose vpn0 to make it clear to me what I want to use it for. I could have called it vpn123456789, showing that the number part can be quite large.

I now create a zone, and I chose the following configuration.

global# zonecfg -z vpn info
zonename: vpn
zonepath: /zones/vpn
brand: native
autoboot: false
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: exclusive
inherit-pkg-dir:
        dir: /lib
inherit-pkg-dir:
        dir: /platform
inherit-pkg-dir:
        dir: /sbin
inherit-pkg-dir:
        dir: /usr
net:
        address not specified
        physical: vpn0
        defrouter not specified

The key items are ip-type: exclusive and physical: vpn0. The zone is an exclusive IP Instance zone, and I only assigned the vpn0 data link to it. The zone is a sparse zone, and inheriting an extra directory for IPsec to work is no longer required (I was curious whether this had been fixed).

After installing (I made a clone of an existing zone) and before booting the zone, I copied a customized sysidcfg file into the zone.

global# cat /zones/vpn/root/etc/sysidcfg
system_locale=C
terminal=xterm
network_interface=PRIMARY {dhcp protocol_ipv6=no}
nfs4_domain=dynamic
security_policy=NONE
name_service=NONE
timezone=US/Eastern
service_profile=limited_net
timeserver=localhost
root_password=YyDStVVvtZX6.

Upon booting, the zone gets an IP address via DHCP. This will be useful for being on a variety of networks. When using wireless, I won't have to change the zone's configuration. I will, however, have to recreate vpn0 on top of ath0.

Now I can happily be on a public network and the corporate network at the same time. This example has me using the non-global zone to run the VPN. However, depending on my needs at the moment, I could have the global zone be VPNed in, and the non-global zone be on the public network. It is just a matter of where I run the VPN software.

global# ifconfig -a4
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
ath0: flags=201000802 mtu 1500 index 2
        inet 0.0.0.0 netmask 0
        ether 0:b:6b:80:bc:59
bge0: flags=201004843 mtu 1500 index 3
        inet 192.168.15.104 netmask ffffff00 broadcast 192.168.15.255
        ether 0:c0:9f:5b:43:33

vpn# ifconfig -a4
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
vpn0: flags=201004843 mtu 1500 index 2
        inet 192.168.15.105 netmask ffffff00 broadcast 192.168.15.255
        ether 2:8:20:86:53:e3
ip.tun0: flags=10010008d1 mtu 1366 index 3
        inet tunnel src 192.168.15.105 tunnel dst 192.168.101.183
        tunnel security settings --> use 'ipsecconf -ln -i ip.tun0'
        tunnel hop limit 60
        inet 192.168.48.27 --> 192.168.76.43 netmask ffffffff

This demonstrates one of the features of Crossbow. I will now be able to do a lot more with zones, while taking advantage of IP Instances, without needing multiple NICs. This is great for customer demos.

I have not covered items such as the virtual switch that is created, the ability to snoop traffic between zones, or all the resource monitoring and controls that Crossbow offers. More on that elsewhere and in the future.

P.S. Crossbow touches much of the generic LAN driver (GLD) framework, delivers a new MAC interface, and uses improvements in dladm, data link naming (vanity naming from Project Clearview), and more, so it amounts to a lot of code changes. There is a high level of interest in getting the VNIC features into Solaris 10. If you have a strong need for that, please add a Service Record using your support channel to Change Request 6790102.
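The VNIC and zone steps in this post can be collected into a short dry-run script. This is a sketch only: is_valid_linkname and plan_vnic_zone are hypothetical helper names, the name check is a loose reading of the naming rule stated above (it assumes "characters" means a leading letter), and the zone is assumed to exist already. The script prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helpers; nothing here changes the system.

# Loose check of the rule from the post: the name starts with a letter
# and ends with a digit. Characters in between are not validated.
is_valid_linkname() {
    case $1 in
        [A-Za-z]*[0-9]) return 0 ;;
        *)              return 1 ;;
    esac
}

# Print the commands to create a VNIC over a NIC and assign it to an
# existing zone as its only data link.
plan_vnic_zone() {
    nic=$1
    vnic=$2
    zone=$3
    if ! is_valid_linkname "$vnic"; then
        echo "invalid link name: $vnic" >&2
        return 1
    fi
    echo "dladm create-vnic -l $nic $vnic"
    echo "zonecfg -z $zone 'set ip-type=exclusive; add net; set physical=$vnic; end'"
}

plan_vnic_zone bge0 vpn0 vpn
```

When moving to wireless, the same function called with ath0 instead of bge0 would print the command to recreate vpn0 on the other NIC; the existing VNIC would have to be deleted first.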


Sun

How to BFU a System

Sometimes you want to try out a new feature not yet delivered into Solaris Nevada, and you have to apply binaries using BFU. I imagine if you do this all the time, you know all the tricks and gotchas. I don't do it often enough and sometimes get caught up in some details. So here are the steps I tend to use.

First, get the latest BFU package from the ON (OS/Net) Consolidation. I typically only use the SUNWonbld tar file for my hardware.

Download the bits you want to install, such as those for Crossbow Beta or Clearview's snoop on loopback.

To make life a little simpler, I add the following to root's .profile file.

if [ -d /opt/onbld ]
then
    FASTFS=/opt/onbld/bin/`uname -p`/fastfs ; export FASTFS
    BFULD=/opt/onbld/bin/`uname -p`/bfuld ; export BFULD
    GZIPBIN=/usr/bin/gzip ; export GZIPBIN
    PATH=$PATH:/opt/onbld/bin
fi

Now to apply the bits. After unpacking the bits into a temporary location, let's say /tmp/bfu, install the onbld package.

# pkgadd -d onbld all
Processing package instance from
OS-Net Build Tools(sparc) 11.11,REV=2008.03.18.14.39
Copyright 2008 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
...
Installation of was successful.
#

I re-read my .profile, and verify that the necessary BFU variables are set.

# . /.profile
# echo $FASTFS
/opt/onbld/bin/sparc/fastfs

Now apply the BFU (this one is for Crossbow beta). You must use the full pathname!

Note: you may want to do this from the console, in case you lose your network connection.

# bfu `pwd`/nightly-nd
Copying /opt/onbld/bin/bfu to /tmp/bfu.1000
Executing /tmp/bfu.1000 /tmp/bfu/nightly-nd
...
Entering post-bfu protected environment (shell: ksh).
Edit configuration files as necessary, then reboot.
bfu#

Note that you end up in the BFU shell. Now run the automatic conflict resolution check.

bfu# /opt/onbld/bin/acr
Getting ACR information from /tmp/bfu/nightly-nd... ok
updating //platform/sun4v/boot_archive
Finished. See /tmp/acr.nhaqVi/allresults for complete log.
bfu#
bfu# exit
Exiting post-bfu protected environment. To reenter, type:
LD_NOAUXFLTR=1 LD_LIBRARY_PATH=/tmp/bfulib LD_LIBRARY_PATH_64=/tmp/bfulib/64 PATH=/tmp/bfubin /tmp/bfubin/ksh
#

It's time to reboot and run with the new bits!
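Since a missing variable is one of the details that is easy to trip over, the check done by hand above (echo $FASTFS) could be extended to all three variables. A minimal sketch follows; check_bfu_env is a hypothetical helper and not part of the onbld tools.

```shell
#!/bin/sh
# check_bfu_env is a hypothetical helper, not part of SUNWonbld.
# It reports which of the variables from the .profile fragment are
# unset or do not point at an executable file.
check_bfu_env() {
    status=0
    for var in FASTFS BFULD GZIPBIN; do
        eval "val=\${$var:-}"
        if [ -z "$val" ]; then
            echo "$var is not set"
            status=1
        elif [ ! -x "$val" ]; then
            echo "$var=$val is not executable"
            status=1
        fi
    done
    return $status
}

check_bfu_env || echo "fix the environment before running bfu"
```

Running this right after re-reading .profile gives a quick yes or no before starting the BFU itself.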


Solaris Networking

Use Cases for Network Virtualization and Resource Control (Project Crossbow)

Network Virtualization and Resource Control, more often referred to as Project Crossbow, is in beta starting today. Some may wonder whether they should try the beta code, and if so, how to show the benefits Crossbow delivers. Here is a list of some use cases for Crossbow.

Network Virtualization

Requirement: You need more NICs than are installed or supported on the system. You want to use zones with exclusive IP Instances, but share a single NIC or a small number of NICs.

Feature: Any Crossbow-supported NIC can now be split up into several VNICs, and those VNICs can be assigned to different zones. Optionally, resource management can be applied to any or all VNICs.

Benefit: Zones that need network administrative isolation can share a single NIC. Traffic between zones with exclusive IP Instances can be contained within the system if the zones use VNICs on the same NIC. Resource management can be used to limit CPU or network bandwidth associated with a zone by applying controls on a VNIC.

How to Demonstrate:
- create zones if they don't exist
- configure zones as ip-type=exclusive
- create VNICs
- assign VNICs to zones
- boot zones
- observe distributed traffic
- optionally apply resource controls and observe

or

- create VNICs
- assign IP addresses to VNICs
- run services bound to separate IP addresses
- observe distributed traffic
- optionally apply resource controls and observe

Network Traffic Observability

Requirement: Need to measure and monitor network traffic for different services on the system.

Feature: Bytes and packets received and transmitted can be counted and monitored.

Benefit: Better understanding of network traffic patterns, and potential data points to make future resource control decisions. Opportunity to do chargeback based on network usage.

How to Demonstrate:
- create one or more VNICs using dladm
- create one or more flows using flowadm
- show data in real-time using dladm or flowadm
- show historical data
- show for data link/NIC, VNIC, and flow

Network Resource Management

Requirement: Limit the amount of network bandwidth used by a service. Control which CPU(s) are used to process network traffic for a service.

Feature: Limits on the maximum network traffic in bits/second can be set. Network traffic processing can be directed to one or more CPUs, providing for better response time for the network stack, or ensuring that network stack processing will not interfere with other resource consumers on the system.

Benefit: Finer control of resource utilization. Ability to set quality of service. Prevention of resource starvation by competing consumers. Denial of Service attack defense.

How to Demonstrate:
- create one or more VNICs using dladm
- create one or more flows using flowadm
- set bandwidth caps on VNICs or flows
- set CPU binding on VNICs or flows
- see limits enforced under heavy network load by observing the application(s)' data throughput, for example, metrics from
    - wget
    - ftp
    - dladm
    - flowadm statistics
    - your own application metric(s)
- show different CPU utilization or distribution using mpstat

Note: bandwidth guarantees are not available at this time.

Network Performance Improvements

Requirement: Faster network processing. More efficient network processing.

Features: Improved datagram processing within the IP stack. Automatic switching between interrupt and polling to speed packet processing and remove interrupt overhead.

Benefit: Existing network applications will run faster, with lower latency, higher throughput, and more CPU available to other services. No application changes are required.

How to Demonstrate:
- Compare your application's performance differences
    - using Solaris Nevada build 81 vs. Crossbow beta
    - using Solaris 10 vs. Crossbow beta
- Measure latency or throughput, depending on which is more important to your application, and also observe changes in CPU utilization.

Improved IP Forwarding

Requirement: Faster forwarding of IP datagrams.

Feature: Faster forwarding of IP datagrams, especially as routing/forwarding tables get large.

Benefit: Solaris is a better platform for routers and firewalls.

How to Demonstrate:
- Compare your router's performance differences
    - using Solaris Nevada build 81 vs. Crossbow beta
    - using Solaris 10 vs. Crossbow beta
- Measure latency and throughput, and also observe differences in CPU utilization.

Additional Info
- Nicolas' Private Virtual Network
- Sunay's blog on network in a box
- Karol's testing of Crossbow
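For the Network Resource Management use case above, the demonstration steps might translate into commands along the following lines. This is a hedged sketch: plan_resource_controls is a hypothetical helper that only prints commands, the link, flow, and CPU values are made up for illustration, and the dladm and flowadm syntax shown is that of the later integrated Crossbow releases, so the beta may differ.

```shell
#!/bin/sh
# plan_resource_controls is a hypothetical helper; it prints commands
# and runs nothing. Syntax follows the integrated Crossbow releases
# and is an assumption for the beta.
plan_resource_controls() {
    nic=$1      # underlying physical link, for example bge0
    vnic=$2     # VNIC to create, for example vnic1
    mbps=$3     # bandwidth cap in megabits per second
    cpus=$4     # comma-separated CPU list, for example 2,3
    echo "dladm create-vnic -l $nic $vnic"
    echo "dladm set-linkprop -p maxbw=${mbps}M $vnic"
    echo "dladm set-linkprop -p cpus=$cpus $vnic"
    echo "flowadm add-flow -l $vnic -a transport=tcp,local_port=80 httpflow1"
}

plan_resource_controls bge0 vnic1 100 2,3
```

With the cap in place, the throughput reported by a tool such as wget under heavy load, and the CPU distribution shown by mpstat, can be compared against an uncapped run.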


Sun

Breaking the stdio(3C) 256 file descriptor barrier in Solaris' stdio

In the mid 1990s I was trying to show a customer that Solaris can host 10,000 web sites on a single system. This was in the days of 400MHz processors, when 1GB of memory cost more than a luxury automobile. One thing I tried was to use the Apache web server and virtual hosting. However, since the customer required separate logs for each web site, each virtual host needed at least one log file and thus a file descriptor. Apache uses fopen(3C) for these files. So I had to use one hundred web servers, each hosting 100 web sites.

Solaris has historically allowed only 256 stdio streams to be open, because stdio requires file descriptors below 256. So applications can quickly run out of file descriptors when doing lots of fopen() calls. For 32-bit applications, it has not been possible to increase this limit, as it could cause binary compatibility issues for older applications (compatibility going as far back as those compiled on SunOS 4.x). The dup(2) system call has been used to move other file descriptors above 256 to free up slots for fopen. But the application is still limited to a maximum of 256 stdio streams!

With the release of Solaris 10 8/07 (often referred to as update 4), there is a new interface to extend the FILE facility. Programming details are in the man page enable_extended_FILE_stdio(3C). And if you don't want to make any code changes, extendedFILE(5) describes how to do this for existing applications and binaries.

I am working with a customer who needs to host over 1,400 web sites. We are using portions of the coolstack, as well as customized versions of Apache and PHP. With virtual hosting, the setup quickly hit the 256 stdio file limit!

With a small change to apachectl, it is now possible to host all 1,400+ web sites within a single instance of Apache. I added the following to the configuration section of apachectl:

ulimit -n 3000
LD_PRELOAD_32=/usr/lib/extendedFILE.so.1 ; export LD_PRELOAD_32

The ulimit -n 3000 increases the number of file descriptors a process can have open to 3000, up from the default of 256. Since apachectl is run as root, or with sufficient privileges using Role Based Access Control, this is permitted.

The LD_PRELOAD_32 setting allows me to have the library provide special versions of library functions or system calls. In this case, it does special things when fopen is called, and automatically uses dup(2) to free up the lower 256 file descriptors.

The enable_extended_FILE_stdio(3C) man page lists some of the requirements for an application to work well with this interposition library, such as not accessing the fields of the FILE structure directly. Since Apache is using stdio for log files, it is unlikely that Apache is accessing the structures directly.

Testing with the customer's configuration has Apache serving up all 1,400 web sites using a single instance of the httpd server! Cool, success at last!
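The value of 3000 above leaves generous headroom. As a rough sketch of how such a number could be estimated, the following assumes one log file per virtual host plus a reserve for sockets, libraries, and the standard streams; fds_needed, apachectl_fragment, and the reserve of 200 are illustrative assumptions, not figures from the customer setup.

```shell
#!/bin/sh
# Hypothetical helpers for sizing the file descriptor limit.

# Log files per site times number of sites, plus a reserve for
# sockets, shared libraries, and stdin/stdout/stderr.
fds_needed() {
    sites=$1
    logs_per_site=$2
    reserve=$3
    echo $(( sites * logs_per_site + reserve ))
}

# Print the two apachectl lines only when the estimate exceeds the
# historical limit of 256 stdio streams; otherwise print nothing.
apachectl_fragment() {
    need=$(fds_needed "$1" "$2" "$3")
    if [ "$need" -gt 256 ]; then
        echo "ulimit -n $need"
        echo "LD_PRELOAD_32=/usr/lib/extendedFILE.so.1 ; export LD_PRELOAD_32"
    fi
}

apachectl_fragment 1400 1 200
```

For 1,400 sites with one log each and a reserve of 200, the estimate is 1600 descriptors. A configuration with an access log and an error log per site would double the per-site count.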


Solaris Networking

One Step Closer to IP Instances with ce

With the availability of Solaris Nevada build 80 [1], it becomes possible to use IP Instances with the GigaSwift line of NICs and the ce driver. The fix for CR 6616075 to zoneadmd(1M) has been integrated into the OpenSolaris code base and is available in build 80. The necessary fix to the ce driver, tracked in CR 6606507, has already been delivered. With this combination, a zone can have an exclusive IP Instance using a ce-based link.

Zone configuration information:

global# zonecfg -z ce1 info net
net:
        address not specified
        physical: ce1
global#

And the view from the non-global zone:

ce1# zonename
ce1
ce1# cat /etc/release
        Solaris Express Community Edition snv_80 SPARC
        Copyright 2008 Sun Microsystems, Inc. All Rights Reserved.
        Use is subject to license terms.
        Assembled 17 December 2007
ce1# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
ce1: flags=1000843 mtu 1500 index 2
        inet 192.168.200.153 netmask ffffff00 broadcast 192.168.200.255
        ether 0:3:ba:68:1d:5f
lo0: flags=2002000849 mtu 8252 index 1
        inet6 ::1/128
ce1#

More when the soak time in Nevada is complete and the backport to Solaris 10 is available.

Thanks to the engineers who put energy into these fixes!

Happy Holidays!

Steffen

[1] As of 20 December 2007, build 80 is available within Sun only. Availability on opensolaris.org will be announced on opensolaris-announce@opensolaris.org.


Solaris Networking

What's Up ce-Doc?

There is a large install base of the 1 Gigabit Ethernet network interface card called GigaSwift, which uses the Cassini Ethernet (ce) driver. The GigaSwift interface is built onto the motherboard of a large number of systems, and has been the primary dual and quad GbE card available from Sun in PCI, PCI-X, and CPCI formats.

Many of the users running systems with GigaSwift NICs are also interested in running zones with exclusive IP Instances. However, the ce driver is a DLPI style-1 driver, not the GLDv3 driver required by IP Instances. Because of the large install base of the GigaSwift NICs, one consideration has been to convert the ce driver to GLDv3. The challenge is that a lot of users of this NIC also tune its characteristics with ndd(1M), and converting ce to GLDv3 would essentially eliminate those tunables. There is work in progress to provide a shim for non-GLDv3 drivers to make them work within the GLDv3 framework. This won't be delivered into Solaris Nevada or OpenSolaris until early next year, and then will need to be backported to Solaris 10.

What do we do in the meantime for all those users who are currently using ce?

Change Requests 6606507 and 6616075 are being worked on to support the ce driver in zones with exclusive IP Instance. CR 6606507 is for the ce driver itself, and CR 6616075 is for zoneadmd(1M) changes to issue an ioctl when an interface (the "physical" part of the net directive in the zone configuration) is not GLDv3. These are separate CRs because zoneadmd is in Solaris ON (the OS and Networking consolidation) while ce is outside of ON.

The changes are ready to be put in Nevada and OpenSolaris, where the code will undergo a mandatory soak test period of four to six weeks. Once everything passes Nevada testing and the changes are integrated into Solaris 10, the patch is created, tested, and issued.

NOTE: Updated 10 December 2007 to correct the bug ID for the zoneadmd part.

Updated 12 March 2008: The patches are now available. See the entry dated Wednesday Jan 30, 2008.


Solaris Networking

Using IP Instances with VLANs or How to Make a Few NICs Look Like Many

[Minor editorial and clarification updates 2009.09.28]

Solaris 10 8/07 includes a new feature for zone networking. IP Instances is the facility to give a non-global zone its own complete control over the IP stack, which previously was shared with and controlled by the global zone.

A zone that has an exclusive IP Instance can set interface parameters using ifconfig(1M), put an interface into promiscuous mode to run snoop(1M), be a DHCP client or server, set ndd(1M) variables, have its own IPsec policies, etc.

One requirement for an exclusive IP Instance is that it must have exclusive access to a link name. This is any NIC, VLAN-tagged NIC component, or aggregation at this time. When they become available, virtual NICs will make this much simpler, as a single NIC can be presented to the zones using a number of VNICs, effectively multiplexing access to that NIC. A link name is an entry that can be found in /dev, such as /dev/bge0, /dev/bge321001 (VLAN tag 321 on bge1), aggr2, and so on.

To see what link names are available on a system, use dladm(1M) with the show-link option. For example:

global# dladm show-link
bge0            type: non-vlan  mtu: 1500       device: bge0
bge1            type: non-vlan  mtu: 1500       device: bge1
bge2            type: non-vlan  mtu: 1500       device: bge2
bge3            type: non-vlan  mtu: 1500       device: bge3

As folks have started to use IP Instances to isolate their zones, they have noticed that they don't have sufficient link names (I'll use just link in the rest of this) to assign to the zones they have or wish to configure as exclusive. So, how does a global zone administrator configure a large number of zones as exclusive?

Let's consider the following situation, where there are three tiers of a web service, and each tier is on a different network.

If each server has only one NIC, the total number of switch ports required is at least eight (8). If each server has a management port, that is another eight ports, even if they are on a different, management network. Add to that at least three switch ports going to the router.

Consolidating the servers onto a single Solaris 10 instance using exclusive IP Instances requires at least eight NICs for the services (one per service), and at least one for the global zone and management. (We'll ignore service processor requirements, since they are separate anyway, and access could be either via a serial interface or a network.)

One option to consider is using VLANs and VLAN tagging. When using VLAN tagging, additional information is put onto the ethernet frame by the sender, which allows the receiver to associate that frame with a specific VLAN. The specification allows up to 4094 VLAN tags, from 1 to 4094. For more information on administering VLANs in Solaris 10, see Administering Virtual Local Area Networks in the Solaris 10 System Administrator Collection.

VLANs are a method to collapse multiple ethernet broadcast domains (whether hubs or switches) into a single network unit (usually a switch). [Typically, a single IP subnet, such as 192.168.54.0/24, is on a broadcast domain.] Within such a switch frame, you can have a large number of virtual switches, consolidating network infrastructure and still isolating broadcast domains. Often, the use of VLANs is completely hidden from the systems tied to the switch, as a port on the switch is configured for only one VLAN. With VLAN tagging, a single port can allow a system to connect to multiple VLANs, and therefore multiple networks. Both the switch and the system must be configured for VLAN tagging for this to work properly. VLAN tagging has been used for years, and is robust and reliable.

Any one network interface can have multiple VLANs configured for it, but a single VLAN ID can only exist once on each interface. Thus it is possible to put multiple networks or broadcast domains on a single interface. It is not possible to put more than one VLAN of any broadcast domain on a single interface. For example, you can put VLANs 111, 112, and 113 on interface bge1, but you can not put VLAN 111 on bge1 more than once. You can, however, put VLAN 111 on interfaces bge1 and bge2.

Using the case shown above, if the three web servers are on the same network, say 10.1.111.0/24, you would want to have three interfaces that are all connected to a VLAN-capable switch, and configure each interface with a VLAN tag that is the same as the VLAN ID on the switch.

For example, if the VLAN tag is 111 and the interfaces are bge1 through bge3, the link names you would assign to the three web servers would be bge111001, bge111002, and bge111003.

Introducing zones into the setup, the web servers can be run in three separate zones, and with exclusive IP Instances, they can be totally separate and each assigned a VLAN-tagged interface. Web Server 1 could have bge111001, Web Server 2 could have bge111002, and Web Server 3 could have bge111003.

global# zonecfg -z web1 info net
net:
        address not specified
        physical: bge111001
global# zonecfg -z web2 info net
net:
        address not specified
        physical: bge111002
global# zonecfg -z web3 info net
net:
        address not specified
        physical: bge111003

Within the zones, you could configure IP addresses 10.1.111.1/24 through 10.1.111.3/24.

Similarly, for the authentication tier, using VLAN ID 112, you could assign the zones auth1 through auth3 to bge112001, bge112002, and bge112003, respectively. And for application servers app1 and app2 on VLAN ID 113, bge113001 and bge113002. This can be repeated until some limit is reached, whether it is network bandwidth, system resource limits, or the maximum number of concurrent VLANs on either the switch or Solaris.

This configuration could look like the following diagram.

Web Server 1, Auth Server 1, and Application Server 1 share the use of NIC1, yet are all on different VLANs (111, 112, and 113, respectively). The same holds for instances 2 and 3, except that there is no third application server. All traffic between the three web servers will stay within the switch, as will traffic between the authentication servers. Traffic between the tiers is passed between the IP networks by the router. NICg shows that the global zone also has a network interface.

Using this technique, the maximum number of zones with exclusive IP Instances that you could deploy on a single system on the same subnet is limited to the number of interfaces that are capable of doing VLAN tagging. In the above example, with three bge interfaces on the system, the maximum number of exclusive zones on a single subnet would be three. (I have intentionally reserved bge0 for the global zone, but it would be possible to use it as well, making sure the global zone uses a different VLAN ID altogether, such as 1 or 2.)
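The link names used throughout this post follow one pattern: the driver name followed by the VLAN ID times 1000 plus the interface instance, so VLAN 111 on bge1 is bge111001. A small sketch of that arithmetic follows; vlan_link and vlan_of are hypothetical helper names, not Solaris commands.

```shell
#!/bin/sh
# Hypothetical helpers encoding the naming pattern from the examples:
# <driver><VLAN ID * 1000 + instance>.

# Build a VLAN link name from driver, VLAN ID (1 to 4094), and instance.
vlan_link() {
    driver=$1
    vid=$2
    inst=$3
    if [ "$vid" -lt 1 ] || [ "$vid" -gt 4094 ]; then
        echo "VLAN ID out of range: $vid" >&2
        return 1
    fi
    echo "${driver}$(( vid * 1000 + inst ))"
}

# Split a VLAN link name back into VLAN ID and instance. Everything up
# to the last letter is treated as the driver name.
vlan_of() {
    num=$(echo "$1" | sed 's/.*[A-Za-z]//')
    echo "vlan=$(( num / 1000 )) instance=$(( num % 1000 ))"
}

vlan_link bge 111 1    # prints bge111001
vlan_link bge 113 2    # prints bge113002
vlan_of bge321001      # prints vlan=321 instance=1
```

Generating the names this way for each tier (111, 112, 113) and each NIC instance gives the full set of link names to assign to the zones.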


Solaris Networking

In two places at once?

Some background. Like any other mobile workforce, Sun employees have a need to access internal network services while not in the office. While we use commercial products, Sun engineers have also been working on a \*product\* called punchin. Punchin is a Sun-created VPN technology that uses native IPsec/IKE from the operating system in which it runs. It is the primary Solaris VPN solution for Solaris servers and clients, and will be expanding to other operating systems such as MacOS X in the near future. Security policy states that if a system is 'punched in', it must not be on the public network at the same it. In other words, while the VPN tunnel is up, access to the Internet directly is restricted, especially access from the Internet to the system. While a system is on the VPN, it can not also be your Internet facing personal web server, for example. Bringing up the VPN is an interactive process, requiring a challenge/response sequence. If you are like me, you may have a system at home and while at work need to access from that system some data on the corporate network. This is a catch-22, since the connection you use remotely to activate the VPN breaks as you start the VPN establishment process (enforcing the policy of being on only one network at a time). Introduce Solaris Containers, or zones. Each zone looks like its own system. However, they share a single kernel and single IP. But wait, there is this new thing called IP Instance that allows zones configured as having an exclusive IP Instance to have their own IP (they already have their own TCP and UDP for all practical purposes). And wouldn't it be great if I could do this with just one NIC? Hey, Project Crossbow has IP Instances and VNICs. Great! Now for the reality check. As I was told not so long ago, Rome was not built in one day. IP Instances are in Solaris Nevada and targeted for Solaris 10 7/07. VNICs are only available in a snapshot applied via BFU to Nevada build 61. 
[See also Note 1 below.]

So, let's see how to do this with just IP Instances. First, each IP Instance (at least the global zone and one non-global zone) needs its own NIC, so I need at least two NICs. Not all NICs support IP Instances; the one(s) used by the non-global zone(s) must support them, and thus must be using GLDv3 drivers. In my case, I am using a Sun Blade 100 with an on-board eri 100Mbps Ethernet interface. I purchased an Intel PRO/1000 MT Server NIC, which uses the e1000g driver. Here is a list of NICs that are known to work with IP Instances and VNICs.

After installing Solaris Nevada, I created my non-global zone with the following configuration:

global# zonecfg -z vpnzone info
zonename: vpnzone
zonepath: /zones/vpnzone
brand: native
autoboot: true
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: exclusive
inherit-pkg-dir:
        dir: /lib
inherit-pkg-dir:
        dir: /platform
inherit-pkg-dir:
        dir: /sbin
inherit-pkg-dir:
        dir: /usr
inherit-pkg-dir:
        dir: /etc/crypto/certs
fs:
        dir: /usr/local
        special: /zones/vpnzone/usr-local
        raw not specified
        type: lofs
        options: []
net:
        address not specified
        physical: e1000g0
global#

I had to include an additional inherit directive for this sparse zone, because currently some of the crypto files are not duplicated into a non-global zone. Without this, even the digest command would fail, for example. I needed to provide a private directory for /usr/local since that is where the Punchin packages get installed by default. Once I installed and configured vpnzone, I was able to install and configure the Punchin client.

However, this required two NICs.
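For reference, a configuration like the one listed above could be entered from a zonecfg command file rather than interactively. The following is only a sketch: the function name vpnzone_cfg is made up for illustration, and it assumes that a plain create on this release supplies the four standard inherit-pkg-dir entries (/lib, /platform, /sbin, /usr) for a sparse zone, so only the extra one is added.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): print a zonecfg command
# file for a sparse zone with an exclusive IP Instance.
# Usage: vpnzone_cfg <zonepath> <physical-link>
vpnzone_cfg() {
    zonepath=$1
    link=$2
    cat <<EOF
create
set zonepath=$zonepath
set autoboot=true
set ip-type=exclusive
add inherit-pkg-dir
set dir=/etc/crypto/certs
end
add fs
set dir=/usr/local
set special=$zonepath/usr-local
set type=lofs
end
add net
set physical=$link
end
verify
commit
EOF
}

# Print the command file for the zone described above.
vpnzone_cfg /zones/vpnzone e1000g0
```

On a Solaris system the output could be saved to a file and applied with zonecfg -z vpnzone -f followed by the file name. Passing the link name as a parameter means the same file works whether the zone is given a physical NIC or some other data link.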
So to use just one, I created a VNIC for my VPN zone.

global# dladm show-dev
eri0            link: unknown   speed: 0Mb      duplex: unknown
e1000g0         link: up        speed: 100Mb    duplex: full
global# dladm show-link
eri0            type: legacy    mtu: 1500       device: eri0
e1000g0         type: non-vlan  mtu: 1500       device: e1000g0
global# dladm create-vnic -d e1000g0 -m 0:4:23:e0:5f:1 1
global# dladm show-link
eri0            type: legacy    mtu: 1500       device: eri0
e1000g0         type: non-vlan  mtu: 1500       device: e1000g0
vnic1           type: non-vlan  mtu: 1500       device: vnic1
global# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
e1000g0: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
        inet 192.168.1.58 netmask ffffff00 broadcast 192.168.1.255
        ether 0:4:23:e0:5f:6b
global#

I chose to provide my own MAC address, based on the address of the base NIC. I modified the non-global zone configuration:

global# zonecfg -z vpnzone info
zonename: vpnzone
zonepath: /zones/vpnzone
brand: native
autoboot: true
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: exclusive
inherit-pkg-dir:
        dir: /lib
inherit-pkg-dir:
        dir: /platform
inherit-pkg-dir:
        dir: /sbin
inherit-pkg-dir:
        dir: /usr
inherit-pkg-dir:
        dir: /etc/crypto/certs
fs:
        dir: /usr/local
        special: /zones/vpnzone/usr-local
        raw not specified
        type: lofs
        options: []
net:
        address not specified
        physical: vnic1
global#

Now I can access the system at home while I am not there, zlogin into vpnzone, punchin, and be connected to our internal network. This is really significant for me, since at home I have 6Mbps download compared to only 600Kbps in the office. So downloading the DVD ISO that I used to create this setup took one-tenth the time at home that it would have taken at work.

[1] I also used the SUNWonbld package. This package is specific to build 61!

Because I install BFUs a lot, I have added the following to my .profile:

if [ -d /opt/onbld ]
then
        FASTFS=/opt/onbld/bin/`uname -p`/fastfs ; export FASTFS
        BFULD=/opt/onbld/bin/`uname -p`/bfuld ; export BFULD
        GZIPBIN=/usr/bin/gzip ; export GZIPBIN
        PATH=$PATH:/opt/onbld/bin
fi
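Returning to the VNIC's MAC address: the post does not say exactly how it was derived from the base NIC, but comparing 0:4:23:e0:5f:6b with 0:4:23:e0:5f:1 suggests that only the last octet was changed. A minimal sketch of that idea follows; the function name vnic_mac is hypothetical, and the chosen address still has to be unique on the local network segment.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): build a VNIC MAC address
# by keeping the first five octets of the base NIC's MAC and replacing the
# last octet.
# Usage: vnic_mac <base-mac> <last-octet>
vnic_mac() {
    base=$1
    octet=$2
    # ${base%:*} strips the final ":xx" from the base address.
    printf '%s:%s\n' "${base%:*}" "$octet"
}

# Base address taken from the ifconfig output for e1000g0 above.
vnic_mac 0:4:23:e0:5f:6b 1
```

With the Crossbow snapshot used in this post, the result could be passed to the -m option of dladm create-vnic in place of the hand-typed address.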


Solaris Networking

Network performance differences within an IP Instance vs. across IP Instances

When consolidating or co-locating multiple applications on the same system, inter-application network traffic typically stays within the system, since the shared IP in the kernel recognizes that the destination address is on the same system, and thus loops it back up the stack without ever putting the data on a physical network. This has introduced some challenges for customers deploying Solaris Containers (specifically zones) where different Containers are on different subnets, and it is expected that traffic between them leaves the system (maybe through a router or firewall to restrict or monitor inter-tier traffic).

With IP Instances in Solaris Nevada build 57 and targeted for Solaris 10 7/07, there is the ability to configure zones with exclusive IP Instances, thus forcing all traffic leaving a zone out onto the network. This introduces additional network stack processing on both the transmit and the receive side. Prompted by some customer questions regarding this, I performed a simple test to measure the difference.

On two systems, a V210 with two 1.336GHz CPUs and 8GB memory, and an x4200 with two dual-core Opteron XXXX and 8GB memory, I ran FTP transfers between zones. My switch is a Netgear GS716T Smart Switch with 1Gbps ports. The V210 has four bge interfaces and the x4200 has four e1000g interfaces.

I created four zones. Zones x1 and x2 have eXclusive IP Instances, while zones s1 and s2 have Shared IP Instances (IP is shared with the global zone). Both systems are running Solaris 10 7/07 build 06. Relevant zonecfg info is as follows (all zones are sparse):

v210# zonecfg -z x1 info
zonename: x1
zonepath: /localzones/x1
...
ip-type: exclusive
net:
        address not specified
        physical: bge1
v210# zonecfg -z s1 info
zonename: s1
zonepath: /localzones/s1
...
ip-type: shared
net:
        address: 10.10.10.11/24
        physical: bge3

As a test user in each zone, I created a file using 'mkfile 1000m /tmp/file1000m'. Then I used ftp to transfer it between zones. No tuning was done whatsoever.
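Because the test file has a known size, the elapsed time reported by /usr/bin/time converts directly into throughput. The sketch below shows the arithmetic, assuming that the m suffix of mkfile means 1048576 bytes; the function name throughput_mbps is made up, and the 20 seconds in the usage line is a placeholder, not one of the measured results.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): convert a transfer of
# <size_mb> megabytes (1 MB = 1048576 bytes, as created by mkfile) that took
# <seconds> seconds into megabits per second (1 Mbit = 1000000 bits).
# Usage: throughput_mbps <size_mb> <seconds>
throughput_mbps() {
    awk -v mb="$1" -v s="$2" \
        'BEGIN { printf "%.1f\n", mb * 1048576 * 8 / s / 1000000 }'
}

# Placeholder elapsed time of 20 seconds for the 1000 MB file.
throughput_mbps 1000 20
```

Comparing the figure for an exclusive-to-exclusive transfer with the one for a shared-to-shared transfer gives a rough measure of the cost of the extra stack processing and the trip across the wire.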
The results are as follows.

V210: (bge)

Exclusive to Exclusive

x1# /usr/bin/time ftp x2

