
Solaris News, Views, and Real-World Experiences from the Field

Recent Posts

Coming to Oracle OpenWorld? Hope to see you there.

Oracle OpenWorld 2017 is right around the corner.  It starts this Sunday, October 1, at the Moscone Center in downtown San Francisco.  This is a great opportunity to see what's new and exciting from all parts of Oracle - Software, Hardware, Cloud, Services, everything. If you are coming, build your session grid here: https://events.rainfocus.com/catalog/oracle/oow17/catalogoow17

I will be leading two sessions this year.  The first is a presentation with Peter Wilson from the Oracle SPARC Cloud team.  The second is a hands-on lab with Salman Ashfaq of the Oracle Solution Center.

CON6296: Lift and Shift your Oracle Solaris Workloads to the Oracle Cloud

In this session, learn how easy it is to move your Oracle Solaris-based workloads to the Oracle Cloud. Hear about the best practices required to ensure that you can securely deploy your applications and data on Oracle SPARC-based cloud solutions, and what you need to know to prepare for the migration. Some of the considerations for networking and storage are also discussed, as well as how to transition from a fully on-premises workload to a public or hybrid cloud.

Time: Monday, Oct 02, 5:45 p.m. - 6:30 p.m.
Location: Marriott Marquis (Yerba Buena Level) - Salon 13

This session will share some of the details of the Oracle SPARC Compute Cloud, along with customer experiences that show how simple it is to migrate your existing workloads directly into the SPARC cloud.

HOL7783: Orchestrate an Oracle Database in Minutes with Ansible and Oracle Solaris

Being able to quickly orchestrate an application is a critical element of any successful cloud deployment. In this hands-on lab, walk through orchestrating an Oracle Database instance in Oracle Solaris Zones using Ansible playbooks. Learn to reliably automate your operations and eliminate repetitive tasks.

Time: Tuesday, Oct 03, 4:00 p.m. - 6:00 p.m.
Location: Hilton San Francisco Union Square (Ballroom Level) - Continental Ballroom 3

This lab will give a brief overview of Ansible and Ansible on Solaris.  Then, we will lead the students through creating a new VM to host a database, preparing the VM to accommodate the Oracle Database, and finally deploying the Oracle Database into that VM.  You will see that Ansible can be used to automate complicated, multi-tier deployments in a consistent, standardized way.

I'm looking forward to seeing you all there and having a full house for not only these but all the other Systems and OS sessions.


Installing Ansible on Solaris 11.3 (Updated)

Lately, I have been involved in a number of customer projects using Ansible to manage various aspects of a Solaris environment, notably on the SPARC Model 300 in the Oracle Compute Cloud. Since Ansible does not ship with Solaris and is not natively in the Solaris repository of free or open source software, we have to add it to the system.  This blog shows how to do that.

Ansible is a Python-based tool for configuration management and automation of systems of all sorts.  Rather than relying on an agent to carry out its actions (like Puppet or Chef), Ansible relies on its ability to ssh into a system to carry out its work.  I have found Ansible easy to learn, easy to use, and easy to extend to do what I need.  More about some of that in subsequent posts.

Since Ansible relies on Python, one might choose to just use pip to install it, and that's fine.  But it seemed to me that using as much as possible of the installed and supported Python modules and libraries delivered with Solaris would lead to an overall more stable experience.  It removes the need to use Python's virtualenv capability or to deal with version conflicts between the OS-delivered libraries and Ansible's requirements.

Often, there might be an Ansible management node within an environment.  This can be for centralization, key management, security, or whatever.  In my case, I created a separate kernel zone to act as my Ansible launching pad.  There is nothing special about creating this kernel zone; I just did a regular zonecfg create for the kernel zone.  When I installed it, I used my own manifest and system configuration profile.

The key bit of the manifest is which packages to install.  By default, a kernel zone is built with solaris-small-server, which is sufficient.  But here there are a number of other required packages.  Notably, you need a C compiler and you need several pieces from the developer/gnu package.  (I believe that the whole package is not required, but it turned into too much trouble to figure out which pieces I could get rid of, so I kept the whole thing.)

Ansible works much better with OpenSSH than with the older SunSSH shipped with Solaris, so we include OpenSSH in our manifest.  If you wanted to keep SunSSH off the system entirely, you could use <software_data action="avoid"> in the manifest to avoid installing it.  Instead, once I have installed the kernel zone, I just use

# pkg set-mediator -I openssh ssh

to select OpenSSH as the default implementation of ssh.

Then, there are a number of Python libraries required for Ansible.  If you were to just use pip to install Ansible, it would pull all of these down from PyPI.  But I especially wanted to use as much from Solaris as possible.  In this way, I can avoid some of the version mismatches with other Python utilities on the system.  It turns out that for base Ansible there are just these few libraries.  I also experimented with the Ansible OpenStack libraries.  These install nicely and work, but they do break some other parts of Solaris in doing so.  More about that in another blog post.  For now, just add these libraries and Ansible installs nicely.  The software section of my AI manifest looks like this:
      <software_data action="install">
        <name>pkg:/entire@0.5.11-0.175.3.18</name>
        <name>pkg:/group/system/solaris-small-server</name>
        <name>pkg:/developer/gcc</name>
        <name>pkg:/developer/gnu</name>
        <name>pkg:/network/openssh</name>
        <name>pkg:/library/security/openssl/openssl-fips-140</name>
        <name>pkg:/library/python/cffi</name>
        <name>pkg:/library/python/jinja2</name>
        <name>pkg:/library/python/pyyaml</name>
        <name>pkg:/library/python/paramiko</name>
        <name>pkg:/library/python/setuptools</name>
        <name>pkg:/library/python/pyasn1</name>
        <name>pkg:/library/python/cryptography</name>
        <name>pkg:/library/python/six</name>
        <name>pkg:/library/python/ipaddress</name>
        <name>pkg:/library/python/pycparser</name>
      </software_data>

In my system configuration profile, I did nothing that was not very stock.  You could use sysconfig interactively if you preferred.  All it does is assign IP addresses, system identity, and the initial users - at least for the Ansible management node.

EDIT - 1 Sep 2017

Of course, I left something out before.  In order to make the installation work, you need to provide some options to gcc and make sure that the builder can find the compiler that you already installed.  You need to do this:

# ln -s /usr/bin/gcc /usr/bin/cc
# export CC=gcc
# export CFLAGS="-I/usr/include/gmp -I/usr/lib/libffi-3.0.9/include -I/usr/include/openssl/fips-140"

END of EDIT

Once the kernel zone (or non-global zone or LDom or bare metal - it doesn't really matter) is built, log into it and actually install Ansible.  You still use pip for this, since that's the best way to get Ansible installed.

root@wkshpvm03:~# pip install ansible
Collecting ansible
  Downloading ansible-2.3.2.0.tar.gz (4.3MB)
    100% |████████████████████████████████| 4.3MB 140kB/s
Requirement already satisfied: jinja2 in /usr/lib/python2.7/vendor-packages (from ansible)
Requirement already satisfied: PyYAML in /usr/lib/python2.7/vendor-packages (from ansible)
Requirement already satisfied: paramiko in /usr/lib/python2.7/vendor-packages (from ansible)
Collecting pycrypto>=2.6 (from ansible)
  Downloading pycrypto-2.6.1.tar.gz (446kB)
    100% |████████████████████████████████| 450kB 1.3MB/s
Requirement already satisfied: setuptools in /usr/lib/python2.7/vendor-packages (from ansible)
Requirement already satisfied: markupsafe in /usr/lib/python2.7/vendor-packages (from jinja2->ansible)
Requirement already satisfied: cryptography>=0.8 in /usr/lib/python2.7/vendor-packages (from paramiko->ansible)
Requirement already satisfied: pyasn1>=0.1.7 in /usr/lib/python2.7/vendor-packages (from paramiko->ansible)
Requirement already satisfied: idna>=2.0 in /usr/lib/python2.7/vendor-packages (from cryptography>=0.8->paramiko->ansible)
Requirement already satisfied: six>=1.4.1 in /usr/lib/python2.7/vendor-packages (from cryptography>=0.8->paramiko->ansible)
Requirement already satisfied: enum34 in /usr/lib/python2.7/vendor-packages (from cryptography>=0.8->paramiko->ansible)
Requirement already satisfied: ipaddress in /usr/lib/python2.7/vendor-packages (from cryptography>=0.8->paramiko->ansible)
Requirement already satisfied: cffi>=1.4.1 in /usr/lib/python2.7/vendor-packages (from cryptography>=0.8->paramiko->ansible)
Requirement already satisfied: pycparser in /usr/lib/python2.7/vendor-packages (from cffi>=1.4.1->cryptography>=0.8->paramiko->ansible)
Installing collected packages: pycrypto, ansible
  Running setup.py install for pycrypto ... done
  Running setup.py install for ansible ... done
Successfully installed ansible-2.3.2.0 pycrypto-2.6.1

You see that most of the prerequisites were satisfied by the packages we installed in the AI manifest.  PyCrypto and Ansible itself were the only things that had to be added, and neither of them ships with Solaris.

At this point, Ansible is ready to go.  Set up your ansible.cfg and hosts files to suit your needs and jump in.

root@wkshpvm04:~# ansible -m ping wkshpvm02
/usr/lib/python2.7/site-packages/Crypto/Util/number.py:57: PowmInsecureWarning: Not using mpz_powm_sec.  You should rebuild using libgmp >= 5 to avoid timing attack vulnerability.
 _warn("Not using mpz_powm_sec.  You should rebuild using libgmp >= 5 to avoid timing attack vulnerability.", PowmInsecureWarning)
wkshpvm02 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Happy Ansible-ing!  More on my adventures with Ansible in future posts.
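As a starting point, a minimal ansible.cfg and hosts inventory for a setup like this might look roughly like the following sketch.  The hostnames, user, and key path here are illustrative examples, not the actual files from my environment.

# /etc/ansible/ansible.cfg (or ./ansible.cfg in your working directory)
[defaults]
inventory         = /etc/ansible/hosts
remote_user       = root
private_key_file  = /root/.ssh/id_rsa
host_key_checking = False

# /etc/ansible/hosts - the inventory of machines this node will manage
[solaris]
wkshpvm01
wkshpvm02

With those two files in place, the ad hoc ping shown above is all it takes to confirm that the management node can reach a target and run modules on it.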


Solaris

Solaris Essentials - One Day Class from Oracle University

Recently, I put together a one-day, video-on-demand class with the folks in Solaris Engineering and Oracle University that I wanted to let folks know about.

This class, called Oracle Solaris Essentials for Experienced Administrators, is targeted toward folks who have not had a lot of experience with Solaris 11, but who are comfortable administering some other system - Solaris 10, various Linux systems, AIX, etc.  There are ten video segments that cover a day in the life of an administrator.  The general idea is that your manager has handed you a new Solaris 11 box and asked you to get it ready for use.  This class steps you through the basics:

- Configuring the hardware and getting the system on the network via the ILOM
- Initial system identification and sysconfig
- Managing OS file systems with ZFS
- Managing Solaris 11 packaging and patching using the new Image Packaging System
- Working with Solaris Boot Environments for safe and fast system updates
- Managing networks with Solaris 11 network virtualization
- Creating virtualized environments with Solaris zones
- A few minutes on the wealth of observability tools in Solaris

When you get to the landing page at http://education.oracle.com/pls/web_prod-plq-dad/db_pages.getpage?page_id=904&get_params=cloudId:307,seriesId:20551 (I know, it's a long URL.  Just click on the link instead.), there is a short trailer for the class at the top of the page.  There is also a button to add this to your Oracle University training subscription.  Training subscriptions are a great and cost-effective way to consume all of the training you want within a year, but that may not be what you want right now.

Scroll down the page to the 10 individual videos.  These will all play without any sort of login to Oracle University or subscription.  Each segment will, however, point you toward the Oracle University classes that can provide more information about its topic.

I had a great time making this class and hope you find it beneficial.


Solaris

Configuring multiple IP interfaces in a System Configuration Profile

Here's something that I missed when it came out in a Solaris 11 update and just found this week.  You can configure multiple network interfaces in the System Configuration profile for either an AI-installed system or for a zone.

I always thought that when I used an AI system configuration profile to configure network interfaces, I was limited to configuring only the first address, but that's not the case.  You can actually configure multiple interfaces in the profile.  The big benefit to this is that you may be able to avoid having a finish script or any sort of after-installation mechanism just to configure network interfaces.  Of course, you have to know the names of the network interfaces.  For the built-in interfaces, this is easy.  For additional interfaces on NIC cards, it may not be quite as simple.  For zones, since you can define the names of every interface, it is always easy.

The Old Way - Single Interface Configuration

You have always been able to use sysconfig to create a system configuration profile that provides the identity for an OS instance being installed: hostname, IP addresses, terminal type, locale, timezone, etc.  Often, you might use sysconfig just once and then edit the XML for each subsequent installation, using installadm criteria to select which profile you want to use.  You can also (when using the Automated Installer - AI) use macros to generalize the configuration and pick up the values passed through the boot process.

So, the section of the system configuration profile to configure both IPv4 and IPv6 addresses on net0 would look like this.  (I have left out the service_bundle section at the top and bottom of the XML and am just showing the particular services used for network configuration.)

 <service version="1" type="service" name="system/identity">
   <instance enabled="true" name="node">
     <property_group type="application" name="config">
       <propval type="astring" name="nodename" value="{{AI_HOSTNAME}}"/>
     </property_group>
   </instance>
 </service>
 <service version="1" type="service" name="network/install">
   <instance enabled="true" name="default">
     <property_group type="application" name="install_ipv6_interface">
       <propval type="astring" name="stateful" value="yes"/>
       <propval type="astring" name="address_type" value="addrconf"/>
       <propval type="astring" name="name" value="net0/v6"/>
       <propval type="astring" name="stateless" value="yes"/>
     </property_group>
     <property_group type="application" name="install_ipv4_interface">
       <propval type="net_address_v4" name="static_address" value="{{AI_IPV4}}/{{AI_IPV4_PREFIXLEN}}"/>
       <propval type="astring" name="name" value="{{AI_NETLINK_VANITY}}/v4"/>
       <propval type="astring" name="address_type" value="static"/>
       <propval type="net_address_v4" name="default_route" value="{{AI_ROUTER}}"/>
     </property_group>
   </instance>
 </service>

You can see the way that the {{AI_HOSTNAME}} macro is used to set the nodename of the server in the system/identity service, as well as how the various pieces of the IPv4 address are filled in in the network/install service.  Take a look at the Solaris installation manual in the section called Using System Configuration Profile Templates.

The key parts of this profile are the two property groups, install_ipv4_interface and install_ipv6_interface.  In each property group, you assign the properties with the values necessary to configure that interface.
If you choose not to have an IPv6 interface, just eliminate that property group from the profile and it will not be configured (except for the loopback, which requires an IPv6 address).  With these property groups, you can configure any single interface - one for IPv6 and one for IPv4 - but you can't just replicate these property groups to configure additional interfaces.

The New Way - Multiple Interface Configuration

As it turns out, you actually can configure multiple interfaces within the SC profile.  You just need additional property groups for each interface you choose to configure.  In Solaris 11, SMF was significantly extended.  One of the extensions allows new property groups and properties to be defined dynamically in a profile; previously, they were all statically defined in the service manifest.

Now, in addition to the install_ipv6_interface and install_ipv4_interface property groups, you can add more property groups.  Each one has to be of type ipv6_interface or ipv4_interface and can be given any valid property group name.  Each of these property groups uses the same set of properties as its corresponding pre-defined group.  This means that property groups of type ipv4_interface use the same properties as the install_ipv4_interface property group, and likewise for IPv6.

 <service version="1" type="service" name="network/install">
   <instance enabled="true" name="default">
     <property_group type="application" name="install_ipv6_interface">
       <propval type="astring" name="stateful" value="yes"/>
       <propval type="astring" name="address_type" value="addrconf"/>
       <propval type="astring" name="name" value="net0/v6"/>
       <propval type="astring" name="stateless" value="yes"/>
     </property_group>
     <property_group type="ipv6_interface" name="install_ipv6_addr">
       <propval type="net_address_v6" name="static_address" value="2620:0160:daef:0001::11/64"/>
       <propval type="astring" name="name" value="net0/v6static"/>
       <propval type="astring" name="address_type" value="static"/>
       <propval type="net_address_v6" name="default_route" value="2620:0160:daef:0001::1/64"/>
     </property_group>
     <property_group type="application" name="install_ipv4_interface">
       <propval type="net_address_v4" name="static_address" value="10.80.162.150/24"/>
       <propval type="astring" name="name" value="net0/v4"/>
       <propval type="astring" name="address_type" value="static"/>
       <propval type="net_address_v4" name="default_route" value="10.80.162.1"/>
     </property_group>
     <property_group type="ipv4_interface" name="install_ipv4_net1">
       <propval type="net_address_v4" name="static_address" value="10.80.163.150/24"/>
       <propval type="astring" name="name" value="net1/v4"/>
       <propval type="astring" name="address_type" value="static"/>
     </property_group>
   </instance>
 </service>

In the above example, we use this capability in two different ways.  First, we assign an IPv4 address to each of net0 and net1, placing them on two different subnets.  This is probably the most common use case.  As long as you know the name of the interface where you want to assign an address, you can configure several network interfaces this way at system configuration time.  This removes the need to have a first-boot service and package just to configure the network, or to use Puppet or some other after-installation tool to manage the configuration.

Notice that we did not use the AI macros in this example.
First, there are no macros that would help define the second interface.  Second, in a case where you are configuring multiple interfaces, chances are that you will need much more precise control than the AI macros provide - different subnets, specific interfaces, etc.  This means that you will be more likely either to use installadm criteria to select a particular profile for a host, or to use a derived manifest to create the manifest and profile at install time.

For the IPv6 interface, we want to assign a global-scope static address.  There has to be a link-local-scope address in place before a global-scope address can be assigned.  As it turns out, ipadm does not create the link-local address when you create a static address.  So, we have to use addrconf, since it creates the link-local address on the interface.  Then, we can define a second address that is the global-scope static address we really wanted.

So, you can see that this small feature really does help.  It makes it a lot easier to create both VMs and bare-metal systems configured just the way you need, right from their initial installation.
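As a footnote, once you have a profile like this, it is applied the same way as any other SC profile.  Here is a minimal sketch; the service name, profile file name, zone name, and MAC criteria are purely illustrative.  For an AI client, attach the profile to the install service and select it with whatever criteria you already use:

# installadm create-profile -n s11-sparc -f /var/tmp/sc-multi-nic.xml -c mac="00:14:4F:1D:BC:0C"

For a zone, the same profile can be handed to the installer directly:

# zoneadm -z myzone install -c /var/tmp/sc-multi-nic.xml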


OpenStack

OpenStack Juno - Flat Networking and Fixed IP Addresses

OpenStack Juno was recently released as a part of Solaris 11.3 Beta. There have been a number of very good blogs around the new capabilities of OpenStack Juno and how they relate to Solaris 11.  Networking within OpenStack is one of the areas where a number of people struggle, since it can get pretty complicated pretty quickly.  I always rely on the blogs that Girish has done as my source of truth for OpenStack networking on Solaris.

OpenStack Neutron has thus far delivered networks based either on VLANs or VXLANs.  Neutron, using the Solaris Elastic Virtual Switch (EVS), creates new internal, non-routed networks over either VLANs or VXLANs, whichever is preferred.  These internal networks provide strong isolation between tenants so that traffic does not leak from one tenant to another.  They also provide the flexibility to create whatever kind of network infrastructure is required within a tenant and keep it private to that tenant.  Once a tenant wants packets to get out to the rest of the world, Neutron provides floating IP addresses for that purpose.

Just a quick word on nomenclature.  The address that Neutron assigns to a guest is its fixed address, even though it is usually assigned via DHCP.  This address is private to the tenant and may not even be routed beyond its internal subnet.  A floating IP address is an address assigned from a pool of externally accessible addresses that is mapped to a particular fixed address.  In Solaris, using the EVS, this is implemented with a bidirectional NAT.  The notion is that probably only a fraction of the guests really need to communicate with hosts outside their tenant, and even fewer of them require inbound connections and a specific IP address.  So the pool of floating IP addresses is typically far smaller than the pool of fixed IP addresses.

The idea of floating and fixed IP addresses, with VLANs or VXLANs to separate tenants, is quite powerful, especially in a multi-tenant cloud world.  However, many of the enterprise customers that I deal with have existing networking infrastructures, firewall rules external to the cloud, monitoring tools, and a host of network procedures that make this ad hoc, flexible networking somewhat of a challenge.

Flat Networks

With OpenStack Juno in Solaris 11.3 (or Solaris 11.2 SRU 10 and newer), flat networking is also available.  Flat networks, in the parlance of OpenStack and the EVS, are untagged networks - typically already-existing infrastructure - onto which you want to place OpenStack guests directly.  Only a single flat network per Neutron environment is supported at present.  But this provides a way to provision OpenStack guests directly onto an existing network infrastructure without requiring floating IP addresses and NAT to reach the outside world.

This short blog will show you how to configure a flat network and provision guests to it.  Additionally, it will show you how to take back control over IP addresses and MACs.  Typically, OpenStack relies on DHCP, but in many enterprise environments both MACs and IP addresses are tightly controlled for a variety of reasons.

Initial Configuration

We will start with the assumption that you already have a working OpenStack.  This might be a single-node cloud based on the Solaris 11 UAR, or one built from some of the other blogs and documents on OTN.  Either way, we presume that you already have OpenStack installed and configured.  We will work with the command line rather than through the Horizon BUI for this exercise.
Horizon is a nice, simple BUI, but it hardly exposes all of the capabilities of OpenStack without customization.  Through the command line, we can be much more precise in what we do.

The system I have used for these examples is a single-node OpenStack running within VirtualBox.  Expanding this to a multi-node configuration is a straightforward exercise (once you have a multi-node configuration established).

I have already created a keypair within OpenStack called root.  This is just the public key for the root user in the global zone where the cloud has been defined.  It could be any key that would allow me to log into my guests.

I have also imported a generic Unified Archive to use to launch new guests.  Since I am in VirtualBox, I can only use non-global zones, so I did not include the install media in this UAR.  I created a zone and created an archive of it.  I then used glance to import it.

Formatting note: For some reason, I cannot get Roller to deal well with the vertical bar characters that OpenStack uses in its tables.  In order to display the output, I have replaced all vertical bars with exclamation points to maintain the tables.  It doesn't change the contents, just a little bit of the display.

root@openstack:~# export OS_USERNAME=glance OS_PASSWORD=glance \
 OS_TENANT_NAME=service OS_AUTH_URL=http://localhost:5000/v2.0
root@openstack:~# glance image-list
+----+------+-------------+------------------+------+--------+
! ID ! Name ! Disk Format ! Container Format ! Size ! Status !
+----+------+-------------+------------------+------+--------+
+----+------+-------------+------------------+------+--------+
root@openstack:~# glance image-create --container-format bare --disk-format raw \
 --is-public true --name "basezone" --property architecture=x86_64 \
 --property hypervisor-type=solariszones \
 --property vm_mode=solariszones < /root/basezone.uar
+----------------------------+--------------------------------------+
! Property ! Value !
+----------------------------+--------------------------------------+
! Property 'architecture' ! x86_64 !
! Property 'hypervisor_type' ! solariszones !
! Property 'vm_mode' ! solariszones !
! checksum ! 6b4f8da111c756029026f590e4c45f75 !
! container_format ! bare !
! created_at ! 2015-07-16T15:39:04 !
! deleted ! False !
! deleted_at ! None !
! disk_format ! raw !
! id ! 7dea7644-7013-42d8-98b7-e778f13f1389 !
! is_public ! True !
! min_disk ! 0 !
! min_ram ! 0 !
! name ! basezone !
! owner ! 57c37eb8e904495f9b94ce84290ffcc9 !
! protected ! False !
! size ! 382597120 !
! status ! active !
! updated_at ! 2015-07-16T15:39:22 !
! virtual_size ! None !
+----------------------------+--------------------------------------+
root@openstack:~# glance image-list
+--------------------------------------+----------+-------------+------------------+-----------+--------+
! ID ! Name ! Disk Format ! Container Format ! Size ! Status !
+--------------------------------------+----------+-------------+------------------+-----------+--------+
! 7dea7644-7013-42d8-98b7-e778f13f1389 ! basezone ! raw ! bare ! 382597120 ! active !
+--------------------------------------+----------+-------------+------------------+-----------+--------+

Elastic Virtual Switch Configuration

Since we are starting with a working OpenStack, I will not go into all of the key creation and setup for the EVS.  But, using a pretty standard default configuration, you can see that I have set up my cloud so that it is using VLANs and is using an etherstub that I created as its uplink port.
This allows for clean creation of VLANs on top of that etherstub, regardless of switch configuration.  But none of that matters for the flat networking.

root@openstack:~# evsadm show-controlprop -o all
PROPERTY          PERM VALUE    DEFAULT  FLAT VLAN_RANGE VXLAN_RANGE HOST
l2-type           rw   vlan     vlan     --   --         --          --
uplink-port       rw   l3stub0  --       no   200-300    --          --
uri-template      rw   ssh://   ssh://   --   --         --          --
uuid              r-   33d487c6-29b2-11e5-b04f-ebdabd3733c8 -- -- -- -- --
vlan-range        rw   200-300  --       --   --         --          --
vlan-range-avail  r-   200-300  --       --   --         --          --
vxlan-addr        rw   0.0.0.0  0.0.0.0  --   --         --          --
vxlan-ipvers      rw   v4       v4       --   --         --          --
vxlan-mgroup      rw   0.0.0.0  0.0.0.0  --   --         --          --
vxlan-range       rw   --       --       --   --         --          --
vxlan-range-avail r-   --       --       --   --         --          --

Instead, we will configure net0 as an additional uplink port and tag it to use flat networking instead of VXLANs or VLANs.  Note that the EVS assumes that the same network interface is used for the same function on each node in the environment, unless you use the -h option when you set the control properties.  In that way, you could say that net0 might be the flat uplink port on one host, but net1 or aggr0 is the flat network uplink on some other.  Likewise for configuration of which VXLANs or VLANs are presented.

root@openstack:~# evsadm set-controlprop -p uplink-port=net0,flat=yes
root@openstack:~# evsadm show-controlprop -o all
PROPERTY          PERM VALUE    DEFAULT  FLAT VLAN_RANGE VXLAN_RANGE HOST
l2-type           rw   vlan     vlan     --   --         --          --
uplink-port       rw   l3stub0  --       no   200-300    --          --
uplink-port       rw   net0     --       yes  --         --          --
uri-template      rw   ssh://   ssh://   --   --         --          --
uuid              r-   33d487c6-29b2-11e5-b04f-ebdabd3733c8 -- -- -- -- --
vlan-range        rw   200-300  --       --   --         --          --
vlan-range-avail  r-   200-300  --       --   --         --          --
vxlan-addr        rw   0.0.0.0  0.0.0.0  --   --         --          --
vxlan-ipvers      rw   v4       v4       --   --         --          --
vxlan-mgroup      rw   0.0.0.0  0.0.0.0  --   --         --          --
vxlan-range       rw   --       --       --   --         --          --
vxlan-range-avail r-   --       --       --   --         --          --

Creating a Flat Network

Creating our flat network is pretty simple.  Just as with other networks, we configure a network and a subnet with neutron.  We will give an address range for dynamic address assignment via DHCP.  In this case, internal_network is an existing network created through Horizon in the standard way.  We will not use it for anything here.

root@openstack:~# export OS_USERNAME=admin OS_PASSWORD=secrete OS_TENANT_NAME=demo
root@openstack:~# neutron net-list
+--------------------------------------+------------------+------------------------------------------------------+
! id ! name ! subnets !
+--------------------------------------+------------------+------------------------------------------------------+
! cec86f8b-9753-4d63-b56c-cafc4ee9b88d ! internal_network ! d7928b14-83f9-4e1b-a9c3-86f817028d87 192.168.33.0/24 !
+--------------------------------------+------------------+------------------------------------------------------+
root@openstack:~# neutron net-create flat_network --router:external true \
 --provider:physical_network external --provider:network_type flat
Created a new network:
+-----------------------+--------------------------------------+
! Field ! Value !
+-----------------------+--------------------------------------+
! admin_state_up ! True !
! id ! dc81a1dc-6f2c-42d9-bbeb-a4bcb5771dcd !
! name ! flat_network !
! provider:network_type ! flat !
! router:external ! True !
! shared ! False !
! status ! ACTIVE !
! subnets ! !
! tenant_id ! 01059c84dd4f48f2b335c0ba70eab324 !
+-----------------------+--------------------------------------+
root@openstack:~# neutron subnet-create --name flat_subnet \
 --allocation_pool start=10.0.2.200,end=10.0.2.210 flat_network 10.0.2.0/24
Created a new subnet:
+-------------------+----------------------------------------------+
! Field ! Value !
+-------------------+----------------------------------------------+
! allocation_pools ! {"start": "10.0.2.200", "end": "10.0.2.210"} !
! cidr ! 10.0.2.0/24 !
! dns_nameservers ! !
! enable_dhcp ! True !
! gateway_ip ! 10.0.2.1 !
! host_routes ! !
! id ! c5286f85-b482-4fc8-9785-d6401d4c5178 !
! ip_version ! 4 !
! ipv6_address_mode ! !
! ipv6_ra_mode ! !
! name ! flat_subnet !
! network_id ! dc81a1dc-6f2c-42d9-bbeb-a4bcb5771dcd !
! tenant_id ! 01059c84dd4f48f2b335c0ba70eab324 !
+-------------------+----------------------------------------------+
root@openstack:~# neutron net-list
+--------------------------------------+------------------+------------------------------------------------------+
! id ! name ! subnets !
+--------------------------------------+------------------+------------------------------------------------------+
! cec86f8b-9753-4d63-b56c-cafc4ee9b88d ! internal_network ! d7928b14-83f9-4e1b-a9c3-86f817028d87 192.168.33.0/24 !
! dc81a1dc-6f2c-42d9-bbeb-a4bcb5771dcd ! flat_network ! c5286f85-b482-4fc8-9785-d6401d4c5178 10.0.2.0/24 !
+--------------------------------------+------------------+------------------------------------------------------+

At this point, you can use the regular Horizon tools to launch a guest on this network.  Select the flat network as the network for the guest, and Nova will do the rest.  It will insert whatever key you asked for, use DHCP within that address range to set an address, and create the guest.

However, a number of the enterprise customers I work with want more control than that.  They want to create a guest with a specific IP address and sometimes even with a specific MAC.  By default, OpenStack tends toward DHCP and dynamic assignment of MACs and IP addresses.  However, there are many cases where you might want more control - firewall rules, centralized DNS and address assignment, LDOM VNET MAC assignment, etc.  In this case, we will fall back to the command line to create our guests.

Before having Nova create the guests, we will create the network ports the way we want them with Neutron, and then have Nova use the ports we already created.  neutron port-create is used to create the port.  We can specify a MAC or an IP address or both.  In fact, there are a number of other parameters you can define with neutron port-create; see neutron help port-create for more details.

root@openstack:~# neutron port-create --mac-address 8:1:27:20:a0:1b \
 --fixed-ip subnet_id=c5286f85-b482-4fc8-9785-d6401d4c5178,ip_address=10.0.2.222 flat_network
Created a new port:
+----------------+-----------------------------------------------------------------------------------+
! Field ! Value !
+----------------+-----------------------------------------------------------------------------------+
! admin_state_up ! True !
! device_id ! !
! device_owner ! !
! fixed_ips ! {"subnet_id": "c5286f85-b482-4fc8-9785-d6401d4c5178", "ip_address": "10.0.2.222"} !
! id ! 79cd3b1f-9489-4036-9590-f5278cd3d598 !
! mac_address ! 8:1:27:20:a0:1b !
! name ! !
! network_id ! dc81a1dc-6f2c-42d9-bbeb-a4bcb5771dcd !
! status ! ACTIVE !
! tenant_id ! 01059c84dd4f48f2b335c0ba70eab324 !
+----------------+-----------------------------------------------------------------------------------+
And there we have a network port created with a particular MAC and a particular IP address.  Again, we don't have to specify both, but for this example, we will.  Now to use them.

Launching a Guest with a Specific Neutron Port

Just to get a picture of our default environment, here is the list of our flavors, images, and keypairs.  Pretty default.

root@openstack:~# nova flavor-list
+----+-----------------------------------------+-----------+------+-----------+------+-------+-------------+-----------+
! ID ! Name ! Memory_MB ! Disk ! Ephemeral ! Swap ! VCPUs ! RXTX_Factor ! Is_Public !
+----+-----------------------------------------+-----------+------+-----------+------+-------+-------------+-----------+
! 1 ! Oracle Solaris kernel zone - tiny ! 2048 ! 10 ! 0 ! ! 1 ! 1.0 ! True !
! 10 ! Oracle Solaris non-global zone - xlarge ! 16384 ! 80 ! 0 ! ! 32 ! 1.0 ! True !
! 2 ! Oracle Solaris kernel zone - small ! 4096 ! 20 ! 0 ! ! 4 ! 1.0 ! True !
! 3 ! Oracle Solaris kernel zone - medium ! 8192 ! 40 ! 0 ! ! 8 ! 1.0 ! True !
! 4 ! Oracle Solaris kernel zone - large ! 16384 ! 40 ! 0 ! ! 16 ! 1.0 ! True !
! 5 ! Oracle Solaris kernel zone - xlarge ! 32768 ! 80 ! 0 ! ! 32 ! 1.0 ! True !
! 6 ! Oracle Solaris non-global zone - tiny ! 2048 ! 10 ! 0 ! ! 1 ! 1.0 ! True !
! 7 ! Oracle Solaris non-global zone - small ! 3072 ! 20 ! 0 ! ! 4 ! 1.0 ! True !
! 8 ! Oracle Solaris non-global zone - medium ! 4096 ! 40 ! 0 ! ! 8 ! 1.0 ! True !
! 9 ! Oracle Solaris non-global zone - large ! 8192 ! 40 ! 0 ! ! 16 ! 1.0 ! True !
+----+-----------------------------------------+-----------+------+-----------+------+-------+-------------+-----------+
root@openstack:~# nova image-list
+--------------------------------------+----------+--------+--------+
! ID ! Name ! Status ! Server !
+--------------------------------------+----------+--------+--------+
! 7dea7644-7013-42d8-98b7-e778f13f1389 ! basezone ! ACTIVE ! !
+--------------------------------------+----------+--------+--------+
root@openstack:~# nova keypair-list
+------+-------------------------------------------------+
! Name ! Fingerprint !
+------+-------------------------------------------------+
! root ! 55:6c:ba:c5:bd:81:01:d1:cf:32:4c:cf:8d:c8:15:f5 !
+------+-------------------------------------------------+

To launch our guest, we use nova boot and specify the details of the guest we want to launch.  In this case, we will create a tiny NGZ, using the basezone image, the root keypair, and the neutron port we created earlier.

root@openstack:~# nova boot --flavor 6 --image basezone --key-name root \
 --nic port-id=79cd3b1f-9489-4036-9590-f5278cd3d598 flatz2
+--------------------------------------+-------------------------------------------------+
! Property ! Value !
+--------------------------------------+-------------------------------------------------+
! OS-DCF:diskConfig ! MANUAL !
! OS-EXT-AZ:availability_zone ! nova !
! OS-EXT-SRV-ATTR:host ! - !
! OS-EXT-SRV-ATTR:hypervisor_hostname ! - !
! OS-EXT-SRV-ATTR:instance_name ! instance-00000004 !
! OS-EXT-STS:power_state ! 0 !
! OS-EXT-STS:task_state ! scheduling !
! OS-EXT-STS:vm_state ! building !
! OS-SRV-USG:launched_at ! - !
! OS-SRV-USG:terminated_at ! - !
! accessIPv4 ! !
! accessIPv6 ! !
! adminPass ! gfQ6g3qXgqZf !
! config_drive ! !
! created ! 2015-07-17T15:47:38Z !
! flavor ! Oracle Solaris non-global zone - tiny (6) !
! hostId ! !
! id ! ec42f82f-0399-46cd-b75b-807a5dd888bc !
! image ! basezone (7dea7644-7013-42d8-98b7-e778f13f1389) !
! key_name ! root !
! metadata ! {} !
! name ! flatz2 !
! os-extended-volumes:volumes_attached ! [] !
! progress ! 0 !
! security_groups ! default !
! status ! BUILD !
! tenant_id ! 01059c84dd4f48f2b335c0ba70eab324 !
! updated ! 2015-07-17T15:47:38Z !
! user_id ! f2a3c09637d546c5a230991a4df9ca44 !
+--------------------------------------+-------------------------------------------------+

And that's about all there is to that.  Nova gets the guest built and uses the neutron network port we already created.  We are able to specify our own MAC and IP as we go.

Some colleagues of mine have done something similar using a first-boot service at install time to re-assign the DHCP IP address to one that they prefer.  Their goal was to avoid the command line and avoid modifying Horizon.  My customers are comfortable with command-line or API calls, since they plan to integrate this into an existing framework rather than using Horizon directly.  I think this is simpler than having to do surgery after the fact.  But requirements vary.

Just to prove this all works, we will log in.  Notice that our keypair lets us ssh in as root, and we have the IP address that we set.  Since this is an NGZ, it's hard to see the MAC quite as directly.  So, from the global zone of the Nova compute node hosting this guest, we can look at the VNIC used for this guest, and we see that it has the MAC we assigned.  Very cool!

root@openstack:~# ssh root@10.0.2.222
Last login: Fri Jul 17 15:53:35 2015 from host-10-0-2-201
Oracle Corporation      SunOS 5.11      11.3    June 2015
root@flatz2:~# ipadm
NAME              CLASS/TYPE STATE        UNDER      ADDR
lo0               loopback   ok           --         --
   lo0/v4         static     ok           --         127.0.0.1/8
   lo0/v6         static     ok           --         ::1/128
net0              ip         ok           --         --
   net0/dhcp      inherited  ok           --         10.0.2.222/24
root@flatz2:~# exit
logout
Connection to 10.0.2.222 closed.
root@openstack:~# dladm show-vnic -m
LINK                   OVER    SPEED MACADDRESSES      MACADDRTYPES IDS
dh197a6832_2d_0        l3stub0 40000 fa:16:3e:97:5d:c5 fixed        VID:200
dh001d8202_30_0        net0    1000  fa:16:3e:c:d3:4a  fixed        VID:0
instance-00000003/net0 net0    1000  fa:16:3e:3e:57:ed fixed        VID:0
instance-00000004/net0 net0    1000  8:1:27:20:a0:1b   fixed        VID:0
instance-00000009/net0 net0    1000  fa:16:3e:8e:50:ec fixed        VID:0

Conclusion

While standard Neutron networking using VXLANs and VLANs provides the secure, flexible, dynamic networking environment that a multi-tenant cloud requires, sometimes that's not what you need.  Many enterprises need a more controlled interface into legacy network infrastructures.  Using flat networking, OpenStack can bridge the gap between the legacy network environment and the dynamic, elastic world of the cloud.
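If you need to stamp out several guests this way, the port-create and boot steps are easy to wrap in a small script.  Here is a rough sketch, assuming the flat_subnet ID, flavor, image, and keypair from above; the guest names and addresses are made up, and the awk expression simply pulls the UUID out of the 'id' row of the port-create output table (which uses vertical bars in real life, not the exclamation points shown above).

#!/bin/ksh
# Create a Neutron port with a fixed IP, then boot a guest attached to it.
SUBNET=c5286f85-b482-4fc8-9785-d6401d4c5178     # flat_subnet ID from above

for pair in "flatz3:10.0.2.223" "flatz4:10.0.2.224"; do
    name=${pair%%:*}
    addr=${pair##*:}

    # The port UUID is in the 'id' row of the table neutron prints
    port=$(neutron port-create \
        --fixed-ip subnet_id=$SUBNET,ip_address=$addr \
        flat_network | awk '/ id /{print $4}')

    nova boot --flavor 6 --image basezone --key-name root \
        --nic port-id=$port $name
done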


Ops Center

Ops Center 12c - Update - Provisioning Solaris on x86 Using a Card-Based NIC

Last week, I posted a blog describing how to use Ops Center to provision Solaris over the network via a NIC on a card rather than the built-in NIC.  Really, that was all about how to install Solaris on a SPARC system.  This week, we'll look at how to do the same thing for an x86-based server.

The overall process is exactly the same, at least for Solaris 11, with only minor updates.  We will focus on Solaris 11 for this blog.  Once I verify that the same approach works for Solaris 10, I will provide another update.

Booting Solaris 11 on x86

Just as before, in order to configure the server for network boot across a card-based NIC, it is necessary to declare the asset to associate the additional MACs with the server.  You likely will need to access the server console via the ILOM to figure out the MAC and to get a good idea of the network instance number.  The simplest way to find both of these is to start a network boot using the desired NIC and see where it appears in the list of network interfaces and what MAC is used when it tries to boot.

Go to the ILOM for the server.  Reset the server and start the console.  When the BIOS loads, select the boot menu, usually with Ctrl-P.  This will give you a menu of devices to boot from, including all of the NICs.  Select the NIC you want to boot from.  Its position in the list is a good indication of what network number Solaris will give the device.  In this case, we want to boot from the 5th interface (GB_4, net4).  Pick it and start the boot process.  When it starts to boot, you will see the MAC address for the interface.

Once you have the network instance and the MAC, go through the same process of declaring the asset as in the SPARC case.  This associates the additional network interface with the server.

Creating an OS Provisioning Plan

The simplest way to boot via an alternate interface on an x86 system is to do a manual boot.  Update the OS Provisioning profile as in the SPARC case to reflect the fact that we are booting from a different interface.  In this case, update the network boot device to be GB_4/net4, or the device corresponding to your network instance number.  Then configure the profile to support manual network boot by checking the box for manual boot in the OS Provisioning profile.

Booting the System

Once you have created a profile and plan to support booting from the additional NIC, we are ready to install the server.  Again, from the ILOM, reset the system and start the console.  When the BIOS loads, select boot from the Boot Menu as above.  Select the network interface from the list as before and start the boot process.  When the GRUB bootloader loads, the default boot image is the Solaris Text Installer.  On the GRUB menu, select Automated Installer, and Ops Center takes over from there.

Lessons

The key lesson from all of this is that Ops Center is a valuable tool for provisioning servers, whether they are connected via built-in network interfaces or via high-speed NICs on cards.  This is great news for modern datacenters using converged network infrastructures.  The process works for both SPARC and x86 Solaris installations.  And it's easy and repeatable.


Ops Center

Ops Center 12c - Provisioning Solaris Using a Card-Based NIC

It's been a long time since I last added something here, but after some conversations this past week, I got inspired to update things. I've been spending a lot of time with Ops Center for managing and installing systems these days, so I suspect a number of my upcoming posts will be in that area.

Today, I want to look at how to provision Solaris using Ops Center when your network is not connected to one of the built-in NICs.  We'll talk about how this can work for both Solaris 10 and Solaris 11, since they are pretty similar.  In both cases, WANBoot is a key piece of the story.

Here's what I want to do: I have a Sun Fire T2000 server with a Quad-GbE nxge card installed.  The only network is connected to port 2 on that card rather than the built-in network interfaces.  I want to install Solaris on it across the network, either Solaris 10 or Solaris 11.  I have met with a lot of customers lately who have a similar architecture.  Usually, they have T4-4 servers with the network connected via 10GbE connections.  Add to this mix the fact that I use Ops Center to manage the systems in my lab, so I really would like to add this to Ops Center.  If possible, I would like this to be completely hands free.  I can't quite do that yet.  Close, but not quite.

WANBoot or Old-Style NetBoot?

When a system is installed from the network, it needs some help getting the process rolling.  It has to figure out what its network configuration (IP address, gateway, etc.) ought to be.  It needs to figure out what server is going to help it boot and install, and it needs the instructions for the installation.  There are two different ways to bootstrap an installation of Solaris on SPARC across the network.  The old way uses a broadcast of RARP or, more recently, DHCP to obtain the IP configuration and the rest of the information needed.  The second is to explicitly configure this information in the OBP and use WANBoot for the installation.

WANBoot has a number of benefits over broadcast-based installation: it is not restricted to a single subnet; it does not require special DHCP configuration or DHCP helpers; and it uses standard HTTP and HTTPS protocols, which traverse firewalls much more easily than NFS-based package installation.  But WANBoot is not available on really old hardware, and WANBoot requires the use of Flash Archives in Solaris 10.  Still, for many people, this is a great approach.  As it turns out, WANBoot is necessary if you plan to install using a NIC on a card rather than a built-in NIC.

Identifying Which Network Interface to Use

One of the trickiest aspects of this process, and the one that actually requires manual intervention to set up, is identifying how the OBP and Solaris refer to the NIC that we want to use to boot.  The OBP already has device aliases configured for the built-in NICs called net, net0, net1, net2, net3.  The device alias net typically points to net0, so that when you issue the command "boot net -v install", it uses net0 for the boot.  Our task is to figure out the network instance for the NIC we want to use.

We will need to get to the OBP console of the system we want to install in order to figure out what the network should be called.  I will presume you know how to get to the ok prompt.  Once there, we have to see what networks the OBP sees and identify which one is associated with our NIC using the OBP command show-nets.

SunOS Release 5.11 Version 11.0 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates.
All rights reserved.
{4} ok banner
Sun Fire T200, No Keyboard
Copyright (c) 1998, 2010, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.30.4.b, 32640 MB memory available, Serial #69057548.
Ethernet address 0:14:4f:1d:bc:c, Host ID: 841dbc0c.
{4} ok show-nets
a) /pci@7c0/pci@0/pci@2/network@0,1
b) /pci@7c0/pci@0/pci@2/network@0
c) /pci@780/pci@0/pci@8/network@0,3
d) /pci@780/pci@0/pci@8/network@0,2
e) /pci@780/pci@0/pci@8/network@0,1
f) /pci@780/pci@0/pci@8/network@0
g) /pci@780/pci@0/pci@1/network@0,1
h) /pci@780/pci@0/pci@1/network@0
q) NO SELECTION
Enter Selection, q to quit: d
/pci@780/pci@0/pci@8/network@0,2 has been selected.
Type ^Y ( Control-Y ) to insert it in the command line.
e.g. ok nvalias mydev ^Y
     for creating devalias mydev for /pci@780/pci@0/pci@8/network@0,2
{4} ok devalias
...
net3                     /pci@7c0/pci@0/pci@2/network@0,1
net2                     /pci@7c0/pci@0/pci@2/network@0
net1                     /pci@780/pci@0/pci@1/network@0,1
net0                     /pci@780/pci@0/pci@1/network@0
net                      /pci@780/pci@0/pci@1/network@0
...
name                     aliases

By looking at the devalias and the show-nets output, we can see that our Quad-GbE card must be the device nodes starting with /pci@780/pci@0/pci@8/network@0.  The cable for our network is plugged into the 3rd slot, so the device address for our network must be /pci@780/pci@0/pci@8/network@0,2.

With that, we can create a device alias for our network interface.  Naming the device alias may take a little bit of trial and error, especially in Solaris 11, where the device alias seems to matter more with the new virtualized network stack.  So far in my testing, since this is the "next" network interface to be used, I have found success in naming it net4, even though it's a NIC in the middle of a card that might, by rights, be called net6 (assuming the 0th interface on the card is the next interface identified by Solaris and this is the 3rd interface on the card).  So, we will call it net4.  We need to assign a device alias to it:

{4} ok nvalias net4 /pci@780/pci@0/pci@8/network@0,2
{4} ok devalias
net4                     /pci@780/pci@0/pci@8/network@0,2
...

We also may need the MAC for this particular interface, so let's get it, too.  To do this, we go to the device and interrogate its properties.

{4} ok cd /pci@780/pci@0/pci@8/network@0,2
{4} ok .properties
assigned-addresses       82060210 00000000 03000000 00000000 01000000
                         82060218 00000000 00320000 00000000 00008000
                         82060220 00000000 00328000 00000000 00008000
                         82060230 00000000 00600000 00000000 00100000
local-mac-address        00 21 28 20 42 92
phy-type                 mif
...

From this, we can see that the MAC for this interface is 00:21:28:20:42:92.  We will need this later.  This is all we need to do at the OBP.  Now, we can configure Ops Center to use this interface.

Network Boot in Solaris 10

Solaris 10 turns out to be a little simpler than Solaris 11 for this sort of network boot, since WANBoot in Solaris 10 simply fetches a specified Flash Archive.  In order to install the system using Ops Center, it is necessary to create an OS Provisioning profile and its corresponding plan.  I am going to presume that you already know how to do this within Ops Center 12c, and I will just cover the differences between a regular profile and a profile that can use an alternate interface.

Create an OS Provisioning profile for Solaris 10 as usual.  However, when you specify the network resources for the primary network, click on the name of the NIC, probably GB_0, and rename it to GB_N/netN, where N is the instance number you used previously in creating the device alias.  This is where the trial and error may come into play.
You may need to try a few instance numbers before you, the OBP, and Solaris all agree on the instance number.  Mark this as the boot network.

For Solaris 10, you ought to be able to then apply the OS Provisioning profile to the server, and it should install using that interface.  And if you put your cards in the same slots and plug the networks into the same NICs, this profile is reusable across multiple servers.

Why This Works

If you watch the console as Solaris boots during the OSP process, Ops Center is going to look for the device alias netN.  Since WANBoot requires a device alias called just net, Ops Center uses the value of your netN device alias and assigns that device to the net alias.  That means that boot net will automatically use this device.  Very cool!  Here's a trace from the console as Ops Center provisions a server:

Sun Fire T200, No Keyboard
Copyright (c) 1998, 2010, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.30.4.b, 32640 MB memory available, Serial #69057548.
Ethernet address 0:14:4f:1d:bc:c, Host ID: 841dbc0c.

auto-boot? =            false
{0} ok
{0} ok printenv network-boot-arguments
network-boot-arguments =  host-ip=10.140.204.234,router-ip=10.140.204.1,subnet-mask=255.255.254.0,hostname=atl-sewr-52,client-id=0100144F1DBC0C,file=http://10.140.204.22:5555/cgi-bin/wanboot-cgi
{0} ok
{0} ok devalias net
net                      /pci@780/pci@0/pci@1/network@0
{0} ok devalias net4
net4                     /pci@780/pci@0/pci@8/network@0,2
{0} ok devalias net /pci@780/pci@0/pci@8/network@0,2
{0} ok setenv network-boot-arguments host-ip=10.140.204.234,router-ip=10.140.204.1,subnet-mask=255.255.254.0,hostname=atl-sewr-52,client-id=0100144F1DBC0C,file=http://10.140.204.22:8004/cgi-bin/wanboot-cgi
network-boot-arguments =  host-ip=10.140.204.234,router-ip=10.140.204.1,subnet-mask=255.255.254.0,hostname=atl-sewr-52,client-id=0100144F1DBC0C,file=http://10.140.204.22:8004/cgi-bin/wanboot-cgi
{0} ok
{0} ok boot net - install
Boot device: /pci@780/pci@0/pci@8/network@0,2  File and args: - install
/pci@780/pci@0/pci@8/network@0,2: 1000 Mbps link up
<time unavailable> wanboot info: WAN boot messages->console
<time unavailable> wanboot info: configuring /pci@780/pci@0/pci@8/network@0,2

See what happened?  Ops Center looked for the network device alias called net4 that we specified in the profile, took the value from it, and made it the net device alias for the boot.  Pretty cool!

WANBoot and Solaris 11

Solaris 11 requires an additional step, since the Automated Installer in Solaris 11 uses the MAC address of the network interface to figure out which manifest to use for system installation.  In order to make sure this is available, we have to take an extra step to associate the MAC of the NIC on the card with the host.  So, in addition to creating the device alias like we did above, we also have to declare to Ops Center that the host has this new MAC.

Declaring the NIC

Start out by discovering the hardware as usual.  Once you have discovered it, take a look under the Connectivity tab to see what networks it has discovered.  In the case of this system, it shows the 4 built-in networks, but not the networks on the additional cards.  These are not directly visible to the system controller.  In order to add the additional network interface to the hardware asset, it is necessary to declare it.  We will declare that we have a server with this additional NIC, but we will also specify the existing GB_0 network so that Ops Center can associate the right resources together.
The GB_0 acts as sort of a key to tie our new declaration to the old system already discovered.  Go to the Assets tab, select All Assets, and then in the Actions tab, select Add Asset.  Rather than going through a discovery this time, we will manually declare a new asset.  When we declare it, we will give the hostname, IP address, and system model that match those that have already been discovered.  Then, we will declare both GB_0 with its existing MAC and the new GB_4 with its MAC.  Remember that we collected the MAC for GB_4 when we created its device alias.

After you declare the asset, you will see the new NIC in the Connectivity tab for the asset.  You will notice that only the NICs you listed when you declared it are seen now.  If you want Ops Center to see all of the existing NICs as well as the additional one, declare them as well.  Add the other GB_1, GB_2, GB_3 links and their MACs just as you did GB_0 and GB_4.

Installing the OS

Once you have declared the asset, you can create an OS Provisioning profile for Solaris 11 in the same way that you did for Solaris 10.  The only difference from any other provisioning profile you might have created already is the network to use for installation.  Again, use GB_N/netN, where N is the interface number you used for your device alias and in your declaration.  And away you go.  When the system boots from the network, the Automated Installer (AI) is able to see which system manifest to use, based on the new MAC that was associated, and the system gets installed.

{0} ok
{0} ok printenv network-boot-arguments
network-boot-arguments =  host-ip=10.140.204.234,router-ip=10.140.204.1,subnet-mask=255.255.254.0,hostname=atl-sewr-52,client-id=01002128204292,file=http://10.140.204.22:5555/cgi-bin/wanboot-cgi
{0} ok
{0} ok devalias net
net                      /pci@780/pci@0/pci@1/network@0
{0} ok devalias net4
net4                     /pci@780/pci@0/pci@8/network@0,2
{0} ok devalias net /pci@780/pci@0/pci@8/network@0,2
{0} ok setenv network-boot-arguments host-ip=10.140.204.234,router-ip=10.140.204.1,subnet-mask=255.255.254.0,hostname=atl-sewr-52,client-id=01002128204292,file=http://10.140.204.22:5555/cgi-bin/wanboot-cgi
network-boot-arguments =  host-ip=10.140.204.234,router-ip=10.140.204.1,subnet-mask=255.255.254.0,hostname=atl-sewr-52,client-id=01002128204292,file=http://10.140.204.22:5555/cgi-bin/wanboot-cgi
{0} ok
{0} ok boot net - install
Boot device: /pci@780/pci@0/pci@8/network@0,2  File and args: - install
/pci@780/pci@0/pci@8/network@0,2: 1000 Mbps link up
<time unavailable> wanboot info: WAN boot messages->console
<time unavailable> wanboot info: configuring /pci@780/pci@0/pci@8/network@0,2
...
SunOS Release 5.11 Version 11.0 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
Remounting root read/write
Probing for device nodes ...
Preparing network image for use
Downloading solaris.zlib
--2012-02-17 15:10:17--  http://10.140.204.22:5555/var/js/AI/sparc//solaris.zlib
Connecting to 10.140.204.22:5555... connected.
HTTP request sent, awaiting response... 200 OK
Length: 126752256 (121M) [text/plain]
Saving to: `/tmp/solaris.zlib'

100%[======================================>] 126,752,256 28.6M/s   in 4.4s

2012-02-17 15:10:21 (27.3 MB/s) - `/tmp/solaris.zlib' saved [126752256/126752256]

Conclusion

So, why go to all of this trouble?  More and more, I find that customers are wiring their data centers to use only higher speed networks - 10GbE only to the hosts.
Some customers are moving aggressively toward consolidated networks, combining storage and data traffic on converged network adapters (CNAs).  All of this means that network-based provisioning cannot rely exclusively on the built-in network interfaces.  So, it's important to be able to provision a system using networks other than the built-in ones.  It turns out that this is pretty straightforward for both Solaris 10 and Solaris 11 and fits into the Ops Center deployment process quite nicely. Hopefully, you will be able to use this as you build out your own private cloud solutions with Ops Center.
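For reference, the OBP side of the device-alias step described earlier boils down to just a couple of commands at the ok prompt.  This is only a sketch: the alias name net4 and the device path are taken from the example trace above and will be different for your card and slot.

{0} ok show-nets
{0} ok nvalias net4 /pci@780/pci@0/pci@8/network@0,2
{0} ok devalias net4

show-nets lists the network devices the OBP can see, which is a handy way to find the path for the add-in card.  Using nvalias rather than devalias stores the alias in NVRAM, so it survives a reset and is still there when Ops Center goes looking for netN during provisioning.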


Everything Else

Oracle OpenWorld Session - Case Study: High-Performance Cloud Datastore Using Oracle Solaris 11 and Oracle Solaris ZFS

How's that for a long title? Just a note that the slides are now available from our ZFS in the cloud session at OpenWorld.  Tom Shafron, CEO of Viewbiquity, and I presented on the new features in ZFS and how Viewbiquity is using them to provide a cloud-based data storage system. From our abstract: Oracle Solaris ZFS, a key feature of Oracle Solaris, integrates the concepts of a file system and volume management capabilities to deliver simple, fast data management. This session provides a case study of Viewbiquity, a provider of market-leading, innovative M2M platforms. Its cloud-based platform integrates command and control, video, VoIP, data logging and management, asset tracking, automated responses, and advanced process automation. Viewbiquity relies on Oracle Solaris ZFS and Oracle Solaris for fast write with integrated hybrid traditional and solid-state storage capabilities, snapshots for backups, built-in deduplication, and compression for data storage efficiency. Learn what Oracle Solaris ZFS is and how you can deploy it in high-performance, high-availability environments. Viewbiquity is doing some cool stuff with ZFS, so take a look at the slides (found here) to see how you, too, can use ZFS in some really cool ways.


Solaris

How much does Live Upgrade patching save you? Lots!

For a long time, I have advocated that Solaris users adopt ZFS for root, storing the operating system in ZFS.  I've also strongly advocated for using Live Upgrade as a patching tool in this case.  The benefits are intuitive and striking, but are they actual and quantifiable?

Background

You can find a number of bloggers on BOC talking about the hows and whys of ZFS root.  Suffice it to say that ZFS has a number of great qualities that make management of the operating system simpler, especially when combined with other tools like Live Upgrade.  ZFS allows for the immediate creation of as many snapshots as you might want, simply by preserving the view of the filesystem meta-data and taking advantage of the fact that all writes in ZFS use copy-on-write, completely writing the new data before releasing the old.  This gives us snapshots essentially for free.

Like chocolate and peanut butter, ZFS and Live Upgrade are two great tastes that taste great together.  Live Upgrade traditionally was used just to upgrade systems from one release of Solaris (or update release) to another.  However, in Solaris 10, it becomes a tremendous tool for patching.  With Live Upgrade (LU), the operating system is replicated in an Alternate Boot Environment (ABE) and all of the changes (patches, upgrades, whatever) are done to the copy of the OS while the OS is running, rather than taking the system down to apply maintenance.  Then, when the time is right, during a maintenance window, the new changes are activated by rebooting using the ABE.  With this approach, downtime is minimized since changes are applied while the system is running.  Moreover, there is a fall-back procedure, since the original boot environment is still there.  Rebooting again into the original environment effectively cancels out the changes exactly.

The Problem with Patching

Patching, generally speaking, is something that everyone knows they need to do, but few are really happy with how they do it.  It's not much fun.  It takes time.  It keeps coming around like clockwork.  You have to do it to every system.  You have to work around the schedules of the applications on the system.  But, it needs to be done.  Sort of like mowing the grass every week in the summer.

Live Upgrade can take a lot of the pain out of patching, since the actual application of the patches no longer has to be done while the system is shut down.  A typical non-LU approach for patches is to shut the system down to single-user prior to applying the patches.  In this way, you are certain that nothing else is going on on the system and you can change anything that might need to be updated safely.  But, the application is also down during this entire period.  And that is the crux of the problem.  Patching takes too long; we expect systems to be always available these days.

How long does patching take?  That all depends.  It depends on the number of changes and patches being applied to the system.  If you have not patched in a very long time, then a large number of patches are required to bring the system current.  The more patches you apply, the longer it takes. It depends on the complexity of the system.  If, for example, there are Solaris zones on the system for virtualization, patches applied to the system are automatically applied to each of the zones as well.  This takes extra time.  If patches are being applied to a shut-down system, that just extends the outage.
It's hard to get outage windows in any case, and long outage windows are especially hard to schedule when the perceived benefit is small.  Patches are like a flu shot.  They can vaccinate you against problems that have been found, but they won't help if this year has a new strain of flu that we've not seen before.  So, long outages across lots of systems are hard to justify.

So, How Long Does It Really Take?

I have long heard people talk about how patching takes too long, but I've not measured it in some time.  So, I decided to do a bit of an experiment.  Using a couple of different systems, neither one very fast or very new, I applied the Solaris 10 Recommended patch set from mid-July 2011.  I applied the patches to systems running different update releases of Solaris 10.  This gives different numbers of patches that have to be applied to bring the system current.  As far as procedure goes, for each test, I shut the system down to single-user (init S), applied the patches, and rebooted.  The times listed are just the time for the patching, although the actual maintenance window in real life would include time to shut down, time to reboot, and time to validate system operation.  The two systems I used for my tests were an X4100 server with 2 dual-core Opteron processors and 16GB of memory and a Sun Fire V480 server with 4 UltraSPARC III+ processors.  Clearly, these are not new systems, but they will show what we need to see.

System   Operating System    Patches Applied   Elapsed Time (hh:mm:ss)
X4100    Solaris 10 9/10     105               00:17:00
X4100    Solaris 10 10/09    166               00:26:00
X4100    Solaris 10 10/08    216               00:36:06
V480     Solaris 10 9/10     99                00:47:29

For each of these tests, the server is installed with root on ZFS and patches are applied from the Recommended Patchset via the command "./installpatch -d --<pw>" for whatever passcode this patchset has.  All of this is done while the system is in single-user rather than running multi-user. It appears that clock speed is important when applying patches.  The older V480 took three times as long as the X4100 for the same patchset.

And this is the crux of the problem.  Even applying patches to a pretty current system requires an extended outage.  This does not even take into account the time required for internal validation of the work, reboot time, application restart time, etc.  How can we make this better?  Let's make it worse first.

More Complicated Systems Take Longer to Patch

Nearly a quarter of all production systems running Solaris 10 are deployed using Solaris Zones.  Many more non-production systems may also use zones.  Zones let me consolidate administrative overhead: I only have to patch the global zone rather than each virtualized environment.  But, when applying patches to the global zone, patches are automatically applied to each zone in turn.  So, the time to patch a system can be significantly increased by having multiple zones.  Let's first see how much longer this might take, and then we will show two solutions.

System   Operating System    Number of Zones   Patches Applied   Elapsed Time (hh:mm:ss)
X4100    Solaris 10 9/10     2                 105               00:46:51
X4100    Solaris 10 9/10     20                105               03:03:59
X4100    Solaris 10 10/09    2                 166               01:17:17
X4100    Solaris 10 10/08    2                 216               01:37:17
V480     Solaris 10 9/10     2                 99                01:53:59

Again, all of these patches were applied to systems in single-user in the same way as the previous set.  Just having two (sparse-root) zones defined took nearly three times as long as just the global zone alone.
Having 20 zones installed took the patch time from 17 minutes to over three hours for even the smallest tested patchset.

How Can We Improve This?  Live Upgrade is Your Friend

There are two main ways that this patch time can be improved.  One applies to systems with or without zones, while the second improves on the first for systems with zones installed. I mentioned before that Live Upgrade is very much your friend.  Rather than go into all the details of LU, I would refer you to the many other blogs and documents on LU.  Check out especially Bob Netherton's Blog for lots of LU articles.

When we use LU, rather than taking the system down to single-user, we are able to create a new alternate boot environment, using ZFS snapshot and clone capability, while the system is up, running in production.  Then, we apply the patches to that new boot environment, still using the installpatchset command.  For example, "./installpatchset -d -B NewABE --<pw>" applies the patches into NewABE rather than the current boot environment.  When we use this approach, the patch times that we saw before don't change very much, since the same work is being done.  However, all of this is now time during which the system stays in service.  The outage is only the time required to reboot into the new boot environment.

So, Live Upgrade saves us all of that outage time.  Customers who have older servers and are fairly out of date on patches say that applying a patch bundle can take more than four or five hours, an outage window that is completely unworkable.  With Live Upgrade, the outage is reduced to the time for a reboot, scheduled when it can be most convenient.

Live Upgrade Plus Parallel Patching

Recently, another enhancement was made to patching so that multiple zones are patched in parallel.  Check out Jeff Victor's blog where he explains how this all works.  As it turns out, this parallel patching works whether you are patching zones in single-user or via Live Upgrade.  So, just to get an idea of how this might help, I did some simple measurements with 2 and 20 sparse-root zones created on a system running Solaris 10 9/10.

System   Operating System    Number of Zones   Patches Applied   num_procs   Elapsed Time (hh:mm:ss)
X4100    Solaris 10 9/10     2                 105               1           00:46:51
X4100    Solaris 10 9/10     2                 105               2           00:36:04
X4100    Solaris 10 9/10     20                105               1           03:03:59
X4100    Solaris 10 9/10     20                105               2           01:55:58
X4100    Solaris 10 9/10     20                105               4           01:25:53

num_procs is used as a guide for the number of threads to be engaged in parallel patching.  Jeff Victor's blog (above) and the man page for pdo.conf talk about how this relates to the actual number of processes that are used for patching. With only two zones, doubling the number of threads has an effect, but not a huge one, since the amount of parallelism is limited.  However, with 20 zones on a system, boosting the number of zones patched in parallel can significantly reduce the time taken for patching.

Recall that all of this is done within the application of patches with Live Upgrade.  Used alone, outside of Live Upgrade, parallel patching can help reduce the time required to patch a system during a maintenance window.  Used with Live Upgrade, it reduces the time required to apply patches to the alternate boot environment.

So, what should you do to speed up patching and reduce the outage required for patching?  Use ZFS root and Live Upgrade so that you can apply your patches to an alternate boot environment while the system is up and running.
Then, use parallel patching to reduce the time required to apply the patches to the alternate boot environment where you have zones deployed.
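Putting these pieces together, the basic flow looks roughly like the sketch below.  This assumes a ZFS root, the Recommended Patchset unpacked in the current directory, and parallel patching already configured; the boot environment name NewABE and the <pw> passcode are just placeholders.

# lucreate -n NewABE                        # clone the running boot environment (ZFS snapshot/clone, no outage)
# ./installpatchset -d -B NewABE --<pw>     # apply the Recommended Patchset to the inactive BE
# luactivate NewABE                         # mark the patched BE as the one to boot next
# init 6                                    # reboot during the maintenance window
# lustatus                                  # verify which BE is active; activating the old BE again is the fall-back

The only downtime in this sequence is the final reboot, which is exactly the point of the whole exercise.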


Solaris

Finish Scripts and First-Boot Services with Bootable AI

In my last couple of blogs, I have talked about using the Automated Installer, specifically using Bootable AI.  We talked about creating a manifest to guide the installation and we talked about creating a system configuration manifest to configure the system identity.  The piece that has not yet been addressed, from the perspective of a long-time Jumpstart user, is how to do the final customization via finish scripts.In general, there is no notion of an install script that is bundled into a package when using the Image Package System in Solaris 11 Express as there is with SVR4 packages from Solaris 10.  However, a package can install a service that can carry out the same function.  Its operation is a bit more controlled since it has to have its dependencies satisfied and can run in a reduced privilege environment if needed.  Additionally, many of the actions typically scripted into installation scripts, such as creation of users, groups, links, directories, are all built-in actions of the packaging system.So, the question arises of how to use the IPS packaging system to add our own packages to a system, whether at installation time or later, and how to perform the necessary first-boot customizations to a system we are installing.  The requirement to create our own packages comes from the fact that there is no other way to deliver content to the system being installed during the installation process except through the AI manifest - and that means IPS packages.  In Jumpstart, there was a variable set at installation that pointed to a particular NFS-mounted directory where install scripts could reside.  This was all well and good so long as you could mount that directory.  When it was not available, you were left again with the notion of creating and delivering SVR4 packages via the Jumpstart profile.  So, the situation is not far different than in Solaris 10 and earlier.  There's just a little different syntax and a different network protocol in use to deliver the payload.Why Use a Local RepositoryThere are two main reasons to create a local repository for use by AI and IPS.  First, you might choose to replicate the Solaris repository rather than make a home-run through your corporate network firewall to Oracle for every package installation on every server.  Performance and control are clearly going to be better and more deterministic by locating the data closer to where you plan to use it.  The second reason to create a local repository would be to host your own locally provided packages - whether developed locally or provided by an ISV.The question then arises whether to combine both of these into the same repository.  My personal opinion is that it is better to keep them separate.  Just as it is a good practice to keep the binaries and files of the OS separate from those locally created or provided by applications on the disk, it seems a good idea to keep the repositories separate.  That does not mean that multiple systems are needed to host the data, however.  Multiple repository services can be hosted on the same system on different network ports, pointing to different directories on the disk.  Think of it like hosting multiple web sites with different ports and different htdocs directories.Rather than go through all the details of creating a local mirror of the Solaris repository, I will refer you to Brian Leonard's blog on that topic.  Here, he talks about creating a local mirror from the repository ISO.  
He also shows how you can create a file-based repository for local use.

For much of the rest of this exercise, I am relying on an article in the OpenSolaris Migration Hub that talks about creating a First Boot Service, which in turn references a page on creating a local package repository.  It is well worth it to read through these two pages.

Setting Up A Local Repository

So far, we have avoided the use of any sort of install server.  However, to have a local repository available during installation, this becomes a necessity.  So, pick a server to be used as a package repository.

Rather than clone the Solaris repository, we will create a new, empty repository to fill with our own packages.  We will configure the necessary SMF properties to enable the repository.  And then we will fill it with the packages that we need to deploy.

On the host that will host the repository, select a location in its filesystem and set it aside for this function.  A good practice would be to create a ZFS filesystem for the repository.  In this case, you can enable compression on the repository and easily control its size through quotas and reservations.  In this example, we will just create a ZFS filesystem within the root pool.  Often you will have a separate pool for this sort of function.

# zfs create -p -o mountpoint=/var/repos/mycompany.com/repo -o compression=on rpool/repos/mycompany.com/repo

Next, we will need to set the SMF properties to support the repository.  The service application/pkg/server is responsible for managing the actual package depot process.  As such, it refers to properties to locate the repository on disk, establish what port to use, etc.

The property pkg/inst_root specifies where on the repository server's disk the repo resides.

# svccfg -s application/pkg/server setprop pkg/inst_root=/rpool/repo0906/repo

pkg/readonly specifies whether or not the repository can be updated.  Typically, for a cloned Solaris repository, this will be set to true.  It also is a good practice to set it to true when it should not be updated.

# svccfg -s application/pkg/server setprop pkg/readonly=true

pkg/prefix specifies the name that the repository will take when specified as a publisher by a client system.  pkg/port specifies the port where the repository will answer.

# svccfg -s application/pkg/server setprop pkg/prefix=local-pkgs
# svccfg -s application/pkg/server setprop pkg/port=9000

Once the properties are set, refresh and enable the service.

# svcadm refresh application/pkg/server
# svcadm enable application/pkg/server

Creating a First-Boot Package

Now that the repository has been created, we need to create a package to go into that repository.  Since there are no post-install scripts with the Image Packaging System, we will create an SMF service that will be automatically enabled so that it will run when the system boots.  One technique used with Jumpstart was to install a script into /etc/rc3.d that would run late in the boot sequence and would then remove itself so that it would only run on the first boot.  We will take a similar path with our first-boot service.  We will have it disable itself so that it doesn't continue to run on each boot.

There are two parts to creating this simple package.  First, we have to create the manifest for the service, and second, we have to create the script that will be used as the start method within the service.  This area is covered in more depth in the OpenSolaris Migration Hub paper on Creating a First Boot Service.
In fact, we will use the manifest from that paper.Creating the ManifestWe will use the manifest from Creating a First Boot Service directly and call it finish-script-manifest.xml.  The main points to see here are the service is enabled automatically when we import the manifest the service is dependent on svc:/milestone/multi-user so that it won't run until the system is in the Solaris 11 Express equivalent of run level 3. the script /usr/bin/finish-script.sh, which we will provide, is going to be run as the start method when the service begins. finish-script-manifest.xml: <?xml version="1.0"?><!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1"><service_bundle type='manifest' name='Finish:finish-script'><service    name='system/finish-script'    type='service'    version='1'>   <create_default_instance enabled='true' />    <single_instance />    <dependency name='autofs' grouping='require_all' restart_on='none' type='service'>        <service_fmri value='svc:/system/filesystem/autofs:default' />    </dependency> <dependency name='multi-user' grouping='require_all' restart_on='none' type='service'>      <service_fmri value='svc:/milestone/multi-user:default' />   </dependency> <exec_method        type='method'        name='start'        exec='/usr/bin/finish-script.sh'        timeout_seconds='0'>    </exec_method>   <exec_method        type='method'        name='stop'        exec=':true'        timeout_seconds='0'>    </exec_method>        <property_group name='startd' type='framework'>                <propval name='duration' type='astring' value='transient' />        </property_group></service></service_bundle> Creating the ScriptIn this example, we will create a trivial finish script.  All it will do is log that it has run and then disable itself.  You could go so far as to have the finish script uninstall itself.  However, rather than do that we will just disable the service.  Certainly, you could have a much more expansive finish script, with multiple files and multiple functions.  Our script is short and simple: finish-script.sh: #!/usr/bin/bashsvcadm disable finish-script#pkg uninstall pkg:/finishecho "Completed Finish Script" > /var/tmp/finish_log.$$exit 0 Adding Packages to RepositoryNow that we have created the finish script and the manifest for the first-boot service, we have to insert these into the package repository that we created earlier.  Take a look at the pkgsend man page for a lot more details about how all of this works.  It's possible with pkgsend to add SVR4 package bundles into the repository, as well as tar-balls and directories full of files.  Since our package is simple, we will just insert each package.When we open the package in the repository, we have to specify the version number.  Take a good look at the pkg(5) man page to understand the version numbering and the various actions that could be part of the package.  Since I have been working on this script, I have decided that it is version 0.3.  We start by opening the package for insertion.  Then, we add each file to the package in the repository, specifying file ownership, permissions, and path.  
Once all the pieces have been added, we close the package and the FMRI for the package is returned.

# eval `pkgsend -s http://localhost:9000/ open finish@0.3`
# pkgsend -s http://localhost:9000/ add file finish-script-manifest.xml mode=0555 owner=root group=bin path=/var/svc/manifest/system/finish-script-manifest.xml restart_fmri=svc:/system/manifest-import:default
# pkgsend -s http://localhost:9000/ add file finish-script.sh mode=0555 owner=root group=bin path=/usr/bin/finish-script.sh
# pkgsend -s http://localhost:9000/ close
PUBLISHED
pkg://local-pkgs/finish@0.3,5.11:20101220T174741Z

Updating the AI Manifest

Now that the repository is created, we can add the new repository as a publisher to verify that it has the contents we expect.  On the package server, itself, we can add this publisher.  Remember that we set the prefix for the publisher to local-pkgs and that we specified it should run on port 9000.  This name could be anything that makes sense for your enterprise - perhaps the domain for the company or something that will identify it as local rather than part of Solaris is a good choice.

# pkg set-publisher -p http://localhost:9000 local-pkgs
pkg set-publisher:  Added publisher(s): local-pkgs
# pkg list -n "pkg://local-pkgs/\*"
NAME (PUBLISHER)                                    VERSION         STATE      UFOXI
finish (local-pkgs)                                 0.3             known      -----

Now that we know the repository is on-line and contains the packages that we want to deploy, we have to update the AI manifest to reflect both the new publisher and the new packages to install.  First, to update the publisher, beneath the section that specifies the default publisher, add a second source for this new publisher:

      <source>
        <publisher name="solaris">
          <origin name="http://pkg.oracle.com/solaris/release/"/>
        </publisher>
      </source>
      <source>
        <publisher name="local-pkgs">
          <origin name="http://my-local-repository-server:9000/"/>
        </publisher>
      </source>

Then, add the packages to the list of packages to install.  Depending on how you named your package, you may not need to specify the repository for installation.  IPS will search the path of repositories to find each package.  However, it's not a bad idea to be specific.

      <software_data action="install" type="IPS">
        <name>pkg:/entire</name>
        <name>pkg:/server_install</name>
        <name>pkg://local-pkgs/finish</name>

Now, when you install the system using this manifest, in addition to the regular Solaris bits being installed, and the system configuration services executing, the finish script included in your package will be run on the first boot.  Since the script called by the service turns itself off, it will not continue to run on subsequent boots.  You can do whatever sort of additional configuration that you might need to do.  But, before you spend a long time converting all of your old Jumpstart scripts into a first-boot service, take a look at the built-in capabilities of AI and of Solaris 11 Express in general.  It may be that much of the work you had to code yourself before is no longer required.  For example, IP interface configuration is simple and persistent with ipadm.  Other new functions in Solaris 11 Express remove the need to write custom code, too.  But for the cases where you need custom code, this sort of first-boot service gives you a hook so that you can do what you need.
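After a client has been installed from this manifest, it is easy to check that the first-boot service actually did its job.  A minimal check, using the service and log names from the example above, might look like this on the newly installed system:

# svcs system/finish-script          # should be present, and disabled after its first run
# ls /var/tmp/finish_log.*           # the marker file written by finish-script.sh
# cat /var/tmp/finish_log.*
Completed Finish Script

The exact output will vary, of course; the point is simply that the service was imported, ran once at the first multi-user boot, and then switched itself off.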


Solaris

Booting and Installing with Bootable AI

My last couple of blogs have been about creating a manifest to be used when installing a system with the Solaris 11 Express Automated Installer.  Now that we have this basic manifest constructed, let's install a system.To review, the Automated Installer is the facility in Solaris 11 Express that supports network-based installation.  The manifest used with AI determines what is installed and how the system is customized during installation.  Typically, one would set up a network install service and fetch the manifest from there.  However, sometimes this is not desired or practical.  An option in these cases is to use bootable AI.  In this case, you boot the system to be installed from the AI ISO image.  During the boot process, you are prompted for a URL that points to a valid AI manifest.  This manifest is just fetched using HTTP (wget is actually used).  So, so long as you can get to the manifest, you are good to go.  Once fetched, the manifest is validated and acted upon to complete the installation.In this installment, we will go through the boot process.  In particular, I will show what this looks like on an x86 host.  For details on SPARC, see my previous blog.So, to start, boot your system from the Automated Install (AI) ISO.  When presented with the Grub menu, the default selection is to use a custom manifest.  This selection is the one we want and will prompt us for the URL of the manifest.  The other options allow you to use the default manifest built into the ISO and to perform the installation across a serial connection.After the system has booted the small Solaris image on the AI ISO, you will be prompted for the URL of the manifest.AI fetches the manifest and begins the installation process.  The installation goes on in the background rather than on the console.  Like the LiveCD installation, Solaris is running at this point.  If you want to monitor the progress of the installation, log in as the default admin user included in the ISO image.  The default login name is jack, with the password jack.  Once you have logged in, you can monitor the progress of the installation by tailing the installation log, found in /tmp/install_log.Since Solaris is up and running, it is possible to enable network logins to the system while it is still installing.  To do this, su to root and enable the ssh service  (svcadm enable ssh).  Once ssh is enabled, you can ssh into the system as jack.  I have sometimes found this to be a useful tool when installing a virtual machine over a slow and unreliable network, where VNC is unable to sustain its required bandwidth. Once the installation completes, you can reboot the system.  Of course, the regular first-boot SMF import will happen.  And the services that we configured in the last section will be activated to configure the networks and system identity. Once all of this complete, the system is ready for use.  The piece that you might have noticed is missing is any sort of finish-script customization.  Stay tuned for future installments to cover this.Using the bootable AI, it is simple to provide a manifest via a simple URL and perform a near-hands-free, customized installation of the system.  It is important to note that DHCP is still used to fetch an address for the system, along with routing and DNS information.  For a truly hands-free installation, a network-based install server and install service would have to be created.
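For the record, the monitoring trick described above amounts to only a handful of commands once the AI image is up and running on the client.  The IP address is whatever DHCP handed out; everything else here comes straight from the steps in this post.

jack@client$ su -                          # su to root on the console
root@client# svcadm enable ssh             # allow remote logins while the install runs
you@desktop$ ssh jack@<client-ip>          # default login jack, password jack
jack@client$ tail -f /tmp/install_log      # watch the installation progress

None of this is required, but on a slow or flaky network it is much more pleasant than babysitting a VNC console.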


Solaris

Using System Configuration Manifests with Bootable AI

In my last blog, I talked about how to configure a manifest for a bootable AI installation.  The main thing there was how to select which packages to install.  This time we are going to talk about how to handle AI's version of sysidcfg and configuring system identity at install-time.In a Jumpstart world, many of the things that make up a system's identity - hostname, network configuration, timezone, naming services, etc. - can be configured at installation time by providing a sysidcfg file.  Alternately, an interactive dialog starts and prompts the installer for this sort of information.The System Configuration manifest provides this same sort of information in an Automated Installer world.  The documentation for AI shows how to create either a separate or an embedded SC manifest to be served by an AI server.  When using Bootable AI, the SC manifest needs to be embedded within the AI manifest.  The SC manifest, whether embedded or not, is basically an XML document that is providing a bunch of properties for SMF services that are going to run on the first system boot to help complete the system configuration.  Some of the main tasks that can be completed in the SC manifest are: Identify and configure the administrative "first user" created at install time. Specify a root password and whether root is a role or a standard user Configure timezone, hostname, keyboard maps, terminal type Specify whether the network should be configured automatically or manually. Configure network settings, including DNS, for manually configured networks But, in the end, all of this is just setting SMF properties, so it's pretty straightforward.  It appears as a large service_bundle with properties for multiple SMF services.As far as including the SC manifest information in the bootable AI manifest, the SC manifest is essentially embedded into the AI manifest as a large comment.  Don't be put off by the comment notation.  This whole section is passed on to SMF to assign the necessary service properties.In order to explain the various sections of properties, I will just annotate an updated SC manifest.  In this manifest, I will specify some of the more common configuration settings you might use.  The whole SC embedded manifest is identified within the AI manifest with the tag sc_embedded_manifest.  See the Automated Installer Guide for more details on the rest of the options to this tag.  The two lines following the sc_embedded_manifest tag are just the top part of the stand-alone SC manifest XML document.  Look in the default AI manifest for exact placement of this section.     <sc_embedded_manifest name="AI">      <!-- <?xml version='1.0'?>      <!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">     The rest of the SC manifest sets up service properties for a service bundled named "system configuration."       <service_bundle type="profile" name="system configuration"> The service system/install/config is responsible for doing some of the basic configuration actions at install-time, such as setting up the first, or admin, user, setting a root password, and giving the system a name.  The property group "user_account" specifies how the first user, used for administration, should be configured.  You can specify here the username (name="login"), an encrypted password, the GECOS field information (name="description"), as well as the UID and GID for the account.  
Note that the default password supplied for the first user (by default, named jack) in the default manifest is "jack".Special note should be made of the property "roles".  Recall that in Solaris 11 Express, root is no longer a regular login user, but becomes a role.  Therefore, in order to be able to assume the root role for administrative functions, this first user needs to be given the root role.  Other roles can also be specified here as needed.  Also notice that the profile "Primary Administrator" is no longer assigned to this first user, as was done in OpenSolaris.  Additional properties around roles, profiles, authorizations, etc. may be assigned.  See the Automated Installer Guide for details.        <service name="system/install/config" version="1" type="service">          <instance name="default" enabled="true">            <property_group name="user_account" type="application">              <propval name="login" type="astring" value="sewr"/>              <propval name="password" type="astring" value="9Nd/cwBcNWFZg"/>              <propval name="description" type="astring" value="default_user"/>              <propval name="shell" type="astring" value="/usr/bin/bash"/>              <propval name="uid" type='count' value='27589'/>              <propval name="gid" type='count' value='10'/>              <propval name="type" type="astring" value="normal"/>              <propval name="roles" type="astring" value="root"/>            </property_group>As with Jumpstart, it is possible to specify a root password at install time.  The encrypted string for the root password is given here as the password property.  If no new password is supplied, the default root password at install-time is "solaris".  Also note here that root is created as a role rather than a regular login user.            <property_group name="root_account" type="application">                <propval name="password" type="astring" value="$5$dnRfcZs$Hx4aBQ161Uvn9ZxJFKMdRiy8tCf4gMT2s2rtkFba2y4"/>                <propval name="type" type="astring" value="role"/>            </property_group>A few other housekeeping properties can also be set here for the system/install/config service.  These include the local timezone and the hostname (/etc/nodename) for the system.            <property_group name="other_sc_params" type="application">              <propval name="timezone" type="astring" value="US/Eastern"/>              <propval name="hostname" type="astring" value="myfavoritehostname"/>            </property_group>          </instance>        </service> The system/console-login service establishes the login service for the console. Here you can specify the terminal type to be used for the console.        <service name="system/console-login" version="1" type="service">          <property_group name="ttymon" type="application">            <propval name="terminal_type" type="astring" value="xterms"/>          </property_group>        </service>The service system/keymap establishes what sort of keyboard input is to be expected on the system.        <service name='system/keymap' version='1' type='service'>          <instance name='default' enabled='true'>            <property_group name='keymap' type='system'>              <propval name='layout' type='astring' value='US-English'/>            </property_group>          </instance>        </service>By default, Solaris 11 Express enabled NWAM (NetWork Automagic) to automatically configure the primary network interface.  
NWAM, by default, activates a primary network for the system, whether wired or wireless, monitors its availability and tries to restore network connectivity if it should go away.  Most people would say that its behavior is best suited for mobile or desktop systems, and it functions well in that space.  It includes the ability to have profiles that guide its behavior in a variety of networked environments.  NWAM relies on DHCP to get an available IP address and other data needed to configure the network.In the default AI profile, the network/physical:nwam service instance is enabled and the network/physical:default service instance is disabled.  In most server configurations, static addressing and configuration might be more desirable.  In that case, you can do as we have below and switch with service instance is enabled and which is disabled by default.        <service name="network/physical" version="1" type="service">          <instance name="nwam" enabled="false"/>          <instance name="default" enabled="true"/>        </service> In the case where we are doing static network configuration, we will rely on the network/install service to set up our networks.  The properties and values used here correspond to arguments to the ipadm command, new in Solaris 11 Express.  ipadm is used to configure and tune IP interfaces.  See its man page for details on syntax.In this case, we are setting up a single IPv4 network interface (xnf0), giving it a static IP address and netmask, and specifying a default route.         <service name="network/install" version="1" type="service">          <instance name="default" enabled="true">            <property_group name="install_ipv4_interface" type="application">              <propval name="name" type="astring" value="xnf0/v4"/>              <propval name="address_type" type="astring" value="static"/>              <propval name="static_address" type="net_address_v4" value="192.168.100.101/24"/>              <propval name="default_route" type="net_address_v4" value="192.168.100.1"/>            </property_group>          </instance>        </service>As with Jumpstart using sysidcfg, it is possible to set up DNS information at install-time.  Note that only DNS and not NIS or LDAP naming services can be set up this way.  The System Administration Guide: Naming and Directory Services manual discusses how to configure these naming services.  NIS+ is no longer supported in Solaris 11 Express.The network/dns/install service is used to set up DNS at install-time.  For this, we specify the regular sorts of data that will populate the /etc/resolv.conf file: nameservers, domain, and a domain name search path.  Some of these data items take multiple values, so lists of values are used, as shown below.        
<service name="network/dns/install" version="1" type="service">          <instance name="default" enabled="true">            <property_group name="install_props" type="application">              <property name="nameserver" type="net_address">                <net_address_list>                  <value_node value="1xx.xxx.xxx.zz"/>                  <value_node value="1xx.xxx.xxx.yy"/>                  <value_node value="1xx.xxx.xxx.xx"/>                </net_address_list>              </property>              <propval name="domain" type="astring" value="us.warble.com"/>              <property name="search" type="astring">                <astring_list>                  <value_node value="us.warble.com"/>                  <value_node value="garble.com"/>                  <value_node value="mfg.garble.com"/>                </astring_list>              </property>            </property_group>          </instance>        </service>And we close out the service bundle and the embedded SC manifest.      </service_bundle>      -->    </sc_embedded_manifest> So, by building a custom AI manifest with its embedded SC manifest, you can accomplish the same sorts of install-time configuration of a system as you could with Jumpstart and sysidcfg, without having to build any sort of complex finish scripts or any kind of extra coding.  This approach makes is possible to have a repeatable methodology for creating the administrative user, with known, standard credentials, and for configuring the base system networks and naming services.


Solaris

Configuring Bootable AI in Solaris 11 Express

This is the first of several blogs around bootable AI, the ability with the Solaris 11 Express automated installer to boot directly from the AI ISO, fetch an installation manifest, and act on it, without having to set up an AI install server. Most of this focuses on the manifest, so it applies to AI booted from the network and from the ISO. However, I do not plan to go into creating and configuring the AI network services, at least not right now.  I think that other folks have talked about this already. Solaris 11 Express includes the Automated Installer as the tool for performing automated network system installations. Using AI, it is possible to install a customized load of Solaris 11 Express, across the network, without manual intervention. AI allows you to specify which repositories to use, which packages to install from the repositories, and how to handle the initial system configuration (like sysidcfg) for the system. Generally speaking, AI relies on a network automated install service to answer requests from a client trying to install it self. This is sort of like the jumpstart approach. The client starts to boot, looks around to see who on the network can help out, fetches what it needs from that network server, and goes ahead with the installation.But, getting the base install information from the network isn't always feasible or even the most expeditious path. So, AI has a feature called "Bootable AI". I've blogged about this before, when it first came out in OpenSolaris.The idea of bootable AI is that rather than rely on a network service to fetch the needed information to boot, you boot from the AI media instead. During the boot process, you can be prompted for a URL where an installation manifest can be found. The client fetches this manifest and carries on, just as it would from the AI service. The upside to this is that no AI service has to be installed on the network. The downside is that it does require at least one interaction if you want to specify a non-default manifest.The manifest is the XML specification of how and what to install on the system. There is quite a lot that can be done in terms of selecting which disks to use, how to partition them, etc. Check the Oracle Solaris 11 Express Automated Installer Guide for detailed information on this.In this note, I am going to show you how to put together some of the key parts of the manifest. My main goal is to then use this with bootable AI, but the same manifest can be installed with an install service in an AI server. Locating a default manifestHow to get started? The simplest way to make a manifest is to start with the default manifest, delivered in the AI iso, and modify it to suit your needs. A default manifest is located in auto_install/default.xml in the AI ISO. Copy this and modify it as needed.Selecting "Server Install" packages When Solaris 11 Express is installed via the LiveCD or via the default manifest in AI, a full, desktop version of Solaris is installed. Often, however, when you use AI, you would prefer to have the smaller server installation provided by the text installer. Since the manifest specifies which top-level packages to install, this is easily accomplished. In the default manifest, look for the software_data section with the install action. This section specifies what packages are to be installed. The two packages listed here are group packages, sort of like package clusters in Solaris 10. 
entire and babel_install are the packages that, when installed, provide the environment installed from the LiveCD. In order to get a reduced installation like that from the text installer, replace babel_install with server_install. If there are other packages that you want to add to the installation (for example the iSCSI packages referenced in the comments), you can add them here. Change this section: <software_data action="install" type="IPS">        <name>pkg:/entire</name>        <name>pkg:/babel_install</name> to this section: <software_data action="install" type="IPS">        <name>pkg:/entire</name>        <name>pkg:/server_install</name> Uninstalling the appropriate packages The server_install package bundle has dependencies of the packages that make up the reduced server installation. By installing it, we get all of the other packages that come with it. That's part of the coolness of IPS. However, we also want to preserve the ability to uninstall or modify individual components of that overall bundle. So, we finish out our installation by uninstalling the server_install wrapper. This does not affect the dependent packages; it just unwraps them so we can modify them directly. So, to do this, update the uninstall section as below. Additionally, even with the reduced server installation, there may still be packages that we want to remove. For example, there are still over 700MB of system locales installed that you may not need and might choose to remove. You can add any other packages that you want to remove in this section as well. Note that this really does first install the package and then remove it. Seems sort of redundant, but I have not yet found a way to cause IPS to build a plan that would note the uninstalled packages and just mark them to be skipped during installation. Change this: </software_data>      <!--        babel_install and slim_install are group packages used to        define the default installation.  They are removed here so        that they do not inhibit removal of other packages on the        installed system.        -->      <software_data action="uninstall" type="IPS">        <name>pkg:/babel_install</name>        <name>pkg:/slim_install</name></software_data> to this: <software_data action="uninstall" type="IPS"> <name>pkg:/server_install</name> <name>pkg:/system/locale/af</name> <name>pkg:/system/locale/ar_eg</name> <name>pkg:/system/locale/as</name> <!-- ... --></software_data> Any other packages that you want to uninstall can be listed here, too. So, you see how easy it is to build a manifest for AI that specifies which packages you want to include or exclude and how to create a smaller, server installation for Solaris 11 Express.
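One small habit that saves time with hand-edited manifests: check that the XML is still well-formed before you point an installation at it.  A quick sketch, assuming the edited copy is saved as my-ai.xml and that you serve it from an Apache document root the way the bootable-AI examples elsewhere on this blog do (xmllint comes with the libxml2 tools, if you have them installed):

$ xmllint --noout my-ai.xml                    # silent if the XML is well-formed
$ cp my-ai.xml /etc/apache2/2.2/htdocs/        # make it reachable over HTTP for the installer
$ svcadm enable apache22                       # if the web server is not already running

The AI client will still validate the manifest when it fetches it, but catching a stray angle bracket locally is a lot faster than another boot cycle.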


Solaris

Installing OpenSolaris in Oracle VM for x86

Yeh, Virginia, you can install OpenSolaris as a guest in an Oracle VM for x86 environment!  Here is a little step-by-step guide on how to do it.  I am going to assume that you already have downloaded and installed Oracle VM Server.  Further, I am going to assume that you have already created an Oracle VM Manager system to manage your OVM farm.  All of that is more than I can tackle today.  But, to get you started, you can fetch the OVM and VM Manager bits from http://edelivery.oracle.com/linux .  I found the simplest thing to do was to create an Oracle Enterprise Linux management system and install the VM Manager there.

Once you have the OVM environment established, you need to get some installation media to install the guests.  For OpenSolaris, here's the magic: Bootable AI.  Check out Alok's blog for more details on exactly what the Bootable AI project is about.  But in a nutshell, it makes it so that you can install OpenSolaris as if you were using a network install server, but while you are booted from installation media.  This gets around the difficulty of trying to do an installation using a LiveCD in a tiny VNC window and the difficulty of trying to get a network, PXE-based installation working.  This is a quick and easy way to go.

Fetch the OpenSolaris AI iso rather than the regular LiveCD iso.  Install this into the VM Manager resource repository. (Remember, I assume you know how to do this, or can figure it out pretty easily.  I did.)

Now, create a VM just as you always would for Oracle Enterprise Linux, Solaris, Windows, or whatever.  Select Install from installation media, and use the iso that you just added to the repository.  When you specify what operating system this VM will run, select "Other" since it isn't one of the pre-defined choices.  Start the creation and away you go.

As you have already figured out from Alok's blog, this is only half of the story.  You still must create an AI manifest.  The manifest details which packages to install and from where, along with the details for the first user created, root password, etc.  Check out the Automated Installation Project page for details on this.  The docs are pretty good and the minimum manifest needed for bootable AI is pretty basic.  Alok talks about how to specify booting from the development repository.  That was the only change to the default manifest that I made.

Put this manifest somewhere accessible via http from the VM you want to create.  The VM you created is sitting, waiting for you to tell it where to fetch its manifest so it can boot.  You really don't want to keep it waiting much longer.

Connect to the VM using VNC. You can use the built-in VNC client in VM Manager or whatever VNC client you like best. I tend to use vncviewer because it seems to manage the screen resolutions better than the Java client. When the installer prompts you for the manifest, enter the URL for the manifest you just made. The installer will fetch it, validate it, and then go on with its usual installation using that manifest. This is so simple and so cool!

Installation proceeds like it would with an install server.  You can log in on the console of the system being installed and monitor its progress.  Then, when it's done, reboot and you are done.

One note:  I have run into difficulty with OpenSolaris b133 and this approach.  When I used the b133 iso, even though I never got an error, the resulting VM was not bootable.  (No, I haven't got around to filing a bug on this.  Was going to wait until b134.)
However, if I use the b131 iso and a manifest that referenced installing entire@0.5.11-0.133 things worked out just fine.  So, give that a shot. Once you have created a VM that you like within Oracle VM, you can do all of the cool Oracle VM things - convert it to a template and stamp out lots of copies, move it from server to server, etc.  But that's for another day.  Or that's something to look for in Honglin Su's blog.


Everything Else

Random movie thoughts on a momentous day

So, the EU has approved the acquisition of Sun by Oracle.  This coming Monday makes 15 years that I have been at Sun.  And this morning two movie scenes keep rolling around in my head.

First is from 1776, the great old musical about how the US came to be.  Right after the Continental Congress unanimously votes, after a long struggle, for independence from England, they all sit stunned that it has passed and John Adams, played by William Daniels, just says, "It's done.  It's done."  They really didn't know what was in store, but they were confident of a bright future, albeit one that would require a huge effort on everyone's part.  (Just before this, he sang one of my favorite songs about commitment and pushing forward.)

The second is from Camelot.  I suspect a lot of us long-time Sun folks are feeling just a tad nostalgic right now for the place where "the rain would never fall till after sundown."  For many, Sun has been "for one brief shining moment" a really special place to be.  I've always thought of the people at Sun first when I think about this company.  Some of the finest, smartest, most willing to pitch in and help each other no matter what folks you will ever find.

Scott McNealy said it best in his keynote at Oracle OpenWorld this past October.  He said he wanted people to remember that Sun Kicked Butt, Had Fun, Didn't Cheat, Loved Our Customers, and Changed Computing Forever.  That about sums it up.

Now, the next chapter in Sun is about to start.  I think it's going to be bright with lots of opportunity and a great time all around.  But, no matter how wonderful it is, it will be different and we'll look back and miss many of the good times at Sun.

I'm excited about the next chapter.


Solaris

Bootable AI ISO is way cool

Alok Aggarwal posted, just before Christmas, a blog mentioning that the ISO images for the Auto Installer in OpenSolaris are now bootable.  Not just for x86 but also for SPARC. This is huge!  While it does not provide a LiveCD desktop environment for SPARC, it does give us a way to easily install OpenSolaris on SPARC gear.  Previously, it was necessary to set up an AI install server (running on an x86 platform since that was the only thing you could install natively) and use WAN Boot to install OpenSolaris on the SPARC boxes.  Well, that was a tough hurdle for some of us to get over.

Now, you can burn the AI ISO to a CD and boot it directly.  The default manifest on the disk will install a default system from the pkg.opensolaris.org release repository.  Or, better yet, build a simple AI manifest that changes the release repository to the dev repo and put it somewhere you can fetch via http.  When you boot up, you will be prompted for the URL of the manifest.  AI will fetch it and use it to install the system.

{2} ok boot cdrom - install prompt
Resetting ...

Sun Fire 480R, No Keyboard
Copyright 1998-2003 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.10.8, 16384 MB memory installed, Serial #57154911.
Ethernet address 0:3:ba:68:1d:5f, Host ID: 83681d5f.

Rebooting with command: boot cdrom - install prompt
Boot device: /pci@8,700000/ide@6/cdrom@0,0:f File and args: - install prompt
SunOS Release 5.11 Version snv_130 64-bit
Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hostname: opensolaris
Remounting root read/write
Probing for device nodes ...
Preparing automated install image for use
Done mounting automated install image
Configuring devices.

Enter the URL for the AI manifest [HTTP, default]: http://<my web server>/bootable.xml

See!  This is really easy and gives new life to really old gear.  In this case, the manifest is super simple, too.  I just grabbed the default manifest from an AI image and changed the repository and package to install.

$ pfexec lofiadm -a `pwd`/osol-dev-130-ai-x86.iso
/dev/lofi/1
$ pfexec mount -o ro -F hsfs /dev/lofi/1 /mnt
$ cp /mnt/auto_install/default.xml /etc/apache2/2.2/htdocs/bootable.xml

Edit this file and change

<main url="http://pkg.opensolaris.org/release" publisher="opensolaris.org"/>

to

<main url="http://pkg.opensolaris.org/dev" publisher="opensolaris.org"/>

Or as a speedup, add the mirror to pkg.opensolaris.org:

<main url="http://pkg.opensolaris.org/dev" publisher="opensolaris.org"/>
<mirror url="http://pkg-na-2.opensolaris.org/dev"/>

And change

<pkg name="entire"/>

to

<pkg name="entire@0.5.11-0.130"/>

You can add a mirror site for the repo in this manifest.  Or you can list other packages that you want to be installed as the system is installed.  The docs for the AutoInstaller talk about how to create and modify a manifest.

Some caveats that I found:  First, NWAM and DHCP might take longer than you think.  If you quickly try to type in the URL for the manifest, you may find that you have no network yet and become concerned.  I spent the better part of a day on this.  Then, I let it sit for a couple of minutes before trying the manifest URL and life was good.  My DHCP server is particularly slow on my network.

Second, not using the mirror, on a slow system took a really long time to install.  Have not diagnosed it to network download time or processing time.  I think some of both since things like the installation phase of babel_install took nearly an hour on one system.
Third, there must be a lower bound on what sort of system will work.  T2000 works just fine.  SF480R has worked fine.  My SF280R is busted - as soon as it's fixed, I'll try it.  Not so great on E220 and E420 systems.  They appear to work, but at the very end it says it failed.  The only failure message I can see this time is due to the installer finding a former non-global zone environment on the disk. But so far, my experience on UltraSPARC-II systems is that once the installation completes, it hangs on the first reboot or fails to boot at all.  I am not surprised that systems that are no longer supported are not supported by AI.  I think I saw in Alok's notes that OBP 4.17 was the minimum supported.  That means my USII boxes are right out, and  I think even the SF280.  I hate doing firmware updates, so I have not updated the SF480. Fourth, when I tried to install on a system that previously had the root disk mirrored with SVM, zpool create for the root pool failed.  I had to delete the metadbs and the metadevices before I could proceed. But, I am very impressed!  Bootable AI media is way cool.  Keep your eyes and ears open, though, for more developments in the AutoInstaller in the coming months.
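One small postscript: since a slow DHCP server and an unreachable manifest URL cost me time, it is worth confirming that the web server is actually serving bootable.xml before you boot the client.  A quick check might look like the following; treat this as a sketch of my own rather than part of the procedure above, and it assumes the stock Apache 2.2 instance that owns the htdocs directory used earlier.

$ pfexec svcadm enable apache22                      # the bundled Apache 2.2 SMF instance
$ wget -q -O - http://<my web server>/bootable.xml | head -3
                                                     # should print the top of the manifest;
                                                     # if it prints nothing, fix the web server
                                                     # before booting the client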


Solaris

Silly ZFS Dedup Experiment

Just for grins, I thought it would be fun to do some "extreme" deduping.  I started out creating a pool from a pair of mirrored drives on a system running OpenSolaris build 129.  We'll call the pool p1.  Notice that everyone agrees on the size when we first create it.  zpool list, zfs list, and df -h all show 134G available, more or less.  Notice that when we created the pool, we turned deduplication on from the very start.

# zpool create -O dedup=on p1 mirror c0t2d0 c0t3d0
# zfs list p1
NAME   USED  AVAIL  REFER  MOUNTPOINT
p1      72K   134G    21K  /p1
# zpool list p1
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
p1     136G   126K   136G     0%  1.00x  ONLINE  -
# df -h /p1
Filesystem             size   used  avail capacity  Mounted on
p1                     134G    21K   134G     1%    /p1

So, what if we start copying a file over and over?  Well, we would expect that to dedup pretty well.  Let's get some data to play with.  We will create a set of 8 files, each one being made up of 128K of random data.  Then we will cat these together over and over and over and over and see what we get.

Why choose 128K for my file size?  Remember that we are trying to deduplicate as much as possible within this dataset.  As it turns out, the default recordsize for ZFS is 128K.  ZFS deduplication works at the ZFS block level.  By selecting a file size of 128K, each of the files I create fits exactly into a single ZFS block.  What if we picked a file size that was different from the ZFS block size?  The blocks across the boundaries, where each file was cat-ed to another, would create some blocks that were not exactly the same as the other boundary blocks and would not deduplicate as well.

Here's an example.  Assume we have a file A whose contents are "aaaaaaaa", a file B containing "bbbbbbbb", and a file C containing "cccccccc".  If our blocksize is 6, while our files all have length 8, then each file spans more than 1 block.

# cat A B C > f1
# cat f1
aaaaaaaabbbbbbbbcccccccc
111111222222333333444444
# cat B A C > f2
# cat f2
bbbbbbbbaaaaaaaacccccccc
111111222222333333444444

(The row of digits under each file marks which of the four 6-character blocks each character lands in.)

The combined contents of the three files span across 4 blocks.  Notice that the only block in this example that is replicated is block 4 of f1 and block 4 of f2.  The other blocks all end up being different, even though the files were the same.  Think about how this would work as the number of files grew.

So, if we want to make an example where things are guaranteed to dedup as well as possible, our files need to always line up on block boundaries (remember, we're not trying to be the real world - we're trying to get silly dedupratios).  So, let's create a set of files that all match the ZFS blocksize.
We'll just create files b1-b8 full of blocks of /dev/random data.

# zfs get recordsize p1
NAME  PROPERTY    VALUE    SOURCE
p1    recordsize  128K     default
# dd if=/dev/random bs=1024 count=128 of=/p1/b1
# ls -ls b1 b2 b3 b4 b5 b6 b7 b8
 257 -rw-r--r--   1 root     root      131072 Dec 14 15:28 b1
 257 -rw-r--r--   1 root     root      131072 Dec 14 15:28 b2
 257 -rw-r--r--   1 root     root      131072 Dec 14 15:28 b3
 257 -rw-r--r--   1 root     root      131072 Dec 14 15:28 b4
 257 -rw-r--r--   1 root     root      131072 Dec 14 15:28 b5
 257 -rw-r--r--   1 root     root      131072 Dec 14 15:28 b6
 257 -rw-r--r--   1 root     root      131072 Dec 14 15:28 b7
 205 -rw-r--r--   1 root     root      131072 Dec 14 15:28 b8

Now, let's make some big files out of these.

# cat b1 b2 b3 b4 b5 b6 b7 b8 > f1
# cat f1 f1 f1 f1 f1 f1 f1 f1 > f2
# cat f2 f2 f2 f2 f2 f2 f2 f2 > f3
# cat f3 f3 f3 f3 f3 f3 f3 f3 > f4
# cat f4 f4 f4 f4 f4 f4 f4 f4 > f5
# cat f5 f5 f5 f5 f5 f5 f5 f5 > f6
# cat f6 f6 f6 f6 f6 f6 f6 f6 > f7
# ls -lh
total 614027307
-rw-r--r--   1 root     root        128K Dec 14 15:28 b1
-rw-r--r--   1 root     root        128K Dec 14 15:28 b2
-rw-r--r--   1 root     root        128K Dec 14 15:28 b3
-rw-r--r--   1 root     root        128K Dec 14 15:28 b4
-rw-r--r--   1 root     root        128K Dec 14 15:28 b5
-rw-r--r--   1 root     root        128K Dec 14 15:28 b6
-rw-r--r--   1 root     root        128K Dec 14 15:28 b7
-rw-r--r--   1 root     root        128K Dec 14 15:28 b8
-rw-r--r--   1 root     root        1.0M Dec 14 15:28 f1
-rw-r--r--   1 root     root        8.0M Dec 14 15:28 f2
-rw-r--r--   1 root     root         64M Dec 14 15:28 f3
-rw-r--r--   1 root     root        512M Dec 14 15:28 f4
-rw-r--r--   1 root     root        4.0G Dec 14 15:28 f5
-rw-r--r--   1 root     root         32G Dec 14 15:30 f6
-rw-r--r--   1 root     root        256G Dec 14 15:49 f7

This looks pretty weird.  Remember our pool is only 134GB big.  Already the file f7 is 256G and we are not using any sort of compression.  What does df tell us?

# df -h /p1
Filesystem             size   used  avail capacity  Mounted on
p1                     422G   293G   129G    70%    /p1

Somehow, df now believes that the pool is 422GB instead of 134GB.  Why is that?  Well, rather than reporting the amount of available space by subtracting used from size, df now calculates its size dynamically as the sum of the space used plus the space available.  We have lots of space available, since we have many many many duplicate references to the same blocks.

# zfs list p1
NAME   USED  AVAIL  REFER  MOUNTPOINT
p1     293G   129G   293G  /p1
# zpool list p1
NAME   SIZE  ALLOC   FREE    CAP  DEDUP       HEALTH  ALTROOT
p1     136G   225M   136G     0%  299594.00x  ONLINE  -

zpool list tells us the actual size of the pool, along with the amount of space that it views as being allocated and the amount free.  So, the pool really has not changed size.  But the pool says that 225M are in use.  Metadata and pointer blocks, I presume.

Notice that the dedupratio is 299594!  That means that on average, there are almost 300,000 references to each actual block on the disk.

One last bit of interesting output comes from zdb.  Try zdb -DD on the pool.  This will give you a histogram of how many blocks are referenced how many times.  Not for the faint of heart, zdb will give you lots of ugly internal info on the pool and datasets.
# zdb -DD p1
DDT-sha256-zap-duplicate: 8 entries, size 768 on disk, 1024 in core

DDT histogram (aggregated over all DDTs):

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
  256K        8      1M      1M      1M    2.29M    293G    293G    293G
 Total        8      1M      1M      1M    2.29M    293G    293G    293G

dedup = 299594.00, compress = 1.00, copies = 1.00, dedup * compress / copies = 299594.00

So, what's my point?  I guess the point is that dedup really does work.  For data that has a commonality, it can save space.  For data that has a lot of commonality, it can save a lot of space.  With that come some surprises in terms of how some commands have had to adjust to changing sizes (or perceived sizes) of the storage they are reporting.

My suggestion?  Take a look at ZFS dedup.  Think about where it might be helpful.  And then give it a try!
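As a quick sanity check on that ratio (my own back-of-the-envelope arithmetic, not part of the original output), the histogram says 2.29M blocks are referenced against only 8 allocated:

# referenced blocks / allocated blocks, straight from the DDT histogram above
$ echo 'scale=2; (2.29 * 1024 * 1024) / 8' | bc
300154.88
# within rounding of the 299594.00x that zpool list reported, since "2.29M" is
# itself a rounded figure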


Solaris

Quick Review of Pro OpenSolaris

Pro OpenSolaris - Harry Foxwell and Christine Tran

Several (too many) weeks ago, I said that I was going to read and review Harry & Christine's new book, Pro OpenSolaris.  Finally, I am getting around to doing this.

Overall, I was pleased with Pro OpenSolaris.  It does a good job at what it tries to do.  The key is to recognize when it is the right text and when others might be the right text.  Right in the Introduction, the authors are clear that this is an orientation tour.  They say "We assume that you are a professional system administrator ... and that your learning style needs only an orientation and an indication of what should be learned first in order to take advantage of OpenSolaris."  That's a good summary of the main direction of the book.  And at this, it does a very nice job!

This means that Pro OpenSolaris is not an exhaustive reference manual on all of the features and nuances of OpenSolaris.  Instead, it's a broad overview of what OpenSolaris is, how it got to be what it is, what its key features and differentiators are, and why I might choose to use OpenSolaris instead of some other system.  That's important to realize from the outset.  If you are looking for the thousand-page reference guide, this is not the one.  If you have heard about OpenSolaris and want to explore a bit more deeply, to decide whether or not OpenSolaris is something that might help your business or might be a tool you can use, this is a great place to start.

Pro OpenSolaris spends a good bit of time on the preliminaries.  There is an extensive section on the philosophical differences between the approaches and requirements of different open source licenses and styles of licenses.  Pro OpenSolaris explains clearly why OpenSolaris uses the CDDL license as opposed to other licenses and how this fits in with the overall goal of the OpenSolaris project.  Pro OpenSolaris also helps you get started, with a lengthy discussion of how to go about installing OpenSolaris either on bare metal or in a virtual machine.

Compare this to the OpenSolaris Bible (Solter, Jelinek, & Miner), which really does aspire to be the thousand-page reference guide.  In the OpenSolaris Bible, licensing and installation are given only a short discussion, since they are not central to the book's focus.  Instead, the reader is directed to other places for that discussion.  But that's why it's important to have both books.  Pro OpenSolaris gives the tour of the important parts of the OpenSolaris operating system, how and why I might use them, and why they are important, but it does not go deeply into the details.  That's probably wise for an operating system that is still growing and changing substantially with each new release.

One thing that particularly interested me in Pro OpenSolaris was its large section on the OpenSolaris Webstack, which includes IPS-packaged versions of the commonly used pieces of an AMP stack - notably Apache, MySQL, PHP, lighttpd, nginx, Ruby, Rails, etc. - all compiled and optimized for OpenSolaris and including key add-ons such as DTrace providers where applicable.  Pro OpenSolaris also has a nice, long chapter on NetBeans and its role as a part of an overall OpenSolaris development environment.

What's my take overall?  Pro OpenSolaris is a quick read that will give you a good understanding of what OpenSolaris is and why you would want to use it; what its key features are and why they are important; and how you can use these to your best advantage.
There are lots of examples and technical details so that you can see that what Harry & Christine talk about is for real.  I would recommend this as part of your library.  But I would also recommend the OpenSolaris Bible.  The two complement each other nicely to complete the picture.


Solaris

OpenSolaris User Group Leaders Bootcamp

The keepers of the OpenSolaris Community took advantage of having a number of the User Group leaders at the CommunityOne conference this last week to set aside a day for a User Group Leaders' Bootcamp.  What a great opportunity to get together in the same room with folks working to create and sustain OpenSolaris user groups around the world!  We had folks from every continent - from Atlanta and Argentina, from Dallas and Serbia, from China and London, and on and on.  Something like twenty-five to thirty of the OpenSolaris User Groups were represented.

The whole day was a great experience.  It was great to see that as different as each group was, there were a lot of common themes for both successes and for challenges.  And a lot of great ideas were shared as to how to boost participation, to improve meetings, and to improve the success of the groups overall.  It will be exciting to hear a report back next year on how these ideas have played out.  Be sure to check out Jim Grisanzio's photos to see some of these characters and what all went on at CommunityOne and in the OSUG Bootcamp.

Jeff Jackson, Sr. VP for Solaris Engineering, started the day off with a greeting and charge to get the most out of this opportunity to meet with each other and with the OpenSolaris and Solaris headquarters teams.

Since the thing that brought this group together was a common focus on OpenSolaris User Groups and not the fact that we knew each other, we began the day with a bit of team-building exercise, courtesy of The Go Game.  This is a cross between a scavenger hunt and an improvisational acting class.  Teams criss-crossed downtown San Francisco trying to find and photograph places hinted at by clues on web pages.  At some venues, the teams had to act out and film various tasks.  For example, on the Yerba Buena lawn, the team had to engage in an impromptu Tai Chi exercise in order to find their long-lost phys ed teacher, Ms. Karpanski, who then led the team in creating a new exercise video.  Once we all returned, all of our submissions were voted on by the team and a winning team chosen.  Supposedly, we can see all these photos and videos.  Haven't yet found out how.  Perhaps, that's for the best!

In order for us to get to know each other's groups, each User Group prepared a poster describing the group, where we were located, what we do, what sort of members make up the group, and what makes us special.  Many of these posters were really well done!  We had a bit of a scavenger hunt for answers to questions found by careful reading of all of the posters.  It was really cool to see what sorts of projects some of the groups had undertaken and how they were working with various university or other organizations.

But the main part of the day was spent in a big brainstorming session.  We all identified our successes, our failures, our challenges, and ideas for the future.  We put all of these on several hundred post-it notes and placed them on large posters.  We grouped them by topic and then went through all of these.  Even though this only had an hour on the agenda, it ended up taking the bulk of the day.  Since this was the most important thing for us, we decided to rearrange the day to accommodate it.

From these sticky-notes, we found out that some of our groups were mostly focused on administrators but others had a large developer population.  We all have some sort of issues around meeting locations - whether it's a matter of access in the evening, finding a convenient location, or providing network access and power.
For most groups, having some sort of refreshments was important, though some groups felt like good refreshments attracted too many folks who just show up for the food.  There were a lot of good ideas around using a registration site to get access to the facility and order food, creating and using Facebook, LinkedIn, and Twitter, using IRC, interacting with the Sun Campus Ambassadors, and using MeetUp to find new members.  Many folks found it useful to video and make available presentations given at their meetings.  Some groups (for example in Japan) have special sub-groups for beginners.  Other groups are doing large-scale development projects, such as the Belenix project in Bangalore.

For me and the Atlanta OpenSolaris User Group, I have a lot of new ideas that I want to put out to our membership and our leaders - move back to monthly meetings, use a registration site, set up a presence on various social networks.  Many people said that folks come to the user groups in order to network and expand their circle of business acquaintances.  In light of the current economic situation, with so many smart people out of work, I am thinking of promoting our group with some of the job networking groups around Atlanta.  For example, my church, Roswell United Methodist Church, has one of the largest job networking groups in the Atlanta area.  Every two weeks, nearly 500 people meet to network and help each other in their job search.  Perhaps the many IT folks in this group might find this a way to get current and stay current in a whole new area.  At any rate, I am inspired to get things cranking at ATLOSUG!

After spending the afternoon working through our hundreds of sticky notes, the OpenSolaris Governing Board had a bit of a roundtable with us to talk about what they do and how we can work better together.  It was really helpful for me to hear from them and to get to put faces to some of the names for the folks I did not already know.

We finished out the evening with a great dinner at the Crab House at Pier 39.  From what I have seen, many of the photos from dinner and the meeting are already on Facebook, Flickr, and likely blogs.sun.com.  Jim Grisanzio, OpenSolaris Chief Photographer, was out in force with his camera!

Thanks so much to Teresa Giacomini, Lynn Rohrer, Dierdre Straughan, Jim Grisanzio, Tina Hartshorn, Wendy Ames, Kris Hake and everyone else who had a hand in organizing this event.  Thanks to Jeff Jackson, Bill Franklin, Chris Armes, Dan Roberts and all the other HQ folks who took the time to come and listen and interact with the leaders of these groups.  I know that I got a lot out of the meeting and am more eager than ever to promote and push forward with our user group.


Solaris

CommunityOne Recap

Last week, I had the opportunity to attend CommunityOne West in San Francisco, along with a number of the other leaders of OpenSolaris User Groups.  (I head up the Atlanta OpenSolaris User Group.)  What a great meeting!  Three days of OpenSolaris.

First off, I am sure that Teresa and the OpenSolaris team selected the Hotel Mosser because they knew it was a Solaris focused venue.  As Dave Barry would say, I am not making this up!  Even the toilet paper was Solaris-based.  Bob Netherton and I were speculating that perhaps this was an example of Solaris Roll-Based Dump Management, new in OpenSolaris 2009.06.

CommunityOne Day One

Day One was a full day of OpenSolaris and related talks.  The OpenSolaris teams maintained tracks around deploying OpenSolaris 2009.06 in the datacenter and around developing applications on OpenSolaris 2009.06.  For the most part, I stuck with the operations-focused sessions, though I did step out into a few others.  Some of the highlights included:

Peter Dennis and Brian Leonard's fun survey of what's new and exciting in OpenSolaris 2009.06.  ATLOSUG folks should look for a reprise of this at our meeting on Tuesday.

Jerry Jelinek's discussion of the various virtualization techniques built into and onto OpenSolaris.  This is the sort of talk that I give a lot.  It was really helpful to hear how the folks in engineering approach this topic.

Scott Tracy & Dan Maslowski's COMSTAR discussion and demo.  COMSTAR has been significantly expanded in recent builds, with more coolness still to come.  I had not paid a lot of attention to this lately and this was a really helpful talk, especially since Teresa Giacomini had asked me to present this demo for the user group leaders on Wednesday.  In any case, I have reproduced the iSCSI demo that Scott did using just VirtualBox, rather than requiring a server.  Of course, the VB version is not something I would run my main storage server on.  But it certainly is a great tool to understand the technology.  I hope to have Ryan Matteson (Ryan, you volunteered!) give a talk at ATLOSUG sometime soon.

I branched out of the main OpenSolaris path to see a few other things on Day One, as well.

Ken Pepple, Scott Mattoon, and John Stanford gave a good talk on Practical Cloud Patterns.  They talked about some of the typical ways that people do provisioning, application deployment, and monitoring within the cloud.

Karsten Wade, "Community Gardener" at Red Hat, gave a talk called Participate or Die.  This was about the importance of participating in the Open Source projects that are important to your business.  He talked about understanding the difference in participating (perhaps, using open source code) and influencing (helping to guide the project).  By paying more attention to those who actively participate, active members of the community enhance their status and become influencers of the direction for a project.  And it is important that this happen - in successful projects, the roadmap is driven by the participants rather than handed down from on high with the hope that people will line up behind it.  Really, I think, his key message was that it is important not to just passively stand by when you care about or depend upon something, leaving its future in the hands of others.

Kevin Nilson and Michael Van Riper gave a great talk about building and maintaining a successful user group.  This was built on their experiences with the Silicon Valley Java User Group and with the Google Technology User Group.
They took a great approach by collecting videos from the leaders, hosts, and participants in these and other groups around the country.  It was really helpful to hear people's perspectives on why they attend a group, why companies host group meetings, and why and how people continue to lead user groups.  While a lot of what they had to say, and the successes that they have had, are a product of being in a very "target-rich environment" in Silicon Valley, it was interesting to see that some things are universal: a good location makes a lot of difference; having food matters.  I got a lot of ideas from this and from the OpenSolaris User Group Bootcamp that I hope to get going in ATLOSUG.

The OpenSolaris 2009.06 Launch Party finished out the evening.  Dodgeball and the Extra Action Marching Band.  I thought these folks were the hit of the evening.  You get the best of marching bands, big drums, loud brass, but add to that folks flailing around, throwing themselves at the dodgeball court nets.  Much more exciting than your regular marching band, even some of the cool ones around Atlanta in the Battle of the Bands!

CommunityOne Day Two

Day Two was filled with OpenSolaris Deep Dives.  These were very helpful, not just in content, but in helping me to hone my own OpenSolaris presentations.  For this day, I stuck close to the Deploying OpenSolaris track, having learned in graduate school that I am not a developer.  This track included:

Chris Armes kicked off the day with a talk on deploying OpenSolaris in your Data Centre (as he spells it).

Becoming a ZFS Ninja, presented by Ben Rockwood.  Ben is an early adopter and a production user of ZFS.  This was a two-hour, fairly in-depth talk about ZFS and its capabilities.

Nick Solter, co-author of the OpenSolaris Bible, talked about OpenHA Cluster, newly released and available for OpenSolaris.  With OpenHA, enterprise-level availability is not just available, but also supported.  He talked about how the cluster works and about extensions to the OpenHA cluster beyond the capabilities of Solaris Cluster, based on OpenSolaris technologies.  Some of these include the use of Crossbow VNICs for private interconnects.  I am still thinking about the availability implications of this and am not sure it's an answer for all configurations.  But it's cool that it's there!

Jerry Jelinek rounded out the day talking about Resource Management with Containers, a topic near and dear to my heart and one I end up presenting a lot.

We finished out Day Two with a reunion dinner of some of the old team at Bucca di Beppo.  Around the table, we had Vasu Karunanithi, Dawit Bereket, Matt Ingenthron, Scott Dickson (me), Bob Netherton, Isaac Rosenfeld, and Kimberly Chang.  It was great to get at least part of the old gang together and catch up.

Day Three was the OpenSolaris User Group Leaders Bootcamp.  But that's for another post....


Solaris

DTrace and Performance Tools Seminar This Week - Atlanta, Ft. Lauderdale, Tampa

I'm doing briefings on DTrace and Solaris Performance Tools this week in Atlanta, Ft. Lauderdale, and Tampa.  Click the links below to register if this is of interest and you can attend.  Each is pretty much a 2 1/2 to 3 hour briefing that stays pretty technical, with lots of examples.

From the flyer:

Join us for our next Solaris 10 Technology Brief featuring DTrace.  DTrace, Solaris 10's powerful new framework for system observability, helps system administrators, capacity planners, and application developers improve performance and problem resolution.

DATE: May 12, 2009
LOCATION: Classroom Resource Group, Atlanta
TIME: 8:30 AM Registration, 9:00 am - 12:00 pm Session
DIRECTIONS: http://www.crgatlanta.com/directions.asp
REGISTER AT: http://www.suneventreg.com/cgi-bin/pup_registration.pl?EventID=2705

HOLLYWOOD, FL - May 13, 2009
LOCATION: Seminole Hardrock Hotel
TIME: 8:30 AM Registration, 9:00 am - 12:00 pm Session
DIRECTIONS: http://www.seminolehardrockhollywood.com/getting_here/directions.php
REGISTER: http://www.suneventreg.com/cgi-bin/pup_registration.pl?EventID=2706

TAMPA, FL - May 14, 2009
LOCATION: University of South Florida
TIME: 8:30 AM Registration, 9:00 am - 12:00 pm Session
DIRECTIONS: http://www.msc.usf.edu/directions.htm
REGISTER: https://www.suneventreg.com//cgi-bin/register.pl?EventID=2707

What You'll Learn
You can't improve what you can't see, and DTrace provides safe, production-quality, top to bottom observability - from the PHP application scripts down to the device drivers - without modifying applications or the system.  This seminar will introduce DTrace and the DTrace Toolkit as key parts of an overall Solaris performance and observability toolkit.

AGENDA:
8:30 AM to 9:00 AM    Check In, Continental Breakfast
9:00 AM to 9:10 AM    Welcome
9:10 AM to 10:15 AM   DTrace
10:15 AM to 10:30 AM  BREAK
10:30 AM to 11:30 AM  DTrace Continued
11:30 AM to 12:00 PM  Wrap Up, Q&A, Evaluations

We look forward to seeing you at one of these upcoming Solaris 10 DTrace sessions!


Solaris

How do you use Jumpstart?

Jumpstart is the technology within Solaris that allows a system to be remotely installed across a network.  This feature has been in the OS for a long, long time, dating to the start of Solaris 2.0, I believe.  With Jumpstart, the system to be installed, the Jumpstart client, contacts a Jumpstart server to be installed across the network.  This is a huge simplification, since there are nuances to how to set all of this up.  Your best bet is to check the Solaris 10 Installation Guide: Network Based Installations and the Solaris 10 Installation Guide: Custom Jumpstart and Advanced Installations.

Jumpstart makes use of rules to decide how to install a particular system, based on its architecture, network connectivity, hostname, disk and memory capacity, or any of a number of other parameters.  The rules select a profile that determines what will be installed on that system and where it will come from.  Scripts can be inserted before and after the installation for further customization.  To help manage the profiles and post-installation customization, Mike Ramchand has produced a fabulous tool, the Jumpstart Enterprise Toolkit (JET).

My Questions for You

As a long time Solaris admin, I have been a fan of Jumpstart for years and years.  As an SE visiting many cool companies, I have seen people do really interesting things with Jumpstart.  I want to capture how people use Jumpstart in the real world - not just the world of those who create the product.  I know that people come up with new and unique ways of using the tools that we create in ways we would never imagine.  For example, I once installed 600 systems with SunOS 4.1.4 in less than a week using Jumpstart - remember that Jumpstart never supported SunOS 4.1.4.

But, I am not just looking for the weird stories.  I want to know what Jumpstart features you use.  I'll follow this up with extra, detailed questions around Jumpstart Flash, WAN Boot, DHCP vs. RARP.  But I want to start with just some basics about Jumpstart.  Lacking a polling mechanism here at blogs.sun.com, you can just enter your responses as a comment.  Or you can answer these questions at SurveyMonkey here.  Or drop me a note at scott.dickson at sun.com.

How do you install Solaris systems in your environment?
I use Jumpstart
I use DVD or CD media
I do something else - please tell me about it

Do you have a system for automating your Jumpstart configurations?
Yes, we have written our own
Yes, we use JET
Yes, we use xVM OpCenter
No, we do interactive installations via Jumpstart.  We just use Jumpstart to get the bits to the client.

What system architectures do you support with Jumpstart?
SPARC
x86

Do you use a sysidcfg file to answer the system identification questions - hostname, network, IP address, naming service, etc.?
No, I answer these interactively
Yes, I hand-craft a sysidcfg file
Yes, but it is created via the Jumpstart automation tools

Do you use WANboot?  I'll follow up with more questions on this at a later time.
What's WANboot?
I have heard of it, but have never used it
We rely on WANboot

Do you use Jumpstart Flash?  More questions on this later, too.
Never heard of it
We sometimes use Flash
We live and breathe Flash

What sort of rules do you include in your rules file?
We do interactive installations and don't use a rules file
We use the rules files generated by our automation tools, like JET
We have a common rules file for all Jumpstarts based on hostname
We use not only hostnames but also other parameters to determine which rule to use for installation

Do you use begin scripts?
No
We use them to create derived profiles for installation
We use them some other way

Do you use finish scripts?
No
We use the finish scripts created by our automation
We use finish scripts to do some minor cleanup
We do extensive post-installation customization via finish scripts.  If so, please tell me about it.

Do you customize the list of packages to be installed via Jumpstart?
No
Somewhat
Not only do we customize the list of packages, but we create custom packages for our installation
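For anyone answering these questions who has only ever seen JET-generated files, here is a bare-bones sketch of a hand-rolled rules entry and the profile it selects.  The hostname, profile name, and finish script name below are placeholders of mine, not from any real configuration; only the general layout is standard Jumpstart.

# rules: rule_keyword rule_value  begin_script  profile  finish_script
hostname webclient01  -  basic.profile  cleanup.fin

# basic.profile: a minimal initial install selected by the rule above
install_type    initial_install
system_type     standalone
partitioning    default
cluster         SUNWCreq
package         SUNWman add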


Solaris

A Much Better Way to use Flash and ZFS Boot

A Different Approach

A week or so ago, I wrote about a way to get around the current limitation of mixing flash and ZFS root in Solaris 10 10/08.  Well, here's a much better approach.

I was visiting with a customer last week and they were very excited to move forward quickly with ZFS boot in their Solaris 10 environment, even to the point of using this as a reason to encourage people to upgrade.  However, when they realized that it was impossible to use Flash with Jumpstart and ZFS boot, they were disappointed.  Their entire deployment infrastructure is built around using not just Flash, but Secure WANboot.  This means that they have no alternative to Flash; the images deployed via Secure WANboot are always flash archives.  So, what to do?

It occurred to me that in general, the upgrade procedure from a pre-10/08 update of Solaris 10 to Solaris 10 10/08 with a ZFS root disk is a two-step process.  First, you have to upgrade to Solaris 10 10/08 on UFS and then use lucreate to copy that environment to a new ZFS ABE.  Why not use this approach in Jumpstart?  Turns out that it works quite nicely.  This is a framework for how to do that.  You likely will want to expand on it, since one thing this does not do is give you any indication of progress once it starts the conversion.

Here's the general approach:

Create your flash archive for Solaris 10 10/08 as you usually would.  Make sure you include all the appropriate Live Upgrade patches in the flash archive.

Use Jumpstart to deploy this flash archive to one disk in the target system.

Use a finish script to add a conversion program to run when the system reboots for the first time.  It is necessary to make this script run once the system has rebooted so that the LU commands run within the context of the fully built new system.

Details of this approach

Our goal when complete is to have the flash archive installed as it always has been, but to have it running from a ZFS root pool, preferably a mirrored ZFS pool.  The conversion script requires two phases to complete this conversion.  The first phase creates the ZFS boot environment and the second phase mirrors the root pool.  In this example, our flash archive is called s10u6s.flar.  We will install the initial flash archive onto the disk c0t1d0 and build our initial root pool on c0t0d0.

Here is the Jumpstart profile used in this example:

install_type flash_install
archive_location nfs nfsserver:/export/solaris/Solaris10/flash/s10u6s.flar
partitioning explicit
filesys c0t1d0s1 1024 swap
filesys c0t1d0s0 free /

We specify a simple finish script for this system to copy our conversion script into place:

cp ${SI_CONFIG_DIR}/S99xlu-phase1 /a/etc/rc2.d/S99xlu-phase1

You see what we have done: we put a new script into place to run at the end of rc2 during the first boot.  We name the script so that it is the last thing to run.  The x in the name makes sure that this will run after other S99 scripts that might be in place.  As it turns out, the luactivate that we will do puts its own S99 script in place, and we want to come after that.  Naming ours S99x makes it happen later in the boot sequence.

So, what does this magic conversion script do?  Let me outline it for you:

Create a new ZFS pool that will become our root pool
Create a new boot environment in that pool using lucreate
Activate the new boot environment
Add the script to be run during the second phase of the conversion
Clean up a bit and reboot

That's Phase 1.
Phase 2 has its own script, to be run at the same time, that finishes the mirroring of the root pool.  If you are satisfied with a non-mirrored pool, you can stop here and leave phase 2 out.  Or you might prefer to make this step a manual process once the system is built.  But, here's what happens in Phase 2:

Delete the old boot environment
Add a boot block to the disk we just freed.  This example is SPARC, so use installboot.  For x86, you would do something similar with installgrub.
Attach the disk we freed from the old boot environment as a mirror of the device used to build the new root zpool.
Clean up and reboot.

I have been thinking it might be worthwhile to add a third phase to start a zpool scrub, which will force the newly attached drive to be resilvered when it reboots.  The first time something goes to use this drive, it will notice that it has not been synced to the master drive and will resilver it, so this is sort of optional.

The reason we add bootability explicitly to this drive is because currently, when a mirror is attached to a root zpool, a boot block is not automatically installed.  If the master drive were to fail and you were left with only the mirror, this would leave the system unbootable.  By adding a boot block to it, you can boot from either drive.

So, here's my simple little script that got installed as /etc/rc2.d/S99xlu-phase1.  Just to make the code a little easier for me to follow, I first create the script for phase 2, then do the work of phase 1.

cat > /etc/rc2.d/S99xlu-phase2 << EOF
ludelete -n s10u6-ufs
installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t1d0s0
zpool attach -f rpool c0t0d0s0 c0t1d0s0
rm /etc/rc2.d/S99xlu-phase2
init 6
EOF
dumpadm -d swap
zpool create -f rpool c0t0d0s0
lucreate -c s10u6-ufs -n s10u6 -p rpool
luactivate -n s10u6
rm /etc/rc2.d/S99xlu-phase1
init 6

I think that this is a much better approach than the one I offered before, using ZFS send.  This approach uses standard tools to create the new environment and it allows you to continue to use Flash as a way to deploy archives.  The dependency is that you must have two drives on the target system.  I think that's not going to be a hardship, since most folks will use two drives anyway.  You will have to keep them as separate drives rather than using hardware mirroring.  The underlying assumption is that you previously used SVM or VxVM to mirror those drives.

So, what do you think?  Better?  Is this helpful?  Hopefully, this is a little Christmas present for someone!  Merry Christmas and Happy New Year!
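For completeness, the rules file that ties all of this together just needs a line pointing at the flash profile above and at a finish script that copies S99xlu-phase1 into place.  The hostname and file names below are placeholders of my own; only the column layout (rule, begin script, profile, finish script) is standard Jumpstart.

# rules entry for a client being converted to ZFS root
hostname zfsconvert01  -  s10u6s-flash.profile  copy-S99xlu-phase1.fin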


Solaris

Flashless System Cloning with ZFS

Ancient History

Gather round, kiddies, and let Grandpa tell you a tale of how we used to clone systems before we had Jumpstart and Flash, when we had to carry water in leaky buckets 3 miles through snow up to our knees, uphill both ways.

Long ago, a customer of mine needed to deploy 600(!) SPARCstation 5 desktops all running SunOS 4.1.4.  Even then, this was an old operating system, since Solaris 2.6 had recently been released.  But it was what their application required.  And we only had a few days to build and deploy these systems.  Remember that Jumpstart did not exist for SunOS 4.1.4, and Flash did not exist for Solaris 2.6.  So, our approach was to build a system, a golden image, the way we wanted it to be deployed and then use ufsdump to save the contents of the filesystems.  Then, we were able to use Jumpstart from a Solaris 2.6 server to boot each of these workstations.  Instead of having a Jumpstart profile, we only used a finish script that partitioned the disks and restored the ufsdump images.  So Jumpstart just provided us a clean way to boot these systems and apply the scripts we wanted to them.

Solaris 10 10/08, ZFS, Jumpstart and Flash

Now, we have a bit of a similar situation.  Solaris 10 10/08 introduces ZFS boot to Solaris, something that many of my customers have been anxiously awaiting for some time.  A system can be deployed using Jumpstart and the ZFS boot environment created as a part of the Jumpstart process.  But.  There's always a but, isn't there?  But, at present, Flash archives are not supported (and in fact do not work) as a way to install into a ZFS boot environment, either via Jumpstart or via Live Upgrade.  Turns out, they use the same mechanism under the covers for this.  This is CR 6690473.

So, how can I continue to use Jumpstart to deploy systems, and continue to use something akin to Flash archives to speed and simplify the process?  Turns out the lessons we learned years ago can be used, more or less.  Combine the idea of the ufsdump with some of the ideas that Bob Netherton recently blogged about (Solaris and OpenSolaris coexistence in the same root zpool), and you can get to a workaround that might be useful enough to get you through until Flash really is supported with ZFS root.

Build a "Golden Image" System

The first step, as with Flash, is to construct a system that you want to replicate.  The caveat here is that you use ZFS for the root of this system.  For this example, I have left /var as part of the root filesystem rather than a separate dataset, though this process could certainly be tweaked to accommodate a separate /var.

Once the system to be cloned has been built, you save an image of the system.  Rather than using flarcreate, you will create a ZFS send stream and capture this in a file.  Then move that file to the Jumpstart server, just as you would with a flash archive.  In this example, the ZFS bootfs has the default name - rpool/ROOT/s10s_u6wos_07.

golden# zfs snapshot rpool/ROOT/s10s_u6wos_07@flar
golden# zfs send -v rpool/ROOT/s10s_u6wos_07@flar > s10s_u6wos_07_flar.zfs
golden# scp s10s_u6wos_07_flar.zfs js-server:/flashdirectory

How do I get this on my new server?

Now, we have to figure out how to have this ZFS send stream restored on the new clone systems.  We would like to take advantage of the fact that Jumpstart will create the root pool for us, along with the dump and swap volumes, and will set up all of the needed bits for booting from ZFS.
So, let's install the minimum Solaris set of packages just to get these side effects.  Then, we will use Jumpstart finish scripts to create a fresh ZFS dataset and restore our saved image into it.  Since this new dataset will contain the old identity of the original system, we have to reset our system identity.  But once we do that, we are good to go.

So, set up the cloned system as you would for a hands-free Jumpstart.  Be sure to specify the sysid_config and install_config bits in the /etc/bootparams.  The manual Solaris 10 10/08 Installation Guide: Custom JumpStart and Advanced Installations covers how to do this.  We add to the rules file a finish script (I called mine loadzfs in this case) that will do the heavy lifting.  Once Jumpstart installs Solaris according to the profile provided, it then runs the finish script to finish up the installation.

Here is the Jumpstart profile I used.  This is a basic profile that installs the base, required Solaris packages into a ZFS pool mirrored across two drives.

install_type initial_install
cluster SUNWCreq
system_type standalone
pool rpool auto auto auto mirror c0t0d0s0 c0t1d0s0
bootenv installbe bename s10u6_req

The finish script is a little more interesting since it has to create the new ZFS dataset, set the right properties, fill it up, reset the identity, etc.  Below is the finish script that I used.

#!/bin/sh -x
# TBOOTFS is a temporary dataset used to receive the stream
TBOOTFS=rpool/ROOT/s10u6_rcv
# NBOOTFS is the final name for the new ZFS dataset
NBOOTFS=rpool/ROOT/s10u6f
MNT=/tmp/mntz
FLAR=s10s_u6wos_07_flar.zfs
NFS=serverIP:/export/solaris/Solaris10/flash

# Mount directory where archive (send stream) exists
mkdir ${MNT}
mount -o ro -F nfs ${NFS} ${MNT}

# Create file system to receive ZFS send stream &
# receive it.  This creates a new ZFS snapshot that
# needs to be promoted into a new filesystem
zfs create ${TBOOTFS}
zfs set canmount=noauto ${TBOOTFS}
zfs set compression=on ${TBOOTFS}
zfs receive -vF ${TBOOTFS} < ${MNT}/${FLAR}

# Create a writeable filesystem from the received snapshot
zfs clone ${TBOOTFS}@flar ${NBOOTFS}

# Make the new filesystem the top of the stack so it is not dependent
# on other filesystems or snapshots
zfs promote ${NBOOTFS}

# Don't automatically mount this new dataset, but allow it to be mounted
# so we can finalize our changes.
zfs set canmount=noauto ${NBOOTFS}
zfs set mountpoint=${MNT} ${NBOOTFS}

# Mount newly created replica filesystem and set up for
# sysidtool.  Remove old identity and provide new identity
umount ${MNT}
zfs mount ${NBOOTFS}

# This section essentially forces sysidtool to reset system identity at
# the next boot.
touch /a/${MNT}/reconfigure
touch /a/${MNT}/etc/.UNCONFIGURED
rm /a/${MNT}/etc/nodename
rm /a/${MNT}/etc/.sysIDtool.state
cp ${SI_CONFIG_DIR}/sysidcfg /a/${MNT}/etc/sysidcfg

# Now that we have finished tweaking things, unmount the new filesystem
# and make it ready to become the new root.
zfs umount ${NBOOTFS}
zfs set mountpoint=/ ${NBOOTFS}
zpool set bootfs=${NBOOTFS} rpool

# Get rid of the leftovers
zfs destroy ${TBOOTFS}
zfs destroy ${NBOOTFS}@flar

When we jumpstart the system, Solaris is installed, but it really isn't used.  Then, we load from the send stream a whole new OS dataset, make it bootable, set our identity in it, and use it.  When the system is booted, Jumpstart still takes care of updating the boot archives in the new bootfs.  On the whole, this is a lot more work than Flash, and is really not as flexible or as complete.
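(A quick aside on the sysidcfg file that the finish script copies in: it is just an ordinary Solaris 10 sysidcfg file.  A rough sketch is below; every value shown is a placeholder of mine, not something taken from this setup.)

# sysidcfg - placeholder values only; adjust for your own site
system_locale=C
timezone=US/Eastern
terminal=vt100
security_policy=NONE
name_service=NONE
nfs4_domain=dynamic
root_password=PutAnEncryptedHashHere
network_interface=primary {hostname=clone01
                           ip_address=192.168.10.50
                           netmask=255.255.255.0
                           protocol_ipv6=no
                           default_route=192.168.10.1}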
But hopefully, until Flash is supported with a ZFS root and Jumpstart, this might at least give you an idea of how you can replicate systems and do installations that do not have to revert back to package-based installation.  Many people use Flash as a form of disaster recovery.  I think that this same approach might be used there as well.  Still not as clean or complete as Flash, but it might work in a pinch.

So, what do you think?  I would love to hear comments on this as a stop-gap approach.


Everything Else

CEC Day One & Two Recap

So, CEC is actually almost over.  It's been a whirlwind of sessions, meet-ups with folks, filling my head with new stuff.  And, of course, in the midst of all of the CEC excitement, there's still the need to keep up with what customers back home need.

So, what was exciting from days one and two?  Lots!

Jon Haslam and Simon Ritter gave a great talk and demo about using DTrace along with Java.  I am absolutely not a developer; I've never even written "Hello World" in Java.  But this really helped me understand how DTrace and Java are two great tastes that go great together.  And with the newer JVMs, it really is a case of "Hey, you got your DTrace in my Java!", "No, you got your Java in my DTrace!"  This all comes at a great time -- I have to do a presentation on Wednesday in Florida on exactly this topic.

Matt Ingenthron and Shanti gave a great talk about the various working parts and commonly used components and tools in a modern web infrastructure.  It really helped me figure out how the pieces fit together.

Tim Cook had a great talk comparing the various file system offerings from Sun and others for OLTP workloads on large systems.  He gave us some handy, simple best practices for each and worked to bust some commonly held myths and misconceptions.

Tim Bray shared his perspective on what really is important about a Web 2.0 world, and about how the things in that world can really matter to an enterprise.  He talked about the fact that, in the end, time to market and manageability are the overwhelming priorities for enterprises in selecting tools and techniques for application development and deployment.  I am really inspired to go out and finally learn more about Ruby and Rails as a result.

Of course, there were more.  These are just some of the highlights that come to mind quickly.  As always, CEC was a great trip and well worth the effort (but I still dislike Las Vegas - a lot).  And like Juan Antonio Samaranch at the Olympics, this CEC is about to be declared over, realized to be the best yet, and we will agree to meet again next year.  I, for one, am looking forward to it.  Time to start working on a topic for my presentation!


Everything Else

CEC2007 - Early Day One - Initial Impressions

So, it's the first day of CEC, Sun's Customer Engineering Conference.  This year, there are about 4000 of us hanging out at the Paris & Bally's hotels in Las Vegas.  Systems Engineers, folks from Sun's various practices, Service & Support engineers, architects, and folks from headquarters engineering are all here.  But we also have a huge number of our partners - resellers, OEMs, developers, etc.

Last night was our Networking Reception.  Great to see folks again that I had not seen in a while and to meet lots of new faces.

Today, we start with opening sessions from Hal Stern, Dan Berg, Jim Baty, and a host of others.  Then, we get into, for me, the guts of CEC - the breakout sessions.  There are over 240 sessions, selected from a pool of over 700 submissions.  I'm talking (Tuesday, 6PM, Versailles ballroom 3 & 4) on Dynamic Resource Pools in Solaris 10.  I'll post my slides after the talk.  If you are at the conference, come on over.  I understand my talk will also be available in Second Life.  I'm still trying to figure out how all of that works, though.

Here are some of my initial + and - observations from CEC so far:

Plus - Paris is great.  Very lovely hotel.  The look really captures all that you might remember and love from Paris.
Plus - I scored a deluxe room - corner room, view of the Bellagio fountains, windows on two walls.
Plus - Check-in logistics.  Got through even the really long materials line in less than 10 minutes.
Plus - Networking Reception - Food was good and plentiful.  Double plus for the desserts.  Great to see folks.  Last year, I missed the reception since I got in late.
Plus / Minus - In-room network.  Fastest hotel network I have had in as long as I can recall.  But it costs $13/day.
Minus - Room for meals was really, really, really crowded for breakfast.  I can only imagine as folks try to rush through for lunch.  And no sodas.  Last year, folks finally got it that geeks often take their caffeine in a carbonated form.
Minus - Having the agenda only on-line via schedule builder has made it sort of inconvenient to select sessions, alter your plans, and pick new things on the fly.  Same as last year.  Sometimes paper really is useful.
Minus - Smoke - Las Vegas is smoky.  Seems that they are managing it better now than in years past, but in these days of smoke-free public spaces, it's really noticeable.
Big Minus - For me, Las Vegas is absolutely not my top choice for a venue.  For me, this is a very uncomfortable place.  Maybe I'm just a stick in the mud or a prude or old in my thinking, but this town is just about too many things that really make me uncomfortable.

All in all, though, I am excited about a great conference and expect to be really tired when I get home!


Everything Else

I've been everywhere man, I've been everywhere

I feel like that Johnny Cash song (which I think maybe Jimmy Rodgers did first - can't recall for sure).  Seems like for the last several months, I've been on the road doing Solaris bootcamps, best practices workshops, and all sorts of other things Solaris.  I've seen a lot of interesting places and met lots of interesting folks.  Just the last few weeks, I've been to:

Bismarck, ND, Sioux Falls, SD, and Fargo, ND for University Solaris Bootcamps.  Got to see lots of that area driving from one to another across the secondary highways.  Thanks to Greg Stromme from Applied Engineering, Sun's reseller partner in that geography, for driving me and showing me places I'd never been before.  We saw the homeplaces of Lawrence Welk and Laura Ingalls Wilder, plus lots of wide-open territory.

Conway, Arkansas for a Solaris resource management workshop.  Got to see a cousin in Russellville this trip.

Austin, Texas for a Solaris virtualization workshop.

Baton Rouge, LA for a University Solaris bootcamp.  Got to see a cousin here, too.

Huntsville, AL for various Solaris briefings.

And that's just the last six weeks!  I'm kind of thankful for the end of the quarter and the year coming up.  I have no tickets booked until the end of July right now!

Powered by ScribeFire.


Solaris

Fun with zvols - UFS on a zvol

Continuing with some of the ideas around zvols, I wondered about UFS on a zvol.  On the surface, this appears to be sort of redundant and not really very sensible.  But thinking about it, there are some real advantages.

I can take advantage of the data integrity and self-healing features of ZFS, since this is below the filesystem layer.
I can easily create new volumes for filesystems and grow existing ones.
I can make snapshots of the volume, sharing the ZFS snapshot flexibility with UFS - very cool.
In the future, I should be able to do things like have an encrypted UFS (sort-of) and secure deletion.

Creating UFS filesystems on zvols

Creating a UFS filesystem on a zvol is pretty trivial.  In this example, we'll create a mirrored pool and then build a UFS filesystem in a zvol.

bash-3.00# zpool create p mirror c2t10d0 c2t11d0 mirror c2t12d0 c2t13d0
bash-3.00# zfs create -V 2g p/v1
bash-3.00# zfs list
NAME     USED  AVAIL  REFER  MOUNTPOINT
p       4.00G  29.0G  24.5K  /p
p/v1    22.5K  31.0G  22.5K  -
bash-3.00# newfs /dev/zvol/rdsk/p/v1
newfs: construct a new file system /dev/zvol/rdsk/p/v1: (y/n)? y
Warning: 2082 sector(s) in last cylinder unallocated
/dev/zvol/rdsk/p/v1:    4194270 sectors in 683 cylinders of 48 tracks, 128 sectors
        2048.0MB in 43 cyl groups (16 c/g, 48.00MB/g, 11648 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
 3248288, 3346720, 3445152, 3543584, 3642016, 3740448, 3838880, 3937312,
 4035744, 4134176
bash-3.00# mkdir /fs1
bash-3.00# mount /dev/zvol/dsk/p/v1 /fs1
bash-3.00# df -h /fs1
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v1     1.9G   2.0M   1.9G     1%    /fs1

Nothing much to it.

Growing UFS filesystems on zvols

But, what if I run out of space?  Well, just as you can add disks to a volume and grow the size of the volume, you can grow the size of a zvol.  Now, since the UFS filesystem is a data structure inside the zvol container, you have to grow it as well.  Were I using just ZFS, the size of the file system would grow and shrink dynamically with the size of the data in the file system.  But a UFS has a fixed size, so it has to be expanded manually to accommodate the enlarged volume.  Now, this seems to have quit working between b45 and b53, so I just filed a bug on this one.

bash-3.00# uname -a
SunOS atl-sewr-158-154 5.11 snv_45 sun4u sparc SUNW,Sun-Fire-480R
bash-3.00# zfs create -V 1g bsd/v1
bash-3.00# newfs /dev/zvol/rdsk/bsd/v1
...
bash-3.00# zfs set volsize=2g bsd/v1
bash-3.00# growfs /dev/zvol/rdsk/bsd/v1
Warning: 2048 sector(s) in last cylinder unallocated
/dev/zvol/rdsk/bsd/v1:  4194304 sectors in 683 cylinders of 48 tracks, 128 sectors
        2048.0MB in 49 cyl groups (14 c/g, 42.00MB/g, 20160 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 86176, 172320, 258464, 344608, 430752, 516896, 603040, 689184, 775328,
 3359648, 3445792, 3531936, 3618080, 3704224, 3790368, 3876512, 3962656,
 4048800, 4134944

What about compression?

Along the same lines as growing the file system, I suppose you could turn compression on for the zvol.  But since the UFS is of fixed size, it won't help especially, as far as fitting more data in the file system.  You can't put more into the filesystem than the filesystem thinks that it can hold, even if it isn't using that much on the disk.  Here's a little demonstration of that.

First, we will loop through, creating 200MB files in a 1GB file system with no compression.
We will use blocks of zeros, since these will compress quite a bit the second time round.

bash-3.00# zfs create -V 1g p/v1
bash-3.00# zfs get used,volsize,compressratio p/v1
NAME  PROPERTY       VALUE    SOURCE
p/v1  used           22.5K    -
p/v1  volsize        1G       -
p/v1  compressratio  1.00x    -
bash-3.00# newfs /dev/zvol/rdsk/p/v1
...
bash-3.00# mount /dev/zvol/dsk/p/v1 /fs1
bash-3.00#
bash-3.00# for f in f1 f2 f3 f4 f5 f6 f7 ; do
> dd if=/dev/zero bs=1024k count=200 of=/fs1/$f
> df -h /fs1
> zfs get used,volsize,compressratio p/v1
> done
200+0 records in
200+0 records out
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v1     962M   201M   703M    23%    /fs1
NAME  PROPERTY       VALUE    SOURCE
p/v1  used           62.5M    -
p/v1  volsize        1G       -
p/v1  compressratio  1.00x    -
200+0 records in
200+0 records out
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v1     962M   401M   503M    45%    /fs1
NAME  PROPERTY       VALUE    SOURCE
p/v1  used           149M     -
p/v1  volsize        1G       -
p/v1  compressratio  1.00x    -
200+0 records in
200+0 records out
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v1     962M   601M   303M    67%    /fs1
NAME  PROPERTY       VALUE    SOURCE
p/v1  used           377M     -
p/v1  volsize        1G       -
p/v1  compressratio  1.00x    -
200+0 records in
200+0 records out
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v1     962M   801M   103M    89%    /fs1
NAME  PROPERTY       VALUE    SOURCE
p/v1  used           497M     -
p/v1  volsize        1G       -
p/v1  compressratio  1.00x    -
dd: unexpected short write, wrote 507904 bytes, expected 1048576
161+0 records in
161+0 records out
Dec  1 14:53:04 atl-sewr-158-122 ufs: NOTICE: alloc: /fs1: file system full
bash-3.00# zfs get used,volsize,compressratio p/v1
NAME  PROPERTY       VALUE    SOURCE
p/v1  used           1.00G    -
p/v1  volsize        1G       -
p/v1  compressratio  1.00x    -
bash-3.00#

So, you see that it fails as it writes the 5th 200MB chunk, which is what you would expect.
Now, let's do the same thing with compression turned on for the volume.

bash-3.00# zfs create -V 1g p/v2
bash-3.00# zfs set compression=on p/v2
bash-3.00# newfs /dev/zvol/rdsk/p/v2
...
bash-3.00#
bash-3.00# mount /dev/zvol/dsk/p/v2 /fs2
bash-3.00# for f in f1 f2 f3 f4 f5 f6 f7 ; do
> dd if=/dev/zero bs=1024k count=200 of=/fs2/$f
> df -h /fs2
> zfs get used,volsize,compressratio p/v2
> done
200+0 records in
200+0 records out
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v2     962M   201M   703M    23%    /fs2
NAME  PROPERTY       VALUE    SOURCE
p/v2  used           8.58M    -
p/v2  volsize        1G       -
p/v2  compressratio  7.65x    -
200+0 records in
200+0 records out
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v2     962M   401M   503M    45%    /fs2
NAME  PROPERTY       VALUE    SOURCE
p/v2  used           8.58M    -
p/v2  volsize        1G       -
p/v2  compressratio  7.65x    -
200+0 records in
200+0 records out
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v2     962M   601M   303M    67%    /fs2
NAME  PROPERTY       VALUE    SOURCE
p/v2  used           8.83M    -
p/v2  volsize        1G       -
p/v2  compressratio  7.50x    -
200+0 records in
200+0 records out
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v2     962M   801M   103M    89%    /fs2
NAME  PROPERTY       VALUE    SOURCE
p/v2  used           8.83M    -
p/v2  volsize        1G       -
p/v2  compressratio  7.50x    -
dd: unexpected short write, wrote 507904 bytes, expected 1048576
161+0 records in
161+0 records out
Dec  1 15:16:42 atl-sewr-158-122 ufs: NOTICE: alloc: /fs2: file system full
bash-3.00# zfs get used,volsize,compressratio p/v2
NAME  PROPERTY       VALUE    SOURCE
p/v2  used           9.54M    -
p/v2  volsize        1G       -
p/v2  compressratio  7.07x    -
bash-3.00# df -h /fs2
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v2     962M   962M     0K   100%    /fs2
bash-3.00#

This time, even though the volume was not using much space at all, the file system was full.  So compression in this case isn't especially valuable from a space management standpoint.  Depending on the contents of the filesystem, compression may still help performance by converting multiple I/Os into single or fewer I/Os, though.

The Cool Stuff - Snapshots and Clones with UFS on Zvols

One of the things that is not available in UFS is the ability to create multiple snapshots quickly and easily.  The fssnap(1M) command allows me to create a single, read-only snapshot of a UFS file system.  In addition, it requires an additional location to maintain backing store for files changed or deleted in the master image during the lifetime of the snapshot.

ZFS offers the ability to create many snapshots of a ZFS filesystem quickly and easily.  This ability extends to zvols, as it turns out.

For this example, we will create a volume, fill it up with some data and then play around with taking some snapshots of it.  We will just tar over the Java JDK so there are some files in the file system.
bash-3.00# zfs create -V 2g p/v1
bash-3.00# newfs /dev/zvol/rdsk/p/v1
...
bash-3.00# mount /dev/zvol/dsk/p/v1 /fs1
bash-3.00# tar cf -  ./jdk/ | (cd /fs1 ; tar xf - )
bash-3.00# df -h /fs1
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v1     1.9G   431M   1.5G    23%    /fs1
bash-3.00# zfs list
NAME     USED  AVAIL  REFER  MOUNTPOINT
p       4.00G  29.0G  24.5K  /p
p/swap  22.5K  31.0G  22.5K  -
p/v1     531M  30.5G   531M  -

Now, we will create a snapshot of the volume, just like for any other ZFS file system.  As it turns out, this creates new device nodes in /dev/zvol for the block and character devices.  We can mount them as UFS file systems same as always.

bash-3.00# zfs snapshot p/v1@s1                      # Make the snapshot
bash-3.00# zfs list                                  # See that it's really there
NAME      USED  AVAIL  REFER  MOUNTPOINT
p        4.00G  29.0G  24.5K  /p
p/swap   22.5K  31.0G  22.5K  -
p/v1      531M  30.5G   531M  -
p/v1@s1      0      -   531M  -
bash-3.00# mkdir /fs1-s1
bash-3.00# mount  /dev/zvol/dsk/p/v1@s1 /fs1-s1      # Mount it
mount: /dev/zvol/dsk/p/v1@s1 write-protected         # Snapshots are read-only, so this fails
bash-3.00# mount -o ro  /dev/zvol/dsk/p/v1@s1 /fs1-s1    # Mount again read-only
bash-3.00# df -h /fs1-s1 /fs1
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v1@s1
                       1.9G   431M   1.5G    23%    /fs1-s1
/dev/zvol/dsk/p/v1     1.9G   431M   1.5G    23%    /fs1
bash-3.00#

At this point /fs1-s1 is a read-only snapshot of /fs1.  If I delete files, create files, or change files in /fs1, that change will not be reflected in /fs1-s1.

bash-3.00# ls /fs1/jdk
instances    jdk1.5.0_08  jdk1.6.0     latest       packages
bash-3.00# rm -rf /fs1/jdk/instances
bash-3.00# df -h /fs1 /fs1-s1
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v1     1.9G    61M   1.8G     4%    /fs1
/dev/zvol/dsk/p/v1@s1
                       1.9G   431M   1.5G    23%    /fs1-s1
bash-3.00#

Just as with a regular ZFS file system, you can create multiple snapshots.  And as with any other ZFS file system, you can rollback a snapshot and make it the master again.  You have to unmount the filesystem in order to do this, since the rollback is at the volume level.  Changing the volume underneath the UFS filesystem would leave UFS confused about the state of things.  But, ZFS catches this, too.

bash-3.00# ls /fs1/jdk/
jdk1.5.0_08  jdk1.6.0     latest       packages
bash-3.00# rm /fs1/jdk/jdk1.6.0
bash-3.00# ls /fs1/jdk/
jdk1.5.0_08  latest       packages
bash-3.00# zfs list
NAME      USED  AVAIL  REFER  MOUNTPOINT
p        4.00G  29.0G  24.5K  /p
p/swap   22.5K  31.0G  22.5K  -
p/v1      535M  30.5G   531M  -
p/v1@s1  4.33M      -   531M  -
bash-3.00# zfs rollback p/v1@s2                      # /fs1 is still mounted
cannot remove device links for 'p/v1': dataset is busy
bash-3.00# umount /fs1
bash-3.00# zfs rollback p/v1@s2
bash-3.00# mount /dev/zvol/dsk/p/v1 /fs1
bash-3.00# ls /fs1/jdk
jdk1.5.0_08  jdk1.6.0     latest       packages
bash-3.00#

I can create additional read-write instances of a volume by cloning the snapshot.
The clone and the master file system will share the same objects on-disk for data that remains unchanged, while new on-disk objects will be created for any files that are changed either in the master or in the clone.

bash-3.00# ls /fs1/jdk
jdk1.5.0_08  jdk1.6.0     latest       packages
bash-3.00# zfs snapshot p/v1@s1
bash-3.00# zfs clone p/v1@s1 p/c1
bash-3.00# zfs list
NAME      USED  AVAIL  REFER  MOUNTPOINT
p        4.00G  29.0G  24.5K  /p
p/c1         0  29.0G   531M  -
p/swap   22.5K  31.0G  22.5K  -
p/v1      531M  30.5G   531M  -
p/v1@s1      0      -   531M  -
bash-3.00# mkdir /c1
bash-3.00# mount /dev/zvol/dsk/p/c1 /c1
bash-3.00# ls /c1/jdk
jdk1.5.0_08  jdk1.6.0     latest       packages
bash-3.00# df -h /fs1 /c1
Filesystem             size   used  avail capacity  Mounted on
/dev/zvol/dsk/p/v1     1.9G    61M   1.8G     4%    /fs1
/dev/zvol/dsk/p/c1     1.9G    61M   1.8G     4%    /c1
bash-3.00#

I am pretty sure that this isn't exactly what the ZFS guys had in mind when they set out to build all of this, but this is pretty cool.  Now, I can create UFS snapshots without having to specify a backing store.  I can create clones, promote the clones to the master, and do the other things that I can do in ZFS.  I still have to manage the mounts myself, but I'm better off than before.

I have not tried any sort of performance testing on these.  Dominic Kay has just written a nice blog about using filebench to compare ZFS and VxFS.  Maybe I can use some of that work to see how things go with UFS on top of ZFS.

As always, comments, etc. are welcome!
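P.S. The one thing the transcripts above don't show is promotion.  A minimal, untested sketch, reusing the dataset names from the clone example; the destroy step is optional and assumes you really are finished with the old master:

# make the clone the origin; p/v1 becomes a clone of p/c1's snapshot
zfs promote p/c1
# stop using the old master, then (optionally) retire it
umount /fs1
zfs destroy p/v1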


Solaris

Fun with zvols - Swap on a zvol

I mentioned recently that I just spent a week in a ZFS internals TOI. Got a few ideas to play with there that I will share. Hopefully folks might have suggestions as to how to improve / test / validate some of these things.

ZVOLs as Swap

The first thing that I thought about was using ZFS as a swap device. Of course, this is right there in the zfs(1M) man page as an example, but it still deserves a mention here.  There has been some discussion of this on the zfs-discuss list at opensolaris.org (I just retyped that dot four times thinking it was a comma. Turns out there was crud on my laptop screen).  The dump device cannot be on a zvol (at least if you want to catch a crash dump), but this still gives a lot of flexibility.  With root on ZFS (coming before too long), ZFS swap makes a lot of sense and is the natural choice. We were talking in class that maybe it would be nice if there were a way to turn off ZFS's caching for the swap volume to improve performance, but that remains to be seen.

At any rate, setting up mirrored swap with ZFS is way simple! Much simpler even than with SVM, which in turn is simpler than VxVM. Here's all it takes:

bash-3.00# zpool create -f p mirror c2t10d0 c2t11d0
bash-3.00# zfs create -V 2g p/swap
bash-3.00# swap -a /dev/zvol/dsk/p/swap

Pretty darn simple, if you ask me. You can make it permanent by changing the lines for swap in your /etc/vfstab (below).  Notice that you use the path to the zvol in the /dev tree rather than the ZFS dataset name.

bash-3.00# cat /etc/vfstab
#device         device          mount           FS      fsck    mount   mount
#to mount       to fsck         point           type    pass    at boot options
#
#/dev/dsk/c1t0d0s1      -       -               swap    -       no      -
/dev/zvol/dsk/p/swap    -       -               swap    -       no      -

I would like to do some performance testing to see what kind of performance you can get with swap on a zvol.  I am curious about how this will affect kernel memory usage.  I am curious about the effect of things like compression on the swap volume.  Thinking about that one, it doesn't make a lot of sense.  I am also curious about the ability to dynamically change the size of the swap space.  At first glance, changing the size of the volume does not automatically change the amount of available swap space.  That makes sense for expanding swap space, but if you reduce the size of the volume and the kernel doesn't notice, that sounds like it could be a problem.  Maybe I should file a bug.

Suggestions for things to try and ways to measure overhead and performance for this are welcomed.
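For the record, the manual dance I would expect to need when growing swap on a zvol, written from memory as a sketch rather than a tested transcript (volume names follow the example above):

# release the old swap device (needs enough free memory/other swap to absorb it)
swap -d /dev/zvol/dsk/p/swap
# grow the volume
zfs set volsize=4g p/swap
# add it back so the kernel sees the new size
swap -a /dev/zvol/dsk/p/swap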


Solaris

ZILs and ZAPs and ARCs - Oh My!

I just spent the last four days in a ZFS Internals TOI, given by George Wilson from RPE.  This just reinforces my belief that the folks who build OpenSolaris (and most any complex software product, actually) have a special gift.  How one can conceive of all of the various parts and pieces to bring together something as cool as OpenSolaris or ZFS or DTrace, etc., is beyond me.

By way of full disclosure, I ought to admit that the main thing I learned in graduate school and while working as a developer in a CO-OP job at IBM was that I hate development.  I am not cut out for it and have no patience for it.

Anyway, though, spending a week in the ZFS source actually helps you figure out how to best use the tool at a user level.  You see how things fit together, and this helps you figure out how to build solutions.  I got a ton of good ideas about some things that you might do with ZFS even without moving all of your data to ZFS.  Don't know whether they will pan out or not, but some ideas to play around with.  More about that later.

The same kind of thing applies for internals of the kernel.  Whether or not you are a kernel programmer, you can be a better developer and a better system administrator if you have a notion of how the pieces of the kernel fit together.  Sun Education is now offering a Solaris 10 Operating System internals class, previously only offered internally at Sun.  Since Solaris has been open-sourced, the internal Internals class is now an external Internals class!  If you have a chance, take this class!  I take it every couple of Solaris releases and never regret it.

But, mostly I want to say a special thanks to George Wilson and the RPE team for putting together a fantastic training event and for allowing me, from the SE / non-developer side of the house, to sit in and bask in the glow of those who actually make things for a living.


Everything Else

CEC Day 1

Got into SFO Sunday evening and went straightaway to the reception at the Hilton. It's always great to see the folks that you have worked with over the years and don't get to see very often. Networking is as important as anything else at these events. If social networking is important in a Web 2.0 world, it only got that way because social networking in our day to day life is how we get stuff accomplished.

Laura Ramsey and I hosted an OpenSolaris BOF at lunchtime. We had a pretty good crowd and had short "lightning talks" from a number of folks:

- Ienup Sung talked about Internationalization and Localization in OpenSolaris.
- Ken Drachnik talked about Glassfish.
- Jeff Savit talked about some cool stuff going on with ports of OpenSolaris to "alternative platforms." More on this as it is ready for prime time.
- Iwan Rahabok talked about Singanix and the OpenSolaris user group in Singapore.
- Bruno Gillet talked about how he uses OpenSolaris as a tool to demonstrate new features that will appear in Solaris and how important OpenSolaris can be to Sun's engineers as a day-to-day tool.

After lunch, the breakout sessions began, the real reason we come to CEC.

I heard Jim Mauro talk (and make himself tired in a mad dash through more slides than minutes) on Solaris POD - Performance, Observability, and Debugging - tools. I saw a talk about new features in Sun Cluster 3.2 that make upgrades of not only the cluster, but also the OS and application, easier and with less interruption in service.

I saw two good talks on ZFS. One by Detlef Drewanz and Constantin Gonzalez on how they use ZFS and some of the reliability metrics around various configurations of disks. And one by Roch Bourbonais about some of the ZFS implementation details. That one just whetted my appetite for a week-long deep dive into ZFS internals I hope to attend in December.

Now, it's time to start again. Andy B. is on tap today for the general session. I plan to hear Richard Elling talk about RAS for sure, but I don't know what else. Busy, Busy, Busy!!

Technorati: topic:[cec2006]


Everything Else

CEC 2006, Here I Come!

I'm on my way, like so many others at Sun, to CEC 2006 on Sunday. Sounds like there will be nearly 3000 Sun engineers and architects from around the world convening at the Moscone Center in San Francisco. This year is the first time that I have attended without presenting a paper. Maybe I'll get to see more presentations this way!

One of the highlights of CEC is the many BOFs - Birds of a Feather sessions. Laura Ramsey and I are hosting a BOF for OpenSolaris on Monday over lunch. The plan is to have several Lightning Talks - 5-8 minute, very brief presentations on a variety of topics. We've got Lightning Talks lined up on Security, Trusted Extensions, I18N & L10N, Glassfish, and a bunch of other stuff. Shame we have only about an hour for the meeting. If you are at CEC and are looking for a BOF to attend on Monday, try the OpenSolaris one!

Also like others, I am combining CEC with an Ambassador meeting, but for me it's OS Ambassadors rather than DC Ambassadors. 50 or so of us from around the world who focus on Solaris will get together with Solaris engineering and marketing. It's always a great meeting and a good time to see folks that you don't see very frequently. So, look for a few more blog entries here on things that I see that might be interesting to pass along.

Technorati Tags: cec2006


Solaris

I've built it. But now what?

ZFS on a box like the SunFire X4500 is way cool. But what if all you have are old, controller-based storage devices? George Wilson and I were wondering about that and thought it might be useful to do some experimentation down that line. So, we collected all of the currently unused storage in my lab and built a big ZFS farm. We've got a V480 with 7 T3B and 8 T3 bricks connected via Sanbox-1 switches, along with a couple of TB of IBM Shark storage recabled to be JBOD. I have a 3510 and maybe some Adaptec RAID storage that I can hook up eventually.

So, the server is up and running with 3 racks of storage, keeping the lab nice and toasty. Now what?!

What might be the best way to manage the T3s in a ZFS world? As a first pass, I split each brick into 2 RAID5 LUNs with a shared spare drive. But maybe I would be better off just creating a single stripe with no RAID in the T3 and letting ZFS handle the mirroring. On the other hand, I've had a number of disk errors (these are all really, really, really old) that the T3 fixed on its own without bothering the ZFS pool. Maybe RAID5 in the brick is the right approach. I could argue either way; a rough sketch of the two layouts is below.

Feel free to share your suggestions on what might be a good configuration here and why. I'm happy to test out several different approaches.
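For concreteness, the two layouts I'm weighing look roughly like this. The device names are placeholders rather than the actual LUNs in the lab, so treat this as a sketch:

# Option 1: RAID5 inside each T3 brick; ZFS just stripes across the exported LUNs
zpool create tank c4t1d0 c4t1d1 c5t1d0 c5t1d1

# Option 2: plain LUNs from the bricks; ZFS provides the redundancy by mirroring across bricks
zpool create tank mirror c4t1d0 c5t1d0 mirror c4t1d1 c5t1d1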


Solaris

Hotlanta Celebrates One Year of OpenSolaris

The Atlanta OpenSolaris User Group launched a bit of an early birthday celebration for our good friend, OpenSolaris, last night with a rousing meeting. George Wilson, from Sun's Revenue Products Engineering group, gave us an update on what's new in ZFS lately. I have to say that I am more and more impressed with the things that you can and will be able to do with ZFS. George and I were talking about how one might use promotion of cloned ZFS filesystems as a part of a Q/A and patching process, especially for zones sitting in a ZFS filesystem. I am not yet sure of exactly how all of this might work, but I think it has promise (a rough sketch of the idea is at the end of this post).

George also talked about using ZFS for the root filesystem and booting from a ZFS filesystem. Also very cool. Seems to me like this has a lot of benefits. You never will have to go through the pain of backing up and restoring a root drive to resize /var or /opt! Plus, you get the added safety and security of ZFS. Old-timers who want to see a root disk that looks like a simple disk may have to rethink things a little, but I think the added benefits will outweigh the effort of change.

After George's talk, I took the stage and talked about integrating Zones and ZFS. I'm pretty excited about this. On the one hand, being able to use ZFS to provide application data space to a zone allows the zone administrator to take on the responsibility of managing their own filesystems to fit their needs, without bothering the global platform administrator. On the other hand, using ZFS for the zoneroot, I can easily and quickly create new zones, cloning them from a master, using ZFS properties to keep them from stomping on one another. All very cool. I have to congratulate the whole ZFS team (and the Zones team).

I am looking forward to our next meeting - July 11 - when we will hear from Alok Aggarwal on NFSv4. We got a good list of suggested topics that should keep us going through the fall.
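A rough sketch of the clone-promotion idea, with made-up dataset names and with all of the zone halt/boot/attach details waved away; this is a thought experiment, not a tested procedure:

# freeze the current zone root and clone it for patching and testing
zfs snapshot tank/zones/web@prepatch
zfs clone tank/zones/web@prepatch tank/zones/web-patched
# ...apply and test the patches against the clone...
# if the patched copy checks out, promote it so it becomes the master
zfs promote tank/zones/web-patched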


Solaris

It's neat, but is it useful?

Sometimes weird ideas occur to me while I'm on airplanes. The other day, while flying to a customer engagement, I was thinking about the fact that customers often ask about how to manage usernames and passwords between the global zone and non-global zones in Solaris 10. Certainly, you can use a centrally managed solution such as LDAP or NIS, but many of these customers don't have anything like that. Moreover, they only have a few users on any particular system and want all of the users in the global zone to be known in the non-global zones as well.

So, this got me to thinking. What if we use loopback mounts for things like /etc/passwd and /etc/shadow? Hey, yeah! That's the ticket! That might work! If I make a read-only mount of these files, I bet I can access them in the non-global zone. If I make them read-only, they end up being managed from the global zone, and are less likely to be a security problem.

And what about /etc/hosts? Well, probably there's DNS, but not necessarily. I have customers who have 50,000+ line host files. They would love to share these, too. So, why not mount /etc/inet while we're at it?

Here's what I did. I have a zone called z4 whose zoneroot is located at /zones/z4. I had already created this zone previously, so I will just use zonecfg to make some modifications to the existing zone:

global# mv /zones/z4/root/etc/passwd /zones/z4/root/etc/passwd.safe
global# mv /zones/z4/root/etc/shadow /zones/z4/root/etc/shadow.safe
global# zonecfg -z z4
zonecfg:z4> add fs
zonecfg:z4:fs> set dir=/etc/passwd
zonecfg:z4:fs> set special=/etc/passwd
zonecfg:z4:fs> set type=lofs
zonecfg:z4:fs> add options [ro,nodevices]
zonecfg:z4:fs> end
zonecfg:z4> add fs
zonecfg:z4:fs> set dir=/etc/shadow
zonecfg:z4:fs> set special=/etc/shadow
zonecfg:z4:fs> set type=lofs
zonecfg:z4:fs> add options [ro,nodevices]
zonecfg:z4:fs> end
zonecfg:z4> add fs
zonecfg:z4:fs> set dir=/etc/inet
zonecfg:z4:fs> set special=/etc/inet
zonecfg:z4:fs> set type=lofs
zonecfg:z4:fs> add options [ro,nodevices]
zonecfg:z4:fs> end
zonecfg:z4> verify
zonecfg:z4> commit
zonecfg:z4> ^D

When I boot up the zone and take a look at what's mounted, I now see this:

# uname -a
SunOS z4 5.10 Generic_Patch i86pc i386 i86pc
# zonename
z4
# df -h
Filesystem             size   used  avail capacity  Mounted on
/                      5.9G   3.5G   2.3G    61%    /
/dev                   5.9G   3.5G   2.3G    61%    /dev
/etc/inet              5.9G   3.5G   2.3G    61%    /etc/inet
/etc/passwd            5.9G   3.5G   2.3G    61%    /etc/passwd
/etc/shadow            5.9G   3.5G   2.3G    61%    /etc/shadow
/lib                   5.9G   3.5G   2.3G    61%    /lib
/opt                   3.9G   1.6G   2.3G    42%    /opt
/platform              5.9G   3.5G   2.3G    61%    /platform
/sbin                  5.9G   3.5G   2.3G    61%    /sbin
/usr                   5.9G   3.5G   2.3G    61%    /usr
proc                     0K     0K     0K     0%    /proc
ctfs                     0K     0K     0K     0%    /system/contract
swap                   1.5G   240K   1.5G     1%    /etc/svc/volatile
mnttab                   0K     0K     0K     0%    /etc/mnttab
/usr/lib/libc/libc_hwcap2.so.1
                       5.9G   3.5G   2.3G    61%    /lib/libc.so.1
fd                       0K     0K     0K     0%    /dev/fd
swap                   1.5G     0K   1.5G     0%    /tmp
swap                   1.5G    16K   1.5G     1%    /var/run

Now, I can log directly into the zone using the same username and password as the global zone. This seems like it could be pretty cool. /etc/passwd, /etc/shadow, and /etc/inet are all mount points from the global zone.

I am not sure that it's really useful. What does anyone else think? Is this a technique that should be strongly discouraged? Or something that we need to document and encourage?

One thing that this makes me think of is a potential RFE for zonecfg. It would be nice to be able to somehow have an include operator, so that you can pull in common segments to be added to each zone configuration. But maybe the right way to do this is to just do this in a script (a sketch of what I mean is below).

Thoughts? Comments?
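If I were to go the script route, it might look something like this: a little ksh helper that feeds the same fs blocks to zonecfg for whatever zone you name. This is an untested sketch; the zone is assumed to already exist, and you would still move the zone's own copies of the files out of the way first.

#!/bin/ksh
# add-shared-etc: add read-only lofs mounts of shared /etc files to a zone (sketch)
ZONE=$1
for f in /etc/passwd /etc/shadow /etc/inet ; do
    zonecfg -z $ZONE <<EOF
add fs
set dir=$f
set special=$f
set type=lofs
add options [ro,nodevices]
end
commit
EOF
done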


Solaris

Atlanta OpenSolaris User Group - May 9

I can't believe that I've let things go so far away from me that my last post was in November. Here it is May already! Lots of news from ATLOSUG (Atlanta OpenSolaris User Group). Since November, we have had a couple of meetings, in January and March, and have moved around trying to find a better venue. The last meeting was a great overview of ZFS, given by George Wilson, one of the engineers involved in the port of ZFS from Nevada back to Solaris 10. This was the first meeting held at the Sun office in Alpharetta, Georgia. Had a great turnout. Lots of discussion and questions. We could have gone on for another hour or more. Expect to hear more from George on ZFS in the coming months.

The next meeting of ATLOSUG will be May 9 at 7PM at the Sun office. Check the ATLOSUG site for directions and details. Matrix Resources is sponsoring this meeting for us, and we thank them for their support (and for the refreshments!). Our topic for this meeting is BrandZ - running Linux applications in a Solaris zone. Expect to see some slides, and then a bunch of demos of how to build and install branded zones, running applications in zones, and then some cool interactions of zones and ZFS, zones and DTrace.

Another piece of news regarding the ATLOSUG: starting with the May meeting, we are altering the schedule to meet monthly rather than bimonthly. Seems like there is enough going on and enough people interested to keep us going at that pace. So, the meeting after that will be June 12, at the Sun office.

Regarding the location, admittedly, in the Atlanta area, meeting locations are a challenge. As a stand-alone user group, we need a meeting location that doesn't cost very much, is accessible in the evening, and is as convenient to some large portion of the city as possible. Meeting downtown or midtown is inconvenient to many folks on the north side. Meeting on the northside (at the Sun office, for example) makes attendance near impossible for ITP folks. Clearly, around the Perimeter is the best bet, but everything we have found so far is expensive. So, if you have an idea for a location on the top end of the perimeter that's cheap, accessible, and available, please let me know.

And to anyone in Atlanta, we look forward to seeing you on May 9!


Solaris

All I can say is Wow!

The Atlanta OpenSolaris User Group kicked off last night with just over 50 attendees! There were about 30 who had signed up beforehand, and I would have been happy with 20 for this first meeting. I was floored that we had SRO. All the food was gone; all the soda was gone; all the shirts were gone! The crowd braved the fierce Atlanta traffic to convene at the Crowne Plaza Ravinia hotel. Our future meetings will be held on campus at Georgia Tech, where we hope that students will get involved with OpenSolaris. As it turns out, the Atlanta Telecom Professionals were having their annual awards Gala at the same hotel, so it really was a case of braving the crowds and traffic.

But, just over 50 people turned out from all over town. Customers, iForce partners, recruiters, integrators, universities, commercial, Sun engineers - all sorts of folks.

As this was an organizational event, we talked about meeting mechanics, frequency, etc. As I said, our future meetings, held the 2nd Tuesday of odd-numbered months, will be in the Georgia Tech Student Center in mid-town Atlanta at 7:00PM, with networking and refreshments starting around 6:30. We're taking a lesson from the venerable Atlanta Unix Users Group and not trying to get complicated or fancy in our structure. Each meeting will include time for discussion, Q&A, and a presentation. We invite partners to sponsor meetings and help defray the cost of the refreshments, etc.

Our kickoff presentation was an overview of OpenSolaris. Much thanks to Jerry Jelinek, whose slides provided a lot of background. You can find a recap of the meeting with photos and the slides here.

I think we're off to a great start! We have sponsors fighting over who gets to sponsor upcoming meetings, and we have speakers volunteering for most of the next year already! We may have to meet more frequently to get the speakers in. Thanks so much to the folks who have been a great help, and will continue to be a great help - Crystal Nichols from Intelligent Technology Systems for covering logistics, and to George Wilson and Don Deal from Sun's Sustaining Engineering group for technical backup.

We'll see everyone at the next meeting on January 10!


Everything Else

Dixie is excited about Solaris 10!

It's been a long time since my last blog. I feel like I have been on the road now for months, travelling about the South, talking about Solaris. Everywhere I go, folks are excited about Solaris 10, OpenSolaris, and Solaris running on x86 / x64. After being an OS Ambassador at Sun for 10 years, it's finally fashionable to focus on the OS. And that's a lot of fun.

Many of us OS Ambassadors have been presenting roughly six hours of Solaris 10 technical overview at a series of Solaris 10 Boot Camps held across the country. If one is in your area, try to take it in. Even if you are already a Solaris junkie, this is a great way to meet other folks in your area also interested in S10. For the most part, these events are hosted by colleges and universities and held on-campus, but are open to the community at large. I've been doing these in Florida, so far, and plan to travel to Mississippi for one next week. Who would have thought that there would be excitement for Solaris 10 in Mississippi?! But, we have between 50 and 100 people signed up in Jackson, MS for our event. There are a couple of these events planned for the West Coast (San Diego and Santa Clara, I think), as well as Atlanta, in the next month or so.

On top of the Boot Camps, I have visited dozens of customers, both very, very large, and very, very small, and everything in-between. Even if the customer is not currently running Solaris, Solaris 10, especially running on x86 hardware, is something that *they call us to hear about*! I've visited customers in Georgia, Tennessee, Florida, Alabama, and Virginia in the last couple of months and the reception has always been the same - "This is way cool!"

It looks like my task for the next year will be focused around helping customers get Solaris 10 integrated into their environments. That's a pleasant task, to my way of thinking. It's times like this that continue to make me happy to be at Sun and happy to be associated with Solaris and the people who make Solaris possible.


Everything Else

CEC Final Morning

Monday morning was the final half-day of the Sun Customer Engineering Conference. We rounded out this morning with two breakout sessions and some time chatting with Scott McNealy.

The breakouts I made this morning were particularly good. I started with Liane Praza giving a talk about SMF for Administrators. SMF is one of the particularly powerful new sets of features in Solaris 10, and Liane gave a great presentation on how all the pieces fit together. I can see a lot of promise for ISVs integrating their application software with SMF for higher levels of availability. One question that came to mind is "what sort of applications would be well served by having their own custom delegated restarter?" One possible area I thought of would be telco network applications. These sorts of apps often require special processing and go to great lengths to provide very high levels of reliability. Maybe with a delegated restarter built around the particular types of transactions they handle, these core network apps could provide even higher levels of reliability.

My second breakout was another view of server consolidation, this one being a session on lessons learned through an internal project to move internal applications to consolidated environments using zones. One thing that comes out over and over is that no matter the server consolidation approach being used, planning and operational maturity are the key components to a successful deployment. One interesting comment from this session is that the group doing the deployment felt like they could make better progress and show positive ROI more quickly by approaching things in small, achievable chunks - 20-30 apps at a time, rather than a huge enterprise-wide analysis and deployment.

We finished out the morning with a presentation by Scott McNealy, with a pretty good question and answer session. Like Jonathan's, Scott's talk is always a highlight of this event. I believe that the senior executives at Sun really value the contribution, and understand the significance of the contribution, of the technology organizations in the field.

All in all, this was a great event. I'm definitely heading home with a big list of things to try to work on in my _copious spare time._ There are so many gems hidden down in Solaris that deserve attention so that I can share them with my customers. It's sort of like the guy who works at the hardware store. He has to know from his own experience the basics of what he sells, but he also has to know from study and listening to other customers what all of the other mysterious and arcane items he might have in stock do and how to use them.

Now, time to move to meeting two - OS Ambassadors for most of the rest of the week. That's always an exciting meeting. But for both of these, you end up tired! As invigorated as I always am after the meetings, I am also glad to get home!


Everything Else

CEC Day 2

Day Two of the CEC is Sunday, 2/27. Like other days, we begin with general sessions, but I missed the early ones. I "cut class" and went to church at Glide Memorial United Methodist Church. Great service and I am glad that I went. More about that later.

Finally got to the CEC in time to hear Andy Bechtolsheim and John Fowler's general session about where Sun is going with Opteron servers. David Yen, EVP of Sun's Scalable Systems Group, explained how CMT works and where Sun is going with our upcoming CMT systems.

Right after lunch, I had the second round of my BART presentation. Pretty good turnout, I think. Probably about 35 people. BART is one of those little gems in Solaris that people overlook.

After my talk, I caught several good talks this afternoon. The first was on the new way that Sun will distribute updates for Solaris 10. This looks to be a real improvement over the current tools and practices. The second talk was on metering and accounting resource usage for utility computing, a.k.a. (in this case) chargeback. The key here is extended accounting and its ability to report usage by task, project, or zone. Exacct is something that I have been intending to look more closely at for a while. Now, I think it's time to do that. The third talk was about the Fault Manager in Solaris 10, given by Mike Shapiro. The more I look at FMA and hear the plans for this, the more impressive this technology is.

One more half day tomorrow morning, finishing up with a visit with Scott McNealy. Last year at CEC, Scott (like Jonathan this year) was very open with us. I'm looking forward to that.

But, CEC is only the first part of the week for me. Tuesday to Thursday, the OS Ambassadors, a group of roughly 50 Solaris specialists worldwide, will meet for a short mini-conference. We are taking advantage of the fact that most of us are in town for CEC to catch up for a few days. So, it looks like a busy week, too.
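Since extended accounting keeps coming up, this is the hedged starting point I have in mind for when I finally sit down with it; the flags are from memory, so double-check them against the acctadm(1M) man page:

# turn on extended task accounting and send the records to a file
acctadm -e extended -f /var/adm/exacct/task task
# with no arguments, acctadm reports what is currently being collected
acctadm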


Everything Else

CEC Day 1

Each day at the CEC starts with general sessions, presentations from the leaders and executives at Sun. This gives all of us the opportunity to hear what the execs are saying to customers and what they are up to. This morning, we started out with our hosts, Dr. Jim Baty, CTO of the Client Solutions organization, and Hal Stern, CTO of Client Network Services. Next up, Bob MacRitchie, Executive VP of Global Sales, and Marissa Peterson, EVP for Client Network Services, gave us an update on the state of the union from their vantage point.

The highlight of the morning was a very open and candid question and answer session with Jonathan Schwartz. Someone wrote in a blog not too long ago, and I apologize for forgetting the author, that one of the really great things about being at Sun is the freedom to speak your mind and ask the tough questions up and down the line. The group of folks here at the CEC are the engineers who are with customers every day, who see the ultimate effects of decisions made at the top levels of the company. And this group is anything but shy. Jonathan gave us a very brief talk about where he sees Sun, what the company's priorities are, and where we are going. Then, in a move not many company presidents would make, he opened the floor to an hour of honest question and answer in a room filled with 3000 opinionated engineers.

Jonathan met every question head on. He didn't dodge or discount anyone's opinion. In fact, he often amplified the feelings expressed, saying that he had heard that same issue from other people, from other customers. And he validated people's feelings, letting us know that he has had many of the same frustrations as we in the field face. He talked honestly about what has gone into many difficult decisions over the last few years. He talked honestly about where we and our products have been and where we are going.

I've been at Sun for a long time, and it's things like this that keep me here and keep me excited and optimistic about being at Sun. I'm as opinionated as the next guy, and I have my own ideas about what's good and bad at Sun, but when our top executives talk to us honestly, it really makes me glad to be here.

Break Out Sessions

The big problem at the CEC is the fact that there are so many sessions that look like they will be really good and so little time. You have to pick and choose carefully. You have to move quickly and be aggressive to get into the most exciting talks. I tried to get into the talk on Sun's new Update Manager, the new mechanism for delivering patches and system updates. But every seat, every spot along the wall was filled, and another dozen people were crowded around the door trying to hear the talk. I guess I have to get the slides for that one.

I got to hear Claire Giordano talk about OpenSolaris and all that we are doing there. This is a talk I have heard before, but I find it valuable to hear the conversation around OpenSolaris. After all, in the open source world, it often is as much about the conversation as it is about the source. As soon as Claire's talk was over, the doors burst open and hundreds of folks rushed in to try to get seats. Turns out the next presentation in that room was Andy Bechtolsheim, Sun badge number 1. Everyone wants to hear what he is up to.

Next up was Grant Holland and Ed Turner, a couple of local Atlanta guys, talking about a project they have been working on called the Service Configuration and Deployment Engine.
This is a pretty cool effort to glue together packages and processes in the deployment and management of services and servers. I mostly wanted to see this since I've loaned a rack of gear in the Atlanta lab to Grant's team and I've wondered what they are doing. It's always fun to hear these guys. Grant is so wicked smart, and someone always asks a question that sets him thinking about a whole new set of opportunities, so it's always a good show.

The last talk I saw on Saturday was about the Solaris on x86 boot process. Like many sessions, this was packed. Everyone knows all about how SPARC systems boot, but many of us are just becoming familiar with the ins and outs of the BIOS and the x86 boot process.

For the last session of the day, it was time for me to give my talk on BART. I had a pretty good crowd, but certainly not SRO. BART is the Basic Audit Reporting Tool in Solaris 10. It's a simple tool that allows you to detect any changed files on a system.

Pretty full day for sure.
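For anyone who hasn't seen BART, the core of what I demo is just two commands; a quick sketch, with the paths here being only examples:

# take a baseline manifest of /etc, then another one later and compare the two
bart create -R /etc > /var/tmp/etc.manifest.1
# ...later, after something may have changed...
bart create -R /etc > /var/tmp/etc.manifest.2
bart compare /var/tmp/etc.manifest.1 /var/tmp/etc.manifest.2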


Everything Else

Customer Engineering Conference - Day 0

Arrived in San Francisco early in the day on Friday for CEC2005. This first day (well, zeroth day) of the CEC is really just to get folks here and get reacquainted with folks you've not seen in a while. We had a short session on "The State of the Union" where each of the organizations involved in the CEC had a chance to meet with its leadership and get an update on what's going on. For my group, Client Solutions, this was particularly interesting, being a new organization this year. At the last Network Computing launch event and Analyst Conference, Chris Ostertag, Sr. VP for Client Solutions, introduced this organization to the analyst community. CSO is the result of bringing together the field-based presales engineering team with the professional services delivery team. This is a pretty major and far-reaching undertaking. So, for the CSO State of the Union, each of the heads of the various disciplines gave a brief update and made sure it matched the voice coming from the trenches.

Spent the rest of the evening at the Welcome reception, reconnecting with folks. The best part of a big, worldwide event like this is getting to network with people that you don't often see otherwise. Seems like everyone I talked to is finding huge interest from their customers in Solaris 10. Lots of us who work in that space are moving from spending our time introducing Solaris 10 to doing more in-depth engagements to talk about how exactly a customer might move forward with a consolidation effort using zones, or might use DTrace to optimize an application, or might take advantage of the Service Management Facility for better application management on their systems.

Rounded out the day with a bit of planning for the rest of the weekend. Sessions kick off for real at 8:00 AM on Saturday morning. There are hundreds of breakout sessions given by my peers. One of the good things about the new Client Solutions organization is that new job roles help focus which sessions will be most useful. I'm now focused on Solaris and smaller servers. While I still think big servers are cool, I can skip those and narrow my focus a little. Even so, there are still far more sessions than I can attend. Luckily, I think we are recording sessions so I can catch them later on. At the very least, I can get the slides.

My tentative list of topics that I want to see today has been narrowed down to:

* Volume Server Performance Analysis
* OpenSolaris
* Sun's New Update Manager
* Compliance - Technical view of Lifecycle Management
* Reducing Service time with the Diagnostic Boot CD
* A Basic Introduction to Reliable Computing
* New & Upcoming Server Hardware
* DTrace - This is Clive King and Jon Haslam. Definitely don't want to miss it!
* The Solaris x86 boot process
* Transitioning NIS to LDAP and Issues
* Performance monitoring for Solaris on SPARC, Solaris on x86, and Linux
* Solaris 10 Service Management Facility
* Trusted Solaris - Simple, Powerful Security
* BART - Basic Audit Reporting Tool
* Report on Sun's internal POC with Zones and Solaris 10

From that big list, I think I get to pick 5 if I want to stay for whole sessions. More on what I see later.


Everything Else

Team Meeting in Newark This Week

I'm part of the Solaris x86 and Volume Server practice within US Client Solutions. You might have seen some of the other folks talking about CSO in their posts. This is a new organization this year, designed to bring together the technical folks in the field responsible for both pre-sales and delivery of solutions.

This team is made up of about 25 folks spread across the US. I've been part of distributed groups before, but never a national group like this. I've probably met less than half of my co-workers so far. This coming week, we will all get together at the Sun campus in Newark, CA for the first time. I'm looking forward to finally putting a face to the voices from our team calls.

It's funny. For every change like this new team and organization, there are always pluses and minuses. It's good to be part of a team focused around such cool technology as Solaris. But moving to a nationwide team rather than a local team does seem to force some extra effort to build a team out of a bunch of dispersed folks. This is one of those cases, I think, where this will work, though. We've got the coolest product going as our focus. I think the only downside here is that it's going to take more than the 25 of us to handle all of the projects that our customers will come up with.


Everything Else

Solaris Boy on Tour

Here at Sun, engineers in the field - pre-sales, professional services, even some support services - have specialty programs called Ambassadors.  I've been an OS Ambassador for about 10 years.  Every time a new Solaris release comes around, we get extra busy.  This year, I'm spending a huge amount of time as "Solaris Boy", giving presentations and workshops around the South on Solaris.  The last few weeks, I've talked in Birmingham, Memphis, Knoxville, Atlanta (lots of times), and Redmond, WA.  I'm planning a swing through Florida soon.  Seems like everyone wants to hear about Solaris.

Solaris Containers always gets the biggest share of the excitement.  It certainly is cool.  And simple.  Zones let you isolate your workloads for whatever reason you might have.  You might want to delegate administration, restrict access to parts of the system, or share the system between different organizations.  You choose.

But, in a world where there are so many things in life that are over-sold and under-delivered, it seems like parts of Solaris are under-sold and over-delivered.  That is, they are extremely powerful, but often overlooked.  Little things that make your life easier.  Like RBAC, or User Rights Management as it's called in Solaris 10.  RBAC is like sudo on steroids.  Give out root capabilities on a very restricted basis to whomever needs it, but don't give out the root password.  Combine it with the Solaris audit capability to track who does what.  In these days of Sarbanes-Oxley, this is a big deal.  No extra add-on products, just regular old Solaris.  And you don't have to look far to find more examples.

By the way, I am sure others have mentioned this already, but there is a nice, free, web-based training class from SunEd on New Features in the Solaris 10 Operating System.  This is cool!  SunEd training prior to the release of the OS.  There's also a 5-day instructor-led class on Solaris 10 system administration and a 2-day DTrace class.  Sign up early and often!
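To make the RBAC point a little more concrete, the basic shape of it looks something like this; the user, the stock rights profile, and the command are just examples, and the profile you actually hand out depends on what the person needs to do:

# give alice a stock rights profile instead of the root password
usermod -P "Network Management" alice
# alice then runs only the operations covered by that profile, through pfexec, e.g.:
pfexec ifconfig e1000g0 plumb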

