Tuesday Oct 28, 2014

Heating Up Your OpenStack Cloud

As part of the support updates to Solaris 11.2, we recently added the Heat orchestration engine to our OpenStack distribution.  If you aren't familiar with Heat, I highly recommend getting to know it, as you'll find it invaluable in deploying complex application topologies within an OpenStack cloud.  I've updated the script tarball from my recent series on building the Solaris engineering cloud to include configuration of Heat, so if you download that and update your cloud controller to the latest SRU, you can run havana_setup.py heat to turn it on.

OK, once you've done that, what can you do with Heat?  Well, to give you at least one idea, I've added a script to the tarball, along with the Heat template it uses.  The script, create_image, is similar to a script that we run to create custom Unified Archive images internally for the Solaris cloud.  The basic idea is to deploy an OpenStack instance using the standard archive that release engineering constructs for the product build, add some things we need to it, then save an image of that for the users of the cloud to use as a base deployment image.  I'd originally written a script to do this using the nova CLI, but using a Heat template simplified it.  The simple.hot file in the tarball is the template it uses; that template is a simpler version of a two-node template from the heat-templates repository.  It's fairly self-explanatory, so I'm not going to walk through it here.
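To give a feel for how a script can drive Heat, here's a minimal sketch (in Python, like the scripts in the tarball) of launching a stack from a template such as simple.hot with the Havana-era heat CLI and waiting for it to build.  The stack name and the parameter are made-up examples, not the actual arguments create_image uses.

# Minimal sketch: launch a Heat stack from a template and wait for it to
# finish building.  The stack name and parameter below are hypothetical.
import subprocess
import time

STACK = "image-build"
TEMPLATE = "simple.hot"

subprocess.check_call(["heat", "stack-create", STACK,
                       "-f", TEMPLATE,
                       "-P", "image_id=solaris-11-2-x86"])

# Poll until the stack reaches a terminal state.
while True:
    status = subprocess.Popen(["heat", "stack-show", STACK],
                              stdout=subprocess.PIPE).communicate()[0]
    if "CREATE_COMPLETE" in status:
        break
    if "CREATE_FAILED" in status:
        raise RuntimeError("stack %s failed to build" % STACK)
    time.sleep(15)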

As for create_image itself, the standard Solaris archive contains the packages in the solaris-minimal-server group, a pretty small package set that isn't too useful for much by itself, but makes a nice base for building images that include the specific things you need.  In our case, I've defined a group package that pulls in a bunch of things we typically use in Solaris development work: an SSH client, LDAP, NTP, Kerberos, the NFS client and automounter, the man command, and less.  Here's what the main part of the package manifest looks like:

depend fmri=/network/ssh type=group
depend fmri=group/system/solaris-minimal-server type=group
depend fmri=ldapcert type=group
depend fmri=naming/ldap type=group
depend fmri=security/nss-utilities type=group
depend fmri=service/network/ntp type=group
depend fmri=service/security/kerberos-5 type=group
depend fmri=system/file-system/autofs type=group
depend fmri=system/file-system/nfs type=group
depend fmri=system/network/nis type=group
depend fmri=text/doctools type=group
depend fmri=text/less type=group

In our case we bundle the package in a package archive file, copy it into the image with scp, and then install the group package.  Doing this saves our developers a few minutes in getting what they need deployed, and that's one easy way we can show them the value of using the cloud rather than our older lab infrastructure.  It's certainly possible to do much more interesting customizations than this, so experiment and share your ideas; we're looking to make Heat much more useful on Solaris OpenStack as we move ahead.  You can also talk to us at the OpenStack Summit in Paris next week; a number of us will be staffing the booth at various times when we're not in sessions at the design summit or the conference itself.
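To make the customization step above concrete, here's a hedged sketch of the flow: copy a package archive into the running instance, install the group package from it, then capture the result as a new image.  The instance name, address, archive path, and package name are placeholders, not the values from our internal script.

# Sketch of the post-deploy customization and image capture; all names and
# paths below are placeholders.
import subprocess

INSTANCE = "image-build-instance"       # hypothetical instance name
INSTANCE_IP = "10.134.12.50"            # its floating IP (placeholder)
ARCHIVE = "site-devtools.p5p"           # hypothetical package archive
GROUP_PKG = "group/site/devtools"       # hypothetical group package

# Copy the archive into the instance and install the group package from it.
subprocess.check_call(["scp", ARCHIVE, "root@%s:/var/tmp/" % INSTANCE_IP])
subprocess.check_call(["ssh", "root@%s" % INSTANCE_IP,
                       "pkg install -g /var/tmp/%s %s" % (ARCHIVE, GROUP_PKG)])

# Snapshot the customized instance into Glance as the new base image.
subprocess.check_call(["nova", "image-create", INSTANCE, "devcloud-base"])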

Oh, and for those who are interested, the Solaris development cloud is now up past 100 users and has 5 compute nodes deployed.  Still not large by any measure, but it's growing quickly and we're learning more about running OpenStack every day.

Friday Sep 19, 2014

Building an OpenStack Cloud for Solaris Engineering, Part 4

The prior parts of this series discussed the design and deployment of the undercloud nodes on which our cloud is implemented.  Now it's time to configure OpenStack and turn the cloud on.  Over on OTN, my colleague David Comay has published a general getting started guide that walks through a manual setup based on the OpenStack all-in-one Unified Archive; I recommend at least browsing through it for background that will come in handy as you deal with the inevitable issues that come up when running software as complex as OpenStack.  It's even better to run through that single-node setup to get some experience before moving on to building a multi-node cloud.

For our purposes, I needed to script the configuration of a multi-node cloud, and that makes everything more complex, not the least of the problems being that you can't just use the loopback IP address (127.0.0.1) as the endpoint for every service.  We had (compliments of my colleague Drew Fisher) a script for single-system configuration already, so I started with that and hacked away to build something that could configure each component correctly in a multi-node cloud.  That Python script, called havana_setup.py, and some associated scripts are available for download.  Here, I'll walk through the design and key pieces.

Pre-work

Before starting the OpenStack configuration proper, you'll need to run the gen_keys.py script to create some SSH keys.  These are used to secure the Solaris RAD (Remote Administration Daemon) transport that the Solaris Elastic Virtual Switch (EVS) controller uses to manage the networking between the Nova compute nodes and the Neutron controller node.  The script creates evsuser, neutron, and root sub-directories in whatever location you run it from; this location is referenced later in configuring the Neutron and Nova compute nodes, so you'll want to put it in a directory that's easily shared via NFS.  You can (and probably should) unshare it after the nodes are configured, though.
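If you're curious what gen_keys.py boils down to, it's essentially the loop below; the directory location and file layout inside each sub-directory are assumptions on my part, so treat this as a sketch rather than a copy of the script.

# Rough equivalent of gen_keys.py: create evsuser, neutron, and root
# sub-directories and generate a passphrase-less RSA key pair in each.
import os
import subprocess

KEYDIR = "/export/openstack/keys"   # pick somewhere easy to share via NFS

for user in ("evsuser", "neutron", "root"):
    subdir = os.path.join(KEYDIR, user)
    if not os.path.isdir(subdir):
        os.makedirs(subdir)
    keyfile = os.path.join(subdir, "id_rsa")
    if not os.path.exists(keyfile):
        # Empty passphrase so the RAD-over-ssh connections don't prompt
        subprocess.check_call(["ssh-keygen", "-t", "rsa", "-N", "",
                               "-f", keyfile])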

Global Configuration

The first part of havana_setup.py is a whole series of global declarations that parameterize the services deployed on the various nodes.  You'll note that the PRODUCTION variable can be set to control the layout used; if its value is False, you'll end up with a single-node deployment.  I have a couple of extra systems that I use for staging, and this makes it easy to replicate the configuration well enough to do some basic sanity testing before deploying changes.

MY_NAME = platform.node()
MY_IP = socket.gethostbyname(MY_NAME)

# When set to False, you end up with a single-node deployment
PRODUCTION = True

CONTROLLER_NODE = MY_NAME
if PRODUCTION:
    CONTROLLER_NODE = "controller.example.com"

DB_NODE = CONTROLLER_NODE
KEYSTONE_NODE = CONTROLLER_NODE
GLANCE_NODE = CONTROLLER_NODE
CINDER_NODE = CONTROLLER_NODE
NEUTRON_NODE = CONTROLLER_NODE
RABBIT_NODE = CONTROLLER_NODE
HEAT_NODE = CONTROLLER_NODE

if PRODUCTION:
    GLANCE_NODE = "glance.example.com"
    CINDER_NODE = "cinder.example.com"
    NEUTRON_NODE = "neutron.example.com"

Next, we configure the main security elements, the root password for MySQL plus passwords and access tokens for Keystone, along with the URL's that we'll need to configure into the other services to connect them to Keystone.

SERVICE_TOKEN = "TOKEN"
MYSQL_ROOTPW = "mysqlroot"
ADMIN_PASSWORD = "adminpw"
SERVICE_PASSWORD = "servicepw"

AUTH_URL = "http://%s:5000/v2.0/" % KEYSTONE_NODE
IDENTITY_URL = "http://%s:35357" % KEYSTONE_NODE

The remainder of this section configures specifics of Glance, Cinder,  Neutron, and Horizon.  For Glance and Cinder, we provide the name of the base ZFS dataset that each will be using.  For Neutron, the NIC, VLAN tag, and external network addresses, as well as the subnets for each of the two tenants we are providing in our cloud.  We chose to have one tenant for developers in the organization that is funding this cloud, and a second tenant for other Oracle employees who want to experiment with OpenStack on Solaris; this gives us a way to grossly allocate resources between the two, and of course most go to the tenant paying the bill.  The last element of each tuple in the tenant network list is the number of floating IP addresses to set as the quota for the tenant.  For Horizon, the paths to a server certificate and key must be configured, but only if you're using TLS, and that's only the case if the script is run with PRODUCTION = True.  The SSH_KEYDIR should be set to the location where you ran the gen_keys.py script, above.

GLANCE_DATASET = "tank/glance"
CINDER_DATASET = "tank/cinder"

UPLINK_PORT = "aggr0"
if PRODUCTION:
    VXLAN_RANGE = "500-600"
    TENANT_NET_LIST = [("tenant1", "192.168.66.0/24", 10),
                       ("tenant2", "192.168.67.0/24", 60)]
else:
    VXLAN_RANGE = "400-499"
    TENANT_NET_LIST = [("tenant1", "192.168.70.0/24", 5), 
                       ("tenant2", "192.168.71.0/24", 5)]

EXTERNAL_GATEWAY = "10.134.12.1"
EXTERNAL_NETWORK_ADDR = "10.134.12.0/24"
EXTERNAL_NETWORK_VLAN_TAG = "12"
EXTERNAL_NETWORK_NAME = "external"

SERVER_CERT = "/path/to/horizon.crt"
SERVER_KEY = "/path/to/horizon.key"

SSH_KEYDIR = "/path/to/generated/keys"

Configuring the Nodes

The remainder of havana_setup.py is a series of functions that configure each element of the cloud.  You select which element(s) to configure by specifying command-line arguments.  Valid values are mysql, keystone, glance, cinder, nova-controller, neutron, nova-compute, and horizon.  I'll briefly explain what each does below.  One thing to note is that each function first creates a backup boot environment so that if something goes wrong, you can easily revert to the state of the system prior to running the script.  This is a practice you should always use in Solaris administration before making any system configuration changes.  It also saved me a ton of time in development of the cloud, since I could reset within a minute or so every time I had a serious bug.  Even our best re-deployment times with AI and archives are about 10 times that when you have to cycle through network booting.
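The boot environment safety net is only a couple of lines at the start of each function; here's a sketch of the idea (the BE naming convention is mine, not necessarily what the script uses).

# Create a backup boot environment before touching any configuration, so a
# bad run can be rolled back with beadm activate plus a reboot.
import subprocess
import time

def create_backup_be(tag):
    be_name = "%s-backup-%s" % (tag, time.strftime("%Y%m%d%H%M%S"))
    subprocess.check_call(["beadm", "create", be_name])

create_backup_be("mysql")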

mysql

MySQL must be the first piece configured, since all of the OpenStack services use databases to store at least some of their objects.  This function sets the root password and removes some insecure aspects of the default MySQL configuration.  One key piece is that it removes remote root access; that forces us to create all of the databases in this module, rather than creating each component's database in its associated module.  There may be a better way to do this, but since I'm not a MySQL expert in any way, that was the easiest path here.  On review, it seems like enabling the mysql SMF service should really be moved into the Puppet manifest from part 3.
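The database creation itself is just SQL issued as root; here's a rough sketch of what that looks like for each service, with placeholder passwords standing in for the globals shown earlier (the real script may differ in details).

# Create each service's database and grant its account access from any
# host, since the services run on several different nodes.
import subprocess

MYSQL_ROOTPW = "mysqlroot"
SERVICE_PASSWORD = "servicepw"
SERVICES = ["keystone", "glance", "cinder", "nova", "neutron", "heat"]

for svc in SERVICES:
    sql = ("CREATE DATABASE IF NOT EXISTS %(db)s; "
           "GRANT ALL PRIVILEGES ON %(db)s.* TO '%(db)s'@'%%' "
           "IDENTIFIED BY '%(pw)s';" % {"db": svc, "pw": SERVICE_PASSWORD})
    subprocess.check_call(["mysql", "-u", "root", "-p%s" % MYSQL_ROOTPW,
                           "-e", sql])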

keystone

The keystone function does some basic configuration, then calls the /usr/demo/openstack/keystone/sample_data.sh script to configure users, tenants, and endpoints.  In our deployment I've customized this script a bit to create the two tenants rather than just one, so you may need to make some adjustments for your site; I have not included that customization in the downloaded files.
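The two-tenant customization is small enough to sketch here: after sample_data.sh runs, it's just another keystone CLI call using the service token and identity endpoint from the globals.  The tenant name and description below are examples.

# Add the second tenant alongside the one sample_data.sh creates.
import subprocess

SERVICE_TOKEN = "TOKEN"
IDENTITY_ENDPOINT = "http://controller.example.com:35357/v2.0"

subprocess.check_call(["keystone",
                       "--os-token", SERVICE_TOKEN,
                       "--os-endpoint", IDENTITY_ENDPOINT,
                       "tenant-create", "--name", "tenant2",
                       "--description", "OpenStack trial tenant"])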

glance

The glance function configures and starts the various glance services, and also creates the base ZFS dataset for image storage; we turn compression on to save space for all the images we'll have here.  If you're rolling back and re-running for some reason, note that this module isn't quite idempotent as written: it doesn't handle the case where the dataset already exists, so you'd need to use zfs destroy to delete the glance dataset first.
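If you want to patch that up locally, making the dataset creation idempotent only takes a guard like the one below, using the GLANCE_DATASET global.

# Create the glance dataset with compression only if it doesn't already
# exist, so the function can be re-run safely after a rollback.
import os
import subprocess

GLANCE_DATASET = "tank/glance"

def dataset_exists(name):
    # zfs list exits non-zero when the dataset doesn't exist
    with open(os.devnull, "w") as null:
        return subprocess.call(["zfs", "list", name],
                               stdout=null, stderr=null) == 0

if not dataset_exists(GLANCE_DATASET):
    subprocess.check_call(["zfs", "create", "-p", "-o", "compression=on",
                           GLANCE_DATASET])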

cinder

Beyond just the basic configuration of the cinder services, the cinder function also creates the base ZFS dataset under which all of the volumes will be created.  We create this as an encrypted dataset so that all of the volumes will be encrypted, something Darren Moffat covers at more length in OpenStack Cinder Volume encryption with ZFS.  Here we use pktool to generate the wrapping key and store it in root's home directory.  One piece of work we haven't yet had time to take on is adding our ZFS Storage Appliance as an additional back end for Cinder; I'll post an update to cover that once we get it done.  Like the glance function, this function doesn't deal properly with the dataset already existing, so any rollback also needs to destroy the base dataset by hand.
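The key generation and dataset creation look roughly like the following; the key file path and key length here are my assumptions, so check Darren's post and the script itself for the exact values.

# Generate a raw AES wrapping key with pktool and create the cinder base
# dataset encrypted with it; child volumes inherit the encryption settings.
import os
import subprocess

CINDER_DATASET = "tank/cinder"
KEYFILE = "/root/cinder_wrapping_key"   # location is an assumption

if not os.path.exists(KEYFILE):
    subprocess.check_call(["pktool", "genkey", "keystore=file",
                           "outkey=%s" % KEYFILE,
                           "keytype=aes", "keylen=256"])
    os.chmod(KEYFILE, 0o600)

subprocess.check_call(["zfs", "create", "-p",
                       "-o", "encryption=on",
                       "-o", "keysource=raw,file://%s" % KEYFILE,
                       CINDER_DATASET])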

nova_controller & nova_compute

Since our deployment runs the nova controller services separately from the compute nodes, the nova_controller function is run on the controller node to set up the API, scheduler, and conductor services.  If you combine the controller and compute roles on one node, you would run this first and then later run the nova_compute function.  The nova_compute function also makes use of a couple of helper functions to set up the ssh configuration for EVS.  For these functions to work properly, you must run the neutron function on its designated node before running nova_compute on the compute nodes.
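As a very rough illustration of what those ssh helpers are doing on a compute node, the pre-generated root key pair from the shared key directory ends up in root's ~/.ssh so the RAD-over-ssh connection to the EVS controller can authenticate.  The paths below are assumptions, and the real helpers in havana_setup.py handle more than this sketch shows.

# Install the pre-generated root key pair from the shared SSH_KEYDIR into
# root's ~/.ssh on this compute node.  Paths are assumptions.
import os
import shutil

SSH_KEYDIR = "/net/controller/export/openstack/keys"
DSTDIR = "/root/.ssh"

if not os.path.isdir(DSTDIR):
    os.makedirs(DSTDIR, 0o700)
for name in ("id_rsa", "id_rsa.pub"):
    shutil.copy(os.path.join(SSH_KEYDIR, "root", name),
                os.path.join(DSTDIR, name))
os.chmod(os.path.join(DSTDIR, "id_rsa"), 0o600)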

neutron

The neutron setup function is by far the most complex, as it not only configures the neutron services, including the underlying EVS and RAD pieces, but also configures the external network and the tenant networks.  The external network is configured as a tagged VLAN, while the tenant networks are configured as VxLANs; you can certainly use VLANs or VxLANs for all of them, but this configuration was the most convenient for our environment.
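To make the external-network piece concrete, here's roughly what that part of the function does, expressed as Havana-era neutron CLI calls using the EXTERNAL_* globals shown earlier; treat it as a sketch rather than a copy of the script.

# Create the external provider network on the tagged VLAN and give it a
# subnet with DHCP disabled, since its addresses are handed out as
# floating IP's.
import subprocess

EXTERNAL_NETWORK_NAME = "external"
EXTERNAL_NETWORK_VLAN_TAG = "12"
EXTERNAL_NETWORK_ADDR = "10.134.12.0/24"
EXTERNAL_GATEWAY = "10.134.12.1"

subprocess.check_call(["neutron", "net-create", EXTERNAL_NETWORK_NAME,
                       "--router:external=True",
                       "--provider:network_type=vlan",
                       "--provider:segmentation_id=%s" %
                       EXTERNAL_NETWORK_VLAN_TAG])

subprocess.check_call(["neutron", "subnet-create",
                       "--name", "%s_subnet" % EXTERNAL_NETWORK_NAME,
                       "--disable-dhcp",
                       "--gateway", EXTERNAL_GATEWAY,
                       EXTERNAL_NETWORK_NAME, EXTERNAL_NETWORK_ADDR])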

horizon

For the production case, the horizon function just copies into place an Apache config file that configures TLS support for the Horizon dashboard and the server's certificate and key files.  If you're using self-signed certificates, then the Apache SSL/TLS Strong Encryption: FAQ is a good reference on how to create them.  For the non-production case, this function just comments out the pieces of the dashboard's local settings that enable SSL/TLS support.

Getting Started

Once you've run through all of the above functions from havana_setup.py, you have a cloud; pointing your web browser at http://<your server>/horizon should display the login page, where you can log in as the admin user with the password you configured in the global settings of havana_setup.py.

Assuming that works, your next step should be to upload an image.  The easiest way to start is by downloading the Solaris 11.2 Unified Archives.  Once you have an archive, the upload can be done from the Horizon dashboard, but you'll find it easier to use the upload_image script that I've included in the download.  You'll need to edit the environment variables it sets first, but it takes care of setting several properties on the image that are required by the Solaris Zones driver for Nova to properly handle deploying instances.  Failure to set them is the single most common mistake that I and others have made in the early Solaris OpenStack deployments; when you forget and attempt to launch an instance, you'll get an immediate error, and the details from nova show will include the error:

| fault | {"message": "No valid host was found. ", "code": 500, "details": "  File \"/usr/lib/python2.6/vendor-packages/nova/scheduler/filter_scheduler.py\", line 107, in schedule_run_instance |

When you snapshot a deployed instance with Horizon or nova image-create, the archive properties will be set properly, so it's only manual uploads in Horizon or with the glance command that need care.
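If you do upload by hand, the important part is the set of image properties; the ones below reflect my understanding of what the Zones driver checks (architecture, hypervisor type, and VM mode), but treat the upload_image script in the download as the authoritative list.

# Sketch of a manual Unified Archive upload; the archive path is an example
# and the property list should be checked against upload_image.
import subprocess

ARCHIVE = "/export/archives/sol-11_2-x86.uar"

subprocess.check_call(["glance", "image-create",
                       "--name", "Solaris 11.2 x86",
                       "--container-format", "bare",
                       "--disk-format", "raw",
                       "--is-public", "true",
                       "--property", "architecture=x86_64",
                       "--property", "hypervisor_type=solariszones",
                       "--property", "vm_mode=solariszones",
                       "--file", ARCHIVE])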

There's one more preparation task to do: upload an ssh public key that'll be used to access your instances. Select Access & Security from the list in the left panel of the Horizon Dashboard, then select the Keypairs tab, and click Import Keypair.  You'll want to paste the contents of your ~/.ssh/id_rsa.pub into the Public Key field, and probably name your keypair the same as your username.

Finally, you are ready to launch instances.  Select Instances in the Horizon Dashboard's left panel, then click the Launch Instance button.  Enter a name for the instance, select the Flavor, select Boot from image as the Instance Boot Source, and select the image to use in deploying the VM.  The image determines whether you get a SPARC or x86 VM and what software it includes, while the flavor determines whether it is a kernel zone or non-global zone, as well as the number of virtual CPUs and the amount of memory.  The Access & Security tab should default to selecting your uploaded keypair, but you must go to the Networking tab and select a network for the instance.  Then click Launch, and the VM will be installed; you can follow progress by clicking on the instance name to see details and selecting the Log tab.  Installation takes a few minutes at present; in the meantime, you can Associate a Floating IP from the Actions field.  Pick any address from the list offered; your instance will not be reachable until you've done this.

Once the instance has finished installing and reached the Active status, you can log in to it.  To do so, use ssh root@<floating-ip-address>, which will log you in to the zone as root using the key you uploaded above.  If that all works, congratulations, you have a functioning OpenStack cloud on Solaris!

In future posts I'll cover additional tips and tricks we've learned in operating our cloud.  At this writing we have over 60 users and are growing steadily, and the cloud has been totally reliable for over 3 months, with the only outages being for updates to the infrastructure.


Tuesday Sep 16, 2014

Building an OpenStack Cloud for Solaris Engineering, Part 3

At the end of Part 2, we built the infrastructure needed to deploy the undercloud systems into our network environment.  However, there's more configuration needed on these systems than we can completely express via Automated Installation, and there's also the issue of how to effectively maintain the undercloud systems.  We're only running a half dozen initially, but expect to add many more as we grow, and even at this scale it's still too much work, with too high a probability of mistakes, to do things by hand on each system.  That's where a configuration management system such as Puppet shows its value, providing us the ability to define a desired state for many aspects of many systems and have Puppet ensure that state is maintained.  My team did a lot of work to include Puppet in Solaris 11.2 and extend it to manage most of the major subsystems in Solaris, so the OpenStack cloud deployment was a great opportunity to start working with another shiny new toy.

Configuring the Puppet Master

One feature of the Puppet integration with Solaris is that the Puppet configuration is expressed in SMF, and then translated by the new SMF Stencils feature to settings in the usual /etc/puppet/puppet.conf file.  This makes it possible to configure Puppet using SMF profiles at deployment time, and the examples in Part 2 showed this for the clients.  For the master, we apply the profile below:

<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<!--
  This profile configures the Puppet master
-->
<service_bundle type="profile" name="puppet">
  <service version="1" type="service" name="application/puppet">
    <instance enabled="true" name="master">
      <property_group type="application" name="config">
        <propval name="server" value="puppetmaster.example.com"/>
        <propval name="autosign" value="/etc/puppet/autosign.conf"/>
      </property_group>
    </instance>
  </service>
</service_bundle>

The interesting setting is the autosign configuration, which allows new clients to have their certificates automatically signed and accepted by the Puppet master.  This isn't strictly necessary, but makes operation a little easier when you have a reasonably secure network and you're not handing out any sensitive configuration via Puppet.  We use an autosign.conf that looks something like:

*.example.com

This means that we're accepting any system that identifies as being in the example.com domain.  The main pain with autosigning is that if you reinstall any of the systems and you're using self-generated certificates on the clients, you need to clean out the old certificate before the new one will be accepted; this means issuing a command on the master like:

# puppet cert clean client.example.com

There are lots of options in Puppet related to certificates and more sophisticated ways to manage them, but this is what we're doing for now.  We have filed some enhancement requests to implement ways of integrating Puppet client certificate delivery and signing with Automated Installation, which would make using the two together much more convenient.

Writing the Puppet Manifests

Next, we set up a small Mercurial source repository to store the Puppet manifests and modules.  Using a source control system with Puppet is a highly recommended practice, and Mercurial happens to be the one we use for Solaris development, so it was the natural choice for us.  We configure /etc/puppet on the Puppet master as a child repository of the main Mercurial repository, so when we have new configuration to apply, it's first checked into the main repository and then pulled into Puppet via hg pull -u, after which it's automatically applied as each client polls the master.  Our repository presently contains the following:

./manifests
./manifests/site.pp
./modules
./modules/nameservice
./modules/nameservice/manifests
./modules/nameservice/manifests/init.pp
./modules/nameservice/files
./modules/nameservice/files/prof_attr-zlogin
./modules/nameservice/files/user_attr
./modules/nameservice/files/policy.conf
./modules/nameservice/files/exec_attr-zlogin
./modules/ntp
./modules/ntp/manifests
./modules/ntp/manifests/init.pp
./modules/ntp/files
./modules/ntp/files/ntp.conf

An example tar file with all of the above is available for download.

The site manifest starts with:

include ntp
include nameservice

The ntp module is the canonical example of Puppet, and is really important for the OpenStack undercloud, as it's necessary for the various nodes to have a consistent view of time in order for the security certificates issued by Keystone to be validated properly.  I'll describe the nameservice module a little later in this post.

Since most of our nodes are configured identically, we can use a default node definition to configure them.  The main piece is configuring Datalink Multipathing (DLMP), which provides us additional bandwidth and higher availability than a single link.  We can't yet configure this using SMF, so the Puppet manifest:

  • Figures out the IP address the system is using with some embedded Ruby
  • Removes the net0 link and creates a link aggregation from net0 and net1
  • Enables active probing on the link aggregation, so that it can detect upstream failures on the switches that don't affect link-state signaling (link-state signaling is still used as well, and is the only failure-detection mechanism unless probing is enabled)
  • Configures an IP interface and the same address on the new aggregation link
  • Restricts Ethernet autonegotiation to 1 Gb to work around issues we have with these systems and the switches/cabling we're using in the lab; without this, we get 100 Mb speeds negotiated about 50% of the time, and that kills performance.
You'll note several uses of the require and before statements to ensure the rules are applied in the proper order, as we need to tear down the net0 IP interface before it can be moved into the aggregation, and the aggregation needs to be configured before the IP objects on top of it.

node default {

    $myip = inline_template("<% _erbout.concat(Resolv::DNS.open.getaddress('$fqdn').to_s) %>")

    # Force link speed negotiation to be at least 1 Gb
    link_properties { "net0":
        ensure => present,
        properties => { en_100fdx_cap => "0" },
    }

    link_properties { "net1":
        ensure => present,
        properties => { en_100fdx_cap => "0" },
    }

    link_aggregation { "aggr0":
        ensure => present,
        lower_links => [ 'net0', 'net1' ],
        mode => "dlmp",
    }

    link_properties { "aggr0":
        ensure => present,
        require => Link_aggregation['aggr0'],
        properties => { probe-ip => "+" },
    }

    ip_interface { "aggr0":
        ensure => present,
        require => Link_aggregation['aggr0'],
    }

    ip_interface { "net0":
        ensure => absent,
        before => Link_aggregation['aggr0'],
    }

    address_object { "net0":
        ensure => absent,
        before => Ip_interface['net0'],
    }

    address_object { 'aggr0/v4':
        require => Ip_interface['aggr0'],
        ensure => present,
        address => "${myip}/24",
        address_type => "static",
        enable => "true",
    }
}

The controller node declaration includes all of the above functionality, but also adds these elements to keep rabbitmq running and install the mysql database.

    service { "application/rabbitmq" :
        ensure => running,
    }
    
    package { "database/mysql-55":
        ensure => installed,
    }

The database installation could have been part of the AI derived manifest as well, but it works just as well here and it's convenient to do it this way when I'm setting up staging systems to test builds before we upgrade.

The nameservice Puppet module is shown below.  It's handling both nameservice and RBAC (Role-based Access Control) configuration:

class nameservice {

    dns { "openstack_dns":
        search => [ 'example.com' ],
        nameserver => [ '10.2.3.4', '10.6.7.8' ],
    }

    service { "dns/client":
        ensure => running,
    }

    svccfg { "domainname":
        ensure => present,
        fmri => "svc:/network/nis/domain",
        property => "config/domainname",
        type => "hostname",
        value => "example.com",
    }

    # nameservice switch
    nsswitch { "dns + ldap":
        default => "files",
        host =>  "files dns",
        password => "files ldap",
        group => "files ldap",
        automount => "files ldap",
        netgroup => "ldap",
    }

    # Set user_attr for administrative accounts
    file { "user_attr" :
        path => "/etc/user_attr.d/site-openstack",
        owner => "root",
        group => "sys",
        mode => 644,
        source => "puppet:///modules/nameservice/user_attr",
    }

    # Configure zlogin access
    file { "site-zlogin" :
        path => "/etc/security/prof_attr.d/site-zlogin",
        owner => "root",
        group => "sys",
        mode => 644,
        source => "puppet:///modules/nameservice/prof_attr-zlogin",
    }

    file { "zlogin-exec" :
        path => "/etc/security/exec_attr.d/site-zlogin",
        owner => "root",
        group => "sys",
        mode => 644,
        source => "puppet:///modules/nameservice/exec_attr-zlogin",
    }

    file { "policy.conf" :
        path => "/etc/security/policy.conf",
        owner => "root",
        group => "sys",
        mode => 644,
        source => "puppet:///modules/nameservice/policy.conf",
    }
}

You may notice that the nameservice configuration here is exactly the same as what we provided in the SMF profile in part 2.  We include it here because it's configuration we anticipate changing someday and we won't want to re-deploy the nodes.  There are ways we could prevent the duplication, but we didn't have time to spend on it right now and it also demonstrates that you could use a completely different configuration in operation than at deployment/staging time.

What's with the RBAC configuration?

The RBAC configuration is doing two things, the first of which is configuring the cloud administrators' user accounts for administrative access on the cloud nodes.  The user_attr file we're distributing confers the System Administrator and OpenStack Management profiles, as well as access to the root role (oadmin is just an example account in this case):

oadmin::::profiles=System Administrator,OpenStack Management;roles=root

As we add administrators, I just need to add entries for them to the above file and they get the required access to all of the nodes.  Note that this doesn't directly provide administrative access to OpenStack's CLI's or its Dashboard, that's configured within OpenStack.

A limitation of the OpenStack software we include in Solaris 11.2 is that we don't provide the ability to connect to the guest instance consoles, an important feature that's being worked on.  The zlogin User profile is something I created to work around this problem and allow our cloud users to get access to the consoles, as this is often needed in Solaris development and testing.  First, the profile is defined by a prof_attr file with the entry:

zlogin User:::Use zlogin:auths=solaris.zone.manage

We also need an exec_attr file to ensure that zlogin is run with the needed uid and privileges:

 zlogin User:solaris:cmd:RO::/usr/sbin/zlogin:euid=0;privs=ALL

Finally, we modify the RBAC policy file so that all users are assigned to the zlogin User profile:

PROFS_GRANTED=zlogin User,Basic Solaris User

The result of all this is that a user can obtain access to their specific OpenStack guest instance by logging in to the compute node on which the guest is running and running a command such as:

$ pfexec zlogin -C instance-0000abcd

At this point we have the undercloud nodes fully configured to support our OpenStack deployment.  In part 4, we'll look at the scripts used to configure OpenStack itself.


Friday Aug 22, 2014

Building an OpenStack Cloud for Solaris Engineering, Part 1

One of the signature features of the recently-released Solaris 11.2 is the OpenStack cloud computing platform.  Over on the Solaris OpenStack blog the development team is publishing lots of details about our version of OpenStack Havana as well as some tips on specific features, and I highly recommend reading those to get a feel for how we've leveraged Solaris's features to build a top-notch cloud platform.  In this and some subsequent posts I'm going to look at it from a different perspective, which is that of the enterprise administrator deploying an OpenStack cloud.  But this won't be just a theoretical perspective: I've spent the past several months putting together a deployment of OpenStack for use by the Solaris engineering organization, and now that it's in production we'll share how we built it and what we've learned so far.

In the Solaris engineering organization we've long had dedicated lab systems dispersed among our various sites and a home-grown reservation tool for developers to reserve those systems; various teams also have private systems for specific testing purposes.  But as a developer, it can still be difficult to find systems you need, especially since most Solaris changes require testing on both SPARC and x86 systems before they can be integrated.  We've added virtual resources over the years as well in the form of LDOMs and zones (both traditional non-global zones and the new kernel zones).  Fundamentally, though, these were all still deployed in the same model: our overworked lab administrators set up pre-configured resources and we then reserve them.  Sounds like pretty much every traditional IT shop, right?  Which means that there's a lot of opportunity for efficiencies from greater use of virtualization and the self-service style of cloud computing.  As we were well into development of OpenStack on Solaris, I was recruited to figure out how we could deploy it to both provide more (and more efficient) development and test resources for the organization as well as a test environment for Solaris OpenStack.

At this point, let's acknowledge one fact: deploying OpenStack is hard.  It's a very complex piece of software that makes use of sophisticated networking features and runs as a ton of service daemons with myriad configuration files.  The web UI, Horizon, doesn't often do a good job of providing detailed errors.  Even the command-line clients are not as transparent as you'd like, though at least you can turn on verbose and debug messaging and often get some clues as to what to look for; it helps if you're good at reading JSON structure dumps.  I'd already learned all of this in doing a single-system Grizzly-on-Linux deployment for the development team to reference when they were getting started, so I at least came to this job with some appreciation for what I was taking on.  The good news is that both we and the community have done a lot to make deployment much easier in the last year; probably the easiest approach is to download the OpenStack Unified Archive from OTN to get your hands on a single-system demonstration environment.  I highly recommend getting started with something like it to get some understanding of OpenStack before you embark on a more complex deployment.  For some situations, it may in fact be all you ever need.  If so, you don't need to read the rest of this series of posts!

In the Solaris engineering case, we need a lot more horsepower than a single-system cloud can provide.  We need to support both SPARC and x86 VM's, and we have hundreds of developers so we want to be able to scale to support thousands of VM's, though we're going to build to that scale over time, not immediately.  We also want to be able to test both Solaris 11 updates and a release such as Solaris 12 that's under development so that we can work out any upgrade issues before release.  One thing we don't have is a requirement for extremely high availability, at least at this point.  We surely don't want a lot of down time, but we can tolerate scheduled outages and brief (as in an hour or so) unscheduled ones.  Thus I didn't need to spend effort on trying to get high availability everywhere.

The diagram below shows our initial deployment design.  We're using six systems, most of which are x86 because we had more of those immediately available.  All of those systems reside on a management VLAN and are connected with a two-way link aggregation of 1 Gb links (we don't yet have 10 Gb switching infrastructure in place, but we'll get there).  A separate VLAN provides "public" (as in connected to the rest of Oracle's internal network) addresses, while we use VxLANs for the tenant networks.

[Diagram: initial deployment design for the Solaris engineering cloud]

One system is more or less the control node, providing the MySQL database, RabbitMQ, Keystone, and the Nova API and scheduler as well as the Horizon console.  We're curious how this will perform and I anticipate eventually splitting at least the database off to another node to help simplify upgrades, but at our present scale this works.

I had a couple of systems with lots of disk space; one was already configured as the Automated Installation server for the lab, so it's also providing the Glance image repository for OpenStack.  The other node with lots of disks provides the Cinder block storage service; we also have a ZFS Storage Appliance that will help back-end Cinder in the near future, but I just haven't had time to get it configured yet.

There's a separate system for Neutron, which hosts our Elastic Virtual Switch controller and handles the routing and NAT for the guests.  We don't have any need for firewalling in this deployment, so we're not doing any.  We presently have only two tenants defined: one for the Solaris organization that's funding this cloud, and a separate tenant for other Oracle organizations that would like to try out OpenStack on Solaris.  Each tenant has one VxLAN defined initially, but we can of course add more.  Right now we have just a single /24 network for the floating IP's; once demand grows to the point where we need more, we'll add them.

Finally, we have started with just two compute nodes; one is an x86 system, the other is an LDOM on a SPARC T5-2.  We'll be adding more when demand reaches the level where we need them, but as we're still ramping up the user base it's less work to manage fewer nodes until then.

My next post will delve into the details of building this OpenStack cloud's infrastructure, including how we're using various Solaris features such as Automated Installation, IPS packaging, SMF, and Puppet to deploy and manage the nodes.  After that we'll get into the specifics of configuring and running OpenStack itself.

Tuesday Jan 31, 2012

Detroit Solaris 11 Forum, February 8

I'm just posting this quick note to help publicize the Oracle Solaris 11 Technology Forum we're holding in the Detroit area next week.  There's still time to register and come get a half-day overview of the great new stuff in Solaris 11.  The "special treat" that's not mentioned in the link is that I'll be joining Jeff Victor as a speaker.  Looking forward to being back in my home state for a quick visit, and hope I'll see some old friends there!

Tuesday Nov 15, 2011

Solaris 11 Technology Forums, NYC and Boston

By now you're certainly aware that we released Solaris 11; I was on vacation during the launch, so I haven't had time to write any material related to the Solaris 11 installers, but will get to that soon.  Following on from the release, we're scheduling events in various locations around the world to talk about some of the key new features in Solaris 11 in more depth.  In the northeast US, we've scheduled technology forums in New York City on November 29 and Burlington, MA on November 30.  Click on those links to go to the detailed info and registration.  I'll be one of the speakers at both, so I hope to see you there!

Friday Dec 18, 2009

Big 2009 Finish for OpenSolaris Installation

As the end of 2009 approaches, there are a bunch of recent developments in the OpenSolaris installation software that I want to highlight.  All of the below will appear in OpenSolaris development build 130, due in the next few days.

First up is the addition of iSCSI support to Automated Installation (or AI).  You can now specify an iSCSI target for installation in the AI manifest.  It'll work on both SPARC and x86, provided you have firmware that can support iSCSI boot; on SPARC you'll need a very recent OBP patch to enable this support.  Official docs are in the works, but the design document should have enough info to piece it together if you're interested.

Next is the bootable AI image, which allows use of AI in a number of additional scenarios.  Probably the most generally interesting one is that you can now install OpenSolaris on SPARC without setting up an AI server first, by using the default AI manifest that's included on the ISO image.  One caveat is that the default manifest installs from the release repository; due to ZFS version changes between 2009.06 and present, this results in an installation that won't boot.  You'll want to make a copy of the default manifest and change the main url for the ai_pkg_repo_default_authority element to point to http://pkg.opensolaris.org/dev and put it at a URL that you can supply to the AI client once it boots.  Alok's mail and blog entry have more details.

Building on bootable AI, we've extended the Distribution Constructor (or DC) with a project known as Virtual Machine Constructor (VMC).  Succinctly, it extends DC to construct a pre-built virtual machine image that can be imported into hypervisors that support OVF 1.0, such as VirtualBox or VMware.  Glenn's mail notes a few limitations that will be addressed in the next few builds.  Anyone interested in building virtualization-heavy infrastructures should find this quite useful.

Finally, one more barrier to adding OpenSolaris on x86 to a system that's multi-booted with other OS's has fallen with the addition of extended partition support to both the live CD GUI installer and Automated Installation.  You can now install OpenSolaris into a logical partition carved from the extended partition.  Jean's mail has some brief notes on how to use this new feature.  I should also note at this point that a couple of builds ago the parted command and GParted GUI were added to the live CD, so the more complex preparations sometimes needed to free up space for OpenSolaris can now be done directly from the CD.

I'd like to thank my team for all the hard work that went into all of the above; they accomplished all of it with precious little help from me, as I spent most of the past three months either traveling around talking to literally hundreds of customers or working on architecture and design tasks.  Speaking of those, the review of the installer architecture is open, and I've also just this week posted the first draft design for AI service management improvements.

What's next?  Well, that will be the topic of my next post in early January.  It's time for a vacation!

About

I'm the architect for Solaris deployment and system management, with a lot of background in networking on the side. I am co-author of the OpenSolaris Bible (Wiley, 2009). I also play a lot of golf.
