Monday Jul 14, 2008

LDoms guest domains supported as Solaris Cluster nodes

Folks, when late last year we announced support for Solaris Cluster in LDoms I/O domains in this blog entry, we also hinted at support for LDoms guest domains. It has taken a bit longer than we envisaged, but I am pleased to report that SC Marketing has just announced support for LDoms guest domains with Solaris Cluster!

So, what exactly does "support" mean here? It means that you can create an LDoms guest domain running Solaris, treat that guest domain as a cluster node by installing SC software inside it (specific version and patch information is noted later in the blog), and have the SC software work with the virtual devices in the guest domain. The technically inclined reader will, at this point, have several questions pop into their head... How exactly does SC work with virtual devices? What do I have to do to make SC recognize these devices? Are there any differences between how SC is configured in LDoms guest domains versus non-virtualized environments? Read on for a high-level summary of the specifics:

  • For shared storage devices (i.e., those accessible from multiple cluster nodes), the virtual device must be backed by a full SCSI LUN. That means no file-backed virtual devices, no slices, and no volumes. This limitation exists because SC needs advanced features in the storage devices to guarantee data integrity, and those features are available only for virtual storage devices backed by full SCSI LUNs. (An illustrative export sketch appears after this list.)

  • One may need to use storage which is unshared (i.e., accessed from only one cluster node), for things such as installing the OS image of the guest domain. For such usage, any type of virtual device can be used, including devices backed by files in the I/O domain. However, make sure to configure such virtual devices to be synchronous. Check the LDoms documentation and release notes for how to do that. Currently (as of July 2008) one needs to add "set vds:vd_file_write_flags = 0" to the /etc/system file in the I/O domain exporting the file. This is required because the Cluster stores some key configuration information on the root filesystem (in /etc/cluster) and expects that information to be written synchronously to disk. If the root filesystem of the guest domain lives on a file in the I/O domain, this setting is needed to make those writes synchronous. (A sketch appears after this list.)

  • Network-based storage (NAS, etc.) is fine when used from within the guest domain. Check the cluster support matrix for specifics; LDoms guest domains do not change this support.

  • For the cluster private interconnect, the LDoms virtual device "vnet" works just fine; however, the virtual switch to which it maps must have the option "mode=sc" specified. So essentially, when creating the virtual switch that will be used for the cluster private interconnect inside the guest domains, you add the argument "mode=sc" to the ldm add-vsw subcommand (see the example after this list). This option enables a fastpath in the I/O domain for Cluster heartbeat packets so that those packets do not compete with application network packets in the I/O domain for resources. This greatly improves the reliability of the Cluster heartbeats, even under heavy load, leading to very stable cluster membership for applications to work with. Note, however, that good engineering practices should still be followed when sizing your server resources (both in the I/O domain and in the guest domains) for the application load expected on the system.

  • With this announcement, all features of Solaris Cluster supported in non-virtualized environments are supported in LDoms guest domains unless explicitly noted in the SC release notes. Some limitations come from LDoms itself, such as the lack of jumbo frame support over virtual networks or the lack of link-based failure detection with IPMP in guest domains. Check the LDoms documentation and release notes for such limitations, as support for these missing features is improving all the time.

  • For support of specific applications with LDoms guest domains and SC, check with your ISV. Support for applications in LDoms guest domains is improving all the time, so check often.

  • Software version requirements: LDoms 1.0.3 or higher, and S10U5 with patches 137111-01, 137042-01, 138042-02, and 138056-01 or higher, are required in BOTH the LDoms guest domains and the I/O domains exporting virtual devices to the guest domains. Solaris Cluster 3.2 2/08 (SC32U1) with patch 126106-15 or higher is required in the LDoms guest domains. (A quick way to check installed patches appears after this list.)

  • Licensing for SC in LDoms guest domains follows the same model as for the I/O domains: you basically pay for the physical server, irrespective of how many guest domains and I/O domains are deployed on that physical server.

  • That covers the high-level overview of how SC is deployed inside LDoms guest domains. Check out the SC release notes for additional details and some sample configurations. The whole virtualization space is evolving very rapidly, and new developments are happening all the time. Keep this blog page bookmarked and visit it frequently to find out how Solaris Cluster is evolving along with this space.
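
To make a few of the points above more concrete, here are some illustrative command sketches. They are not taken from the announcement or the SC documentation; all device paths, volume names, network devices, and domain names (primary-vds0, guest-node1, and so on) are hypothetical placeholders, so adapt them to your configuration and verify the syntax against the LDoms documentation.

First, a sketch of exporting a full SCSI LUN from the I/O domain as shared storage for a guest-domain cluster node:

      # In the I/O domain: export the whole LUN (not a slice, file, or volume)
      # as a virtual disk backend
      ldm add-vdsdev /dev/rdsk/c2t5d0s2 shared-lun1@primary-vds0

      # Present that backend to the guest domain acting as a cluster node
      ldm add-vdisk shared-disk1 shared-lun1@primary-vds0 guest-node1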
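
Next, a sketch of an unshared, file-backed virtual disk (for example, a guest domain's boot disk) together with the synchronous-write setting described above:

      # In the I/O domain: create a backing file and export it as a virtual disk
      mkfile 20g /ldoms/guest-node1-boot.img
      ldm add-vdsdev /ldoms/guest-node1-boot.img boot-vol1@primary-vds0
      ldm add-vdisk boot-disk1 boot-vol1@primary-vds0 guest-node1

      # In the I/O domain's /etc/system (takes effect after the I/O domain
      # reboots): make writes to file-backed virtual disks synchronous
      set vds:vd_file_write_flags = 0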
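
Next, a sketch of creating the private-interconnect virtual switch with "mode=sc" and giving a guest-domain node a vnet on it:

      # In the I/O domain: create the private-interconnect virtual switch
      # with the Solaris Cluster heartbeat fastpath enabled
      ldm add-vsw mode=sc net-dev=e1000g1 private-vsw1 primary

      # Give each guest-domain cluster node a vnet on that switch
      ldm add-vnet private-net1 private-vsw1 guest-node1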
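
Finally, one quick way to check whether a required patch is installed in a guest or I/O domain is to query the installed patch list:

      # Look for patch 137111 (any revision) in this domain
      showrev -p | grep 137111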

    Cheers!

    Ashutosh Tripathi
    Solaris Cluster Engineering

    Wednesday Jun 04, 2008

    Solaris Cluster Express 6/08 available for download

    Solaris Cluster Express 6/08 is now available for download! You can download the DVD image here.

    What is new in this release?

    * This release runs on OpenSolaris Nevada build 86. The version of Sun Management Center is now 3.1.

    * The HA agent for Solaris Containers is now enhanced to include support for Solaris 9 Branded Zones on the SPARC platform. This is very useful for customers who still need to run some applications on Solaris 9 while taking advantage of the new features of Solaris 10 and above.

    * The HA agent for the PostgreSQL database is now enhanced to support WAL shipping (a rough configuration sketch appears after this list). This feature greatly strengthens PostgreSQL deployments in enterprise environments.

    * Support for Solaris Containers configured with exclusive IP is included in this release (see the zonecfg sketch after this list).

    * The SCX Geographic Edition is enhanced to support Oracle Data Guard-based replication.

    * This release also contains mandatory bug fixes and other minor enhancements not mentioned above.
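
    As a rough illustration of the WAL shipping support mentioned above: WAL shipping itself is configured in PostgreSQL, independent of the agent. The settings below are generic configuration for recent PostgreSQL releases with a hypothetical archive directory, shown only to clarify what the feature does:

        # postgresql.conf on the primary: archive completed WAL segments
        # to a location the standby can read from (path is hypothetical)
        archive_mode = on
        archive_command = 'cp %p /shared/wal_archive/%f'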
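
    And here is a minimal zonecfg sketch of an exclusive-IP container (zone name, path, and network interface are hypothetical); the zone gets its own IP instance and a dedicated NIC:

        # Create a zone with its own IP instance and a dedicated NIC
        zonecfg -z dbzone
        zonecfg:dbzone> create
        zonecfg:dbzone> set zonepath=/zones/dbzone
        zonecfg:dbzone> set ip-type=exclusive
        zonecfg:dbzone> add net
        zonecfg:dbzone:net> set physical=e1000g2
        zonecfg:dbzone:net> end
        zonecfg:dbzone> commit
        zonecfg:dbzone> exit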

    Stay tuned for more milestones along the open source journey!

    Munish Ahuja
    Madhan Kumar B.
    Jonathan Mellors
    Arun Kurse
    Venugopal N.S.

    Thursday May 29, 2008

    Two Million Lines of Code

    One year ago, we announced that we would open source the entire Solaris Cluster product suite. Today, we are delivering on that promise six months ahead of schedule by releasing over two million lines of source code for the Solaris Cluster framework!


    Read the official press release and listen to a podcast with Meenakshi Kaul-Basu, Director of Availability Products at Sun.


    This third, and final, source code release follows the initial open sourcing of the Solaris Cluster agents in June 2007 and Solaris Cluster Geographic Edition in December 2007. As with the previous releases, the source code is available under the Common Development and Distribution License (CDDL) under the auspices of the HA Clusters community group on OpenSolaris.org.


    The open source version of Solaris Cluster is called Open High Availability Cluster. Although some encumbered parts of Solaris Cluster have not been open sourced, with this release, you can now build a fully functional HA Cluster purely from source.


    In addition to the source code for the product itself, Open HA Cluster includes source for the Sun Cluster Automated Test Environment (SCATE), man pages, and globalization.


    Consider getting involved in the HA Clusters community group on OpenSolaris.org.


    Nick Solter, Open HA Cluster tech lead and HA Clusters community group facilitator

    Sunday Oct 28, 2007

    Announcing Availability of Solaris Cluster Express 10/07

    We are proud to announce the availability of the second release of Solaris Cluster Express today. The earlier release was Solaris Cluster Express 7/07. You can download the DVD image from http://www.sun.com/download/products.xml?id=472142a3 .


    What is new in this release?


    This release runs on Solaris Nevada build 70b (SXDE 9/07). It also includes many bug fixes for the cluster core, framework, and management components, as well as the following important enhancements to the Solaris Cluster Express agents:


    1.) The HA Containers agent has been enhanced to support non-native Containers. Solaris Containers for Linux Applications (aka lx zones) can be made HA using the HA Containers agent. The agent also includes support for Solaris 8 Containers on the SPARC platform.


    2.) The HA JES Application Server agent (SUNW.jsas, SUNW.jsas-na) has been enhanced to support the open-source GlassFish version.

     

    3.) Bug fixes for agents.  The updated list can be found at http://www.opensolaris.org/os/community/ha-clusters/ohac/Documentation/Agents/ohacds_changelog/ .

    4.) This release also includes the recently open-sourced WebSphere Message Broker and WebSphere MQ agents.


    Stay tuned for more updates in the coming months as part of our commitment to Open Source software. Please visit
    http://www.opensolaris.org/os/community/ha-clusters/ohac/ to get more details about the Open HA Cluster Community!

    Munish Ahuja
    Madhan Kumar B
    Jonathan Mellors
    Venugopal N.S.

    Friday Apr 20, 2007

    High-Availability Parenting

    As I was putting my son to bed the other night, it occurred to me that my wife and I practice “high-availability parenting.” What does that mean?

    First, some background. Those of you familiar with Solaris Cluster know that, in the technology world, there are a number of services that need to be available basically all the time. Think of the credit card infrastructure: When you run your card at the gas station, the database in which your credit limit, current balance, etc. are stored had better be online! Otherwise your card will be rejected, you'll try a different card, and the first bank will lose your business on that transaction.

    However, because of hardware failure rates, software bugs, people tripping on power cords, etc., a single machine isn't reliable enough for important applications like your bank's credit card database. To provide higher availability, these applications can be run on a group, or cluster, of physical machines, with software that provides automatic failover of the application from one computer to the next in the case of failure (hardware or otherwise). One example of this kind of High-Availability Cluster software is, of course, Solaris Cluster.

    Now back to the kids. Young children require 100% availability of the “parenting service”. They need someone on call constantly for food, entertainment, safety, etc. Both my wife and I can provide these services, so we switch off. In this way, we're like a two-node cluster. When one of us is “unavailable” (programming, cleaning, relaxing, etc) the other is “active.” Most of our “downtime” is of the planned variety – sort of like a planned software upgrade. However, there's certainly some unplanned downtime that comes up now and then – if one of us wakes up with a fever, the other takes over (most likely at the expense of the “upgrades” planned for that day)!

    To take the analogy a bit further than it should probably go, what happens when we both plan downtime simultaneously, say, to go to a movie? Luckily we have a backup two-node cluster nearby: my in-laws!

    Nicholas Solter
    Solaris Cluster Development
