Thursday Mar 20, 2008

Open HA Cluster tutorial at the GUUG Frühjahrsfachgespräche

On March 13, 2008, Hartmut Streppel and I gave a tutorial on "Open HA Cluster" and "Flying Container" at the Frühjahrsfachgespräche 2008, held at the Hochschule München. The event is organized annually by the German Unix User Group (GUUG).

The first part of the tutorial is now available; its 65 pages contain the presentation and speaker notes in German. Here is an overview of the agenda:

  1. Introduction to the Solaris™ Cluster architecture
    • Core Cluster Framework
    • Data Services (agents)
    • Geographic Edition
  2. Introduction to the Open HA Cluster project
    • HA Clusters Community Group
    • Open HA Cluster agents + current projects
    • Open HA Cluster Geographic Edition
    • Build demo
  3. Introduction to template-based agent development

PS: The second part of the tutorial is now available as well.

Friday Feb 22, 2008

Solaris Cluster Express 02/08 available!

Solaris Cluster Express 02/08 is available for download. It is built to run on Solaris Express Developer Edition 01/08. Have a look at the Release Notes for more details. Two things I want to highlight:

  1. This combination (SCX 02/08 and SXDE 01/08) is a good fit for getting started and engaged with the HA xVM project, since SXDE 01/08 lets you configure the Sun xVM hypervisor for x86-64.
  2. SCX 02/08 delivers the HA Informix agent to manage Informix Dynamic Server, which is the first data service developed within the HA Clusters Community Group with community participation.
Happy Clustering!

Friday Feb 15, 2008

GDS coding template with technical presentation released

We released the GDS coding template under the Common Development and Distribution License (CDDL) Version 1.0, along with a technical presentation that can also be delivered as a workshop. The presentation gives an introduction to cluster agent development and then explains the GDS coding template in great detail.

The GDS coding template was developed by several engineers (including Neil Garthwaite, Detlef Ulherr and myself) and is based on experience and enhancements collected during the development of several standard agents in the past 3-4 years, which are also part of Open HA Cluster today.

The GDS coding template and the presentation are available on:

It enables you to focus on the application specifics during custom agent development and allows a rapid development cycle.

If you develop a custom agent by using this GDS coding template, we also encourage you to contribute it back. We will be happy to add an entry to the third party HA agent repository or even sponsor an OpenSolaris project for it.

If you have feedback or questions, feel free to post them to the ha-clusters-discuss mailing list.

Wednesday Feb 13, 2008

OpenSolaris Test Development Project endorsed by HA Clusters community group

The OpenSolaris Test Development Project has just been announced, and the HA Clusters community group is among its endorsers. The interesting part is that this OpenSolaris project is also endorsed by the OS/Net (ON), Storage and Testing community groups, which underlines one of its important goals: keeping the various OpenSolaris test suites together under one umbrella.

From a clustering perspective, we know very well that clustering touches many layers (within both the hardware and the software stack) and brings together individual products, which then need to cooperate and behave differently than they would standalone. So I welcome this effort and look forward to its progress!

Thursday Dec 13, 2007

Call for 3rd party HA agents

Have you ever created an agent (also called a data service) for Sun Cluster, and maybe even published it on your blog or your own website?

Have you ever wondered whether someone has written a custom agent for an application for which Sun Cluster does not provide a standard agent?

Then you might be interested to know that we created a repository for third party HA agents on the Open HA Cluster community portal. Anyone can add an entry by sending the information needed for the table to the ha-clusters-discuss mailing list. The intent is to let those who have written custom agents make them known to the community, and to give those who need a custom agent a first place to look for something to build on.

Obviously such a repository only lives through participation, so happy HA agent sharing! :-) 

Wednesday Dec 12, 2007

IRC chat room #OHAC opened for conversation

Today I sent out an announcement to ha-clusters-discuss that the contributors of the HA Clusters community agreed to create #OHAC, an IRC chat room available on

You can simply join to discuss any topic related to HA Clusters with your peers and other interested parties. As time permits, existing contributors will also participate in this channel. As with any chat, don't be disappointed if you don't get an immediate response - sometimes "real work" distracts from having fun ;-)

I also set up a brief page on the HA Clusters community portal with some hints on how to get started.

Of course this chat room only lives through participation, so see you all there!


Monday Dec 03, 2007

OHAC Tech Day demo - why duplicate IP is not helping...

For those who attended today's presentation "Discovering Open High Availability Cluster" at the OpenSolaris Day of the Sun Tech Days in Frankfurt and wondered why my little live demo could not even ping the locally configured IP ( on bge0), here is the reason:

My demo only required bge0 to have its link up (otherwise the IPMP group used for Solaris Cluster Express would be down), so I plugged in the cable from the speaker table. And I did prepare the demo using and So my mistake was not checking whether those IPs were already in use. Murphy struck, and I had bad luck:

Dec  3 16:09:53 kosh ip: [ID 567813 kern.warning] WARNING: bge0 has duplicate address (in use by 00:30:1b:b6:12:20); disabled
Dec  3 16:09:53 kosh in.routed[372]: [ID 238047 daemon.warning] interface bge0 to turned off

And of course, after I stopped the resource group and unconfigured bge0 in order to configure the IP manually again, the same thing happened (no surprise there):

Dec  3 16:11:42 kosh Cluster.PNM: [ID 914440 daemon.error] sc-net has been deleted.
Dec  3 16:11:42 kosh If sc-net was hosting any HA IP addresses then these should be restarted.
Dec  3 16:11:55 kosh mac: [ID 486395] NOTICE: bge0 link down
Dec  3 16:11:56 kosh mac: [ID 435574] NOTICE: bge0 link up, 100 Mbps, full duplex
Dec  3 16:12:09 kosh ip: [ID 567813 kern.warning] WARNING: bge0 has duplicate address (in use by 00:30:1b:b6:12:20); disabled
Dec  3 16:12:09 kosh in.routed[372]: [ID 238047 daemon.warning] interface bge0 to turned off
Dec  3 16:12:31 kosh in.mpathd[254]: [ID 975029 daemon.error] No test address configured on interface bge0; disabling probe-based failure detection on it

Since I did not want to waste a lot of time debugging during the talk, I did not spot this rather obvious issue. Sorry for the broken demo; I hope the talk was useful anyway! If you really want to watch the demo (to see that it really works if you don't screw up the IP setup ;-), you can refer to my blog post explaining how to watch the presentation given at the SourceTalk 2007 event in Göttingen, where I used the same demo (and had a working network).
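
The duplicate-address warning quoted above is easy to overlook in a busy log. Here is a minimal sketch of how one might scan for it, assuming syslog lines in exactly the format shown in this post. The function name find_dup_addr is made up for illustration and is not part of Solaris or Sun Cluster:

```shell
#!/bin/sh
# Hypothetical helper (not part of Solaris or Sun Cluster): read syslog-style
# lines on stdin and print "<interface> <MAC>" for every duplicate-address
# warning in the format quoted in this post. All other lines are ignored.
find_dup_addr() {
    sed -n 's/.*WARNING: \([^ ]*\) has duplicate address.*(in use by \([0-9A-Fa-f:]*\)).*/\1 \2/p'
}

# Example (on Solaris the system log usually lives in /var/adm/messages):
#   find_dup_addr < /var/adm/messages
```

Running something like this right after plumbing the interface would have pointed at the conflicting MAC address immediately. It only reports a conflict after the fact; it does not replace checking beforehand that an address is free.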

Wednesday Nov 14, 2007

SourceTalk 2007 Reloaded

In an earlier blog post I advertised my presentation at the Source Talk Tage 2007 at the University of Göttingen. Some of the presentations were recorded live with the TeleTeachingTool (TTT), among them "Hochverfügbarkeit mit Open HA Cluster" (High Availability with Open HA Cluster).

There is a list of recorded talks; the Open HA Cluster talk is available in a version without video (25 MB) and with video (82 MB). Instructions for installing the software required for playback (Java, Java Media Framework, TTT) are available for Linux, Windows and MacOS. TTT plays the recorded video and audio synchronized with the slides.

For Solaris (SPARC and x64/x86) I tried the following procedure, which uses the Java and Java Media Framework already included in Solaris. My system was running Solaris Express Community Edition build 69:

  1. Download the TeleTeachingTool (jar archive) and copy it to /usr/local/lib/ttt/:
    $ mkdir -p /usr/local/lib/ttt
    $ cp ttt.jar /usr/local/lib/ttt/

  2. Download the script ttt.ksh and copy it to /usr/local/bin/:
    $ cp ttt.ksh /usr/local/bin/
    $ chmod 755 /usr/local/bin/ttt.ksh

  3. Download the presentation (here with video), e.g. to /var/tmp.

  4. Unpack the archive:
    $ cd /var/tmp
    $ unzip

  5. Start the presentation:
    $ /usr/local/bin/ttt.ksh 2007_09_19-01.ttt

The presentation "Virtualisierung mit OpenSolaris" (Virtualization with OpenSolaris) by Ulrich Gräf (without video (250 MB), with video (311 MB)) is also not to be missed!

Enjoy watching!

Saturday Nov 10, 2007

Two technical presentations published at the Open HA Cluster community page

Two technical presentations have been published on the Open HA Cluster community web page, in a new section of the Documentation area.

The first was given at the Cluster Technology Forum in August 2005. At that time, SC 3.1 08/05 with the HA-Container agent had just been released. The presentation explains how some SUNW.gds-based agents were changed to work in the traditional RGM model as well as within a non-global zone that is under HA-Container agent control. The content still applies to the GDS-based agents today with Sun Cluster 3.2 and Open HA Cluster. It helps to understand the code flow for the different deployment scenarios with zones.

The second was given at the Customer Engineering Conference (CEC) in October 2006. It explains the two principal models of how Sun Cluster 3.2 interacts with Solaris Zones. Both models are explained and contrasted, with decision criteria for choosing between them based on application and service requirements.

Monday Nov 05, 2007

New Open HA Cluster Agent source code and Solaris Cluster Express 10/07 released

Last week Sun released the third source drop for Open HA Cluster Agents (ohacds-src-20071030.tar.bz2), together with the Solaris Cluster Express 10/07 release.

The updated ohacds agent gate is also available through OpenGrok. You will find an overview of the integrated changes within the changelog.

You can study the latest changes within the HA Container agent, which is now also able to handle non-global zones of brand type solaris8 on the SPARC platform (currently available for Solaris 10 10/07 as an unbundled product called Solaris 8 Migration Assistant). Patch 120590-06 makes those changes available for Sun Cluster 3.1 08/05. Support for Sun Cluster 3.2 will be introduced with the next update release.

Sunday Sep 02, 2007

Open HA Cluster at the Source Talk Tage 2007

From September 17 to 19, 2007, the Source Talk Tage take place for the third time at the University of Göttingen. The organizers are the Sun User Group Deutschland e.V. and the Java User Group Deutschland e.V.

On September 19, part of the program runs under the motto "OpenSolaris und Java - Systeme und Technologien" (OpenSolaris and Java - Systems and Technologies). There I will offer an installfest from 10:15 to 11:15 and from 13:30 to 15:30, where visitors with a suitable laptop get help installing Solaris Express Community Edition (build 68 or newer) and Solaris Cluster Express 07/07 and configuring a single-node cluster. If you want to start preparing beforehand, the "How to Install a Single-Node Cluster" guide is the best place to start.

From 11:30 to 12:30 I will give a presentation titled "Open Solaris: Hochverfügbarkeit mit Open HA Cluster" (Open Solaris: High Availability with Open HA Cluster).

To wrap up, from 15:30 to 17:00 there will be an OpenSolaris "meet the experts" hands-on workshop, where visitors can ask my colleague Ulrich Gräf and me questions about OpenSolaris in general and the presented topics in particular. If there is interest, one thing or another can also be demonstrated live on the laptop.

So see you on September 19, 2007 in Göttingen! :-)

Thursday Aug 30, 2007

New Open HA Cluster Agent source code released

The second source drop for the ohacds gate has been released (ohacds-src-20070816.tar.bz2).

Highlights are the source release of two more agents:

And the ability to now also build the source using gcc: /opt/scbld/bin/nbuild -Da CW_NO_SHADOW=# __GNUC=

There is also a changelog available that summarizes the bug fixes and RFEs included since the first source drop (ohacds-src-20070626.tar.bz2).

Saturday Jul 28, 2007

Solaris Cluster Express 07/07 is available!

Solaris Cluster Express 07/07 is available for download. Have a look at the installation instructions for more details.

Here are some reasons why I think this should matter to you:

  • Solaris Cluster Express is based on the development build for the next Solaris Cluster release. It already contains some new features integrated after Solaris Cluster 3.2. You get early access to new features and can report bugs and RFEs (requests for enhancement) before the new product is released. This is a great way to influence the product and make sure it reflects your needs.
  • Solaris Cluster Express runs on Solaris Express Community Edition. This means you can combine all the new features available in the development build of Solaris with the development build of Solaris Cluster and make sure the combination behaves as you expect.
  • Solaris Cluster Express comes with 32-bit kernel support for the x86 platform. You can now use your laptop to install a single-node cluster without needing 64-bit x86 hardware. Please report to the HA Cluster Discuss forum whether your specific hardware combination works!
  • ISVs can now test early whether their applications run and work as expected on the next generation of the Solaris and Solaris Cluster platform, and can engage early to make sure their needs are considered.
  • Developers can start to implement their own great ideas that become possible with the new features, and ideally contribute them back.

Thursday Jul 05, 2007

Tricking applications which bind to nodename only with

If you read through the Sun Cluster Data Services Developer's Guide for Solaris OS, you will find the requirements for non-cluster aware applications in Appendix E:

  1. Multihosted Data
  2. Host Names
  3. Multihomed Hosts
  4. Binding to INADDR_ANY as Opposed to Binding to Specific IP Addresses
  5. Client Retry

You can also read about analyzing the application for suitability.

For this blog post, number 2 is of special interest: if your application somehow depends on the physical node name of a server (i.e. the name returned by hostname or uname -n) and does not offer a way to configure a logical host name instead, then the library provided with Sun Cluster 3.2 might help you out.

The referenced man page has all the information needed, with examples of how to use it within C source and shell-based agent code.

You can also find examples of its usage in the Open HA Cluster source code, within the Oracle E-Business Suite and Oracle Application Server agents, by using the search interface and browsing through the results.

Wednesday Jun 27, 2007

HA clusters community is born!

Today is a very exciting day for me: Solaris Cluster is going open source. Read the official announcement!

A new community has been created at the OpenSolaris site: HA Clusters, and Open High Availability (HA) Cluster is part of it.

The first phase is to open source the data services (also referred to as agents) gate for most of the existing standard agents. The source can be viewed here using OpenGrok.

Since the GDS-based agents are part of my working area, I invite you to take a look :-)

Further, I invite anyone with an interest in high availability clusters to subscribe and contribute at the ha-clusters-discuss discussion list!

Be there or be square!  


This Blog is about my work at Availability Engineering: Wine, Cluster and Song :-) The views expressed on this blog are my own and do not necessarily reflect the views of Sun and/or Oracle.

