Wednesday Apr 28, 2010

Last call for the paranoid git

I joined Sun on July 11th 1995, so I'm very close to making 15 years now that I'm being assimilated into Oracle on May 1st - it is a pity I didn't make it all the way.

I've been posting entries in this blog since April 4th 2004, and now that I'm being sucked into the huge beast that Oracle is, I'm going to end this blog and start posting stuff at instead, to allow me to speak freely without fear of violating some corporate policy of what I can and can't say.

Code Nursery is where I spend my spare time - working on open source projects I find interesting, like:

Or helping friends and family build websites for their interest groups or small businesses - just for the mental exercise, and because I like keeping myself up to date with new technology!

According to the "Proprietary Information Agreement" I signed today, Oracle wants to claim everything I do that falls under "any current or reasonably anticipated business of Oracle" ... "whether or not conceived during regular business hours". Luckily that doesn't hold up in court in Sweden; they can only claim stuff that I do at work which is related to what I'm hired to do.

However, to continue contributing to open source projects I have to be really careful not to mix business with pleasure, so that is yet another reason for me to move away from the corporate site, and to use my own hardware in my spare time from now on.

For those of you who come here for Solaris auditing information - I'm gradually going to review all those posts I've made over the years and do a "clean-room rewrite" of them at the new blog, so keep your eyes peeled if you are an audit junkie like me :)

That's all I had to say - now mosey on over to the new site and read about how I access it securely...


Thursday Nov 05, 2009

Behavior Driven Infrastructure

One problem I'm wrestling with in my day job at Web Engineering is: how do you know when a system you are building is ready?

When we build a new system, it goes through the following steps:

  1. Jumpstart
    Installs the OS and sets up basic configuration, like hostname, domainname, network.
  2. Puppet
    System specific configuration
  3. Manual steps
    This includes things which are too system dependent to automate, like creating a separate zpool for application data on external storage

For me it has been enough to review the puppet logs to determine if the system has been correctly configured, but for my colleagues who aren't using puppet on a daily basis, it isn't. They have been asking "how do we know if a system is ready?", and I've realized that "review the puppet logs" isn't really a helpful answer for most people. What if you have forgotten to add a node definition for the system, and you get the default node configuration? Then puppet will tell you everything is configured correctly - which is partly true: the things puppet has been told to configure are configured, but what about the stuff I forgot to tell it about?

So I've been thinking about using the same approach as I use when I write code: Behavior Driven Development. I.e. you start by specifying the behavior of the program you are developing, and only after that do you start to code. This has the benefit of easily letting you know when you are done: if your code passes all the behavior tests, you can release it.

Translating this to Solaris installs isn't that hard: instead of describing program behavior, you describe (operating) system behavior. You can use the same tools as you do for development, and since I've been using cucumber for my Ruby on Rails projects, it is what I picked for my initial testing. Cucumber uses natural language to describe the behavior you want, which makes it easy for non-programmers to understand what it is testing.

When you write the definitions, you should not use technical language like "ssh to the host weblogs and grep for a passwd(4) entry for the user martin in /etc/passwd". Instead, use something like "I should be able to ssh to weblogs, and log in as the user martin", which is the behavior you want. Cucumber then takes that definition and translates it into step-by-step instructions which can be validated.

This is how it can look when you run it:

martin@server$ cucumber
Feature: sendmail configure
  Systems should be able to send mail

  Scenario: should be able to send mail                  # features/
    When connecting to using ssh   # features/steps/ssh_steps.rb:12
    Then I want to send mail to "" # features/steps/mail_steps.rb:1

Feature: NIS client
  Systems on SWAN should be NIS clients

  Scenario: should be able to match entries in NIS    # features/
    When connecting to using ssh # features/steps/ssh_steps.rb:12
    Then I want to lookup "xuan" in the passwd table   # features/steps/nis_steps.rb:1
    And I want to lookup "onnv" in the hosts table     # features/steps/nis_steps.rb:1

  Scenario: should be able to make lookups through NIS # features/
    When connecting to using ssh # features/steps/ssh_steps.rb:12
    Then I want to lookup "xuan" through nsswitch.conf # features/steps/nis_steps.rb:5

Feature: SSH access
  SSH should be configured

  Scenario: ssh user access                            # features/
    Given a user named "martin"                        # features/steps/ssh_steps.rb:3
    When connecting to using ssh # features/steps/ssh_steps.rb:12
    Then the connection should succeed                 # features/steps/ssh_steps.rb:28

  Scenario: no lingering default OpenSolaris user      # features/
    Given a user named "jack" with password "jack"     # features/steps/ssh_steps.rb:7
    When connecting to using ssh # features/steps/ssh_steps.rb:12
    Then the connection should fail                    # features/steps/ssh_steps.rb:32

5 scenarios (5 passed)
13 steps (13 passed)

This makes it really easy to see if the behavior of the system is what you expect. All green means it is ready!

What I am working on at the moment is making the failures understandable to a non-programmer. For example, when a scenario fails (here because it managed to log in to a system where the login should have failed), it looks like this:

  Scenario: no lingering default OpenSolaris user      # features/
    Given a user named "jack" with password "jack"     # features/steps/ssh_steps.rb:7
    When connecting to using ssh # features/steps/ssh_steps.rb:12
    Then the connection should fail                    # features/steps/ssh_steps.rb:28
      expected not nil, got nil (Spec::Expectations::ExpectationNotMetError)
      ./features/steps/ssh_steps.rb:29:in `/^the connection should succeed$/'
      features/ `Then the connection should succeed'

Failing Scenarios:
cucumber features/ # Scenario: no lingering default OpenSolaris user

5 scenarios (1 failed, 4 passed)
13 steps (1 failed, 12 passed)

It is not obvious that expected not nil, got nil means that it could log in when it shouldn't be able to, so I am working on some custom rspec matchers to generate better error messages.
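
To give an idea of what such a matcher could look like, here is a minimal sketch in plain Ruby. It loosely follows the matches? / failure_message shape that RSpec matchers have (the exact method names differ between RSpec versions), but it does not depend on RSpec at all. The names RejectLogin and LoginResult are made up for this illustration and are not taken from the real step definitions.

```ruby
# Hypothetical result of a login attempt; not part of any real library.
LoginResult = Struct.new(:user, :host, :session)

# Matcher-shaped object: it passes when the login was rejected,
# i.e. when no session was opened.
class RejectLogin
  def matches?(result)
    @result = result
    result.session.nil?
  end

  # Message shown when a login succeeded although it should have failed.
  def failure_message
    "expected login as '#{@result.user}' on #{@result.host} to be rejected, " \
      "but the user could log in"
  end
end

matcher = RejectLogin.new
result  = LoginResult.new("jack", "server", :open_session)
puts matcher.failure_message unless matcher.matches?(result)
```

The point is only that the message names the user and the host, so a failed scenario says what actually went wrong instead of talking about nil.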

Once I've gotten a bit beyond playing around with this, I will publish the source if someone is interested in it.

Tuesday Jun 23, 2009

Planning to fail when using Puppet

We put a lot of thought into planning for failure when we set up our sites (like, and so on). Every component is redundant, from border firewalls to load-balancers to front end web servers to root disks. We even put the gear in separate racks on separate power, just in case someone accidentally knocks both power cables out. This is arranged in odd and even sides, and servers are placed in the corresponding side, i.e. is placed on the odd side and is placed on the even side. If we use more than two servers they are added to the respective side.

But the chain is only as strong as its weakest link: if I screw up when I update the puppet profile for our base server class, things will quickly go south.

No matter how carefully I test things before I commit my changes to the master mercurial repository and on to the puppetmaster (we only ran one per site before), there is still a chance things go boink! There are always some servers which were set up a few years ago, long before we started using puppet, that aren't installed and configured the way I expect, and when they are modified by puppet - they break!

So it doesn't matter that we are running multiple systems: they all get changed by puppet within 30 minutes.

To work around this problem I've set up two puppetmasters, each serving its corresponding side (odd or even). This lets me push a change to one side first and let it stew for a while before I push it to the other side.
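
One way to picture the split is as a small lookup from hostname to side. The sketch below is an illustration only: it assumes that hostnames end in a number (for example "web1" and "web2"), and the puppetmaster names "puppet-odd" and "puppet-even" are hypothetical.

```ruby
# Assumption: hostnames end in a number (e.g. "web1", "web2"); odd numbers
# live on the odd side, even numbers on the even side.
def side_for(hostname)
  short  = hostname.split(".").first
  number = short[/(\d+)\z/, 1]
  raise ArgumentError, "no trailing number in #{hostname.inspect}" if number.nil?

  number.to_i.odd? ? :odd : :even
end

# Hypothetical puppetmaster names, one per side.
PUPPETMASTERS = { odd: "puppet-odd", even: "puppet-even" }.freeze

# Returns the puppetmaster that should serve the given host.
def puppetmaster_for(hostname)
  PUPPETMASTERS.fetch(side_for(hostname))
end

puts puppetmaster_for("web3.example.com")
```

A change then goes to the odd puppetmaster first, and only reaches the even one once the odd side has run with it for a while.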

Monday Mar 23, 2009

Yubico on Solaris 10

I'm back configuring Yubikeys but this time on Solaris 10 as it is what the majority of our servers run.

Here are the steps required to get it working on Solaris 10 update 6:

  1. Install curl
    pkgadd SFWcurl
  2. Configure libyubico-client
    configure CPPFLAGS=-I/opt/sfw/include CFLAGS=-std=c99 --prefix=/usr
  3. Compile and install
    gmake install
  4. Configure pam_yubico
    configure --prefix=/usr --without-ldap
  5. Compile and install
    gmake install
  6. Set up a user-to-key mapping file (e.g. /etc/yubikeys)
  7. Configure /etc/pam.conf
    other   auth requisite
    other   auth required 
    other   auth required 
    other   auth required  id=16 authfile=/etc/yubikeys ignorepass

Then an ssh login will look like this:

martin@workstation$ ssh server
Yubikey for `martin': 

You might have noticed the ignorepass option, which I have added. It prevents pam_yubico from trying to (re)use the password I typed, and instead forces pam_yubico to prompt me for it. I have sent Simon the diff so he can add it to the next release.

Tuesday Mar 03, 2009

Running puppet on OpenSolaris

I'm running puppet on the production servers I manage at Sun, and for Solaris 10 I've had to compile Ruby and create my own package (for easy distribution). I've also created my own puppet and facter packages, as I didn't want to set up rubygems.

Now on OpenSolaris this is much easier, as you can just run:

# pkg install -q SUNWruby18
# gem install -y puppet
Bulk updating Gem source index for:
Successfully installed puppet-0.24.7
Successfully installed facter-1.5.4
Installing ri documentation for puppet-0.24.7...
Installing RDoc documentation for puppet-0.24.7...
and you are all set to configure /etc/puppet/puppet.conf to get puppetmasterd and puppetd running!

Friday Jan 16, 2009

Testing the Yubico Yubikey

I've been looking at different solutions for two-factor authentication (as in something you have) to use as a backup to what Sun IT provides us. Since we run two data centers outside of Sun, and require two-factor authentication to log on to all our external servers, we are often prevented from logging on as the network path back to the Sun IT verification servers is down. So we need a backup solution that allows us to do the verification in our data center when the network is down.

The top contender for this is Yubico's yubikey, which I think is a very cool device. And the best part is that all the software needed to do the verification is open source!

I've compiled and on OpenSolaris with the help of Simon as we had to make some minor adjustments to get it compiled on Solaris.

I've made some additional minor modifications to to let me use it for two-factor authentication (I'll post the diffs later).

This is how the authentication looks now:

martin@mbp$ ssh puppet-tst2
Password: my normal UNIX passphrase
Yubikey: the output from the yubikey
martin@puppet-tst2 $ 

I'm very pleased with the results of my tests so far, and if you are looking at two-factor authentication, buy a few of them and give it a try...

Thursday Jan 15, 2009

Audit chapter

As I wrote before, I've written the audit chapter for an upcoming Solaris Security book. The chapter is now available on Safari Rough Cuts and feedback is very welcome...

Friday Dec 12, 2008

A new Solaris security book on the way

For the last few months I've been spending my evenings tapping away on the keyboard - but not producing code or managing Solaris servers like I usually do. I've been writing two chapters for an upcoming Solaris security book! It has been fun, but it has also been hard - not hard because I didn't know what to write, but hard to constrain myself from wanting to include too much.

The book is not intended to cover every nitty gritty detail of every security feature in Solaris - that would make it a real brick of a book! So I've had to think hard about what to include, and the level of detail of the included parts.

Parts of the book are already available on Safari Rough Cuts for review before we publish. Please leave comments on the Safari site so that nothing gets lost.

The chapter about File System Security is mine, and I've also authored the chapter about auditing (not very surprising), though it hasn't been processed for publication yet, but when it is - I'll post a blog entry with a link to it.

Wednesday Dec 10, 2008

Sendmail, may I introduce Alteon to you?

Yesterday we started using an Alteon VIP to load balance SMTP traffic to our two mail servers, and everything was fine and dandy, but when I took a look in /var/log/syslog I found loads of entries like this:

Dec 11 18:17:14 prod-git1 sendmail[20899]: [ID 801593] j93FHDNX020899: []
did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA

The Alteon health check connects and then just issues a QUIT, which sendmail finds suspicious, and hence feels obliged to let me know about it. This becomes very annoying when you have two Alteons doing the check every other second!

After scratching my head for a while and searching for a solution, I came across this patch to sendmail, which lets you select systems which shouldn't generate the above log entry. The only caveat was that I'd have to build my own sendmail, and I really don't want to roll my own stuff as it requires more work to support, so I continued to look for another solution.

I finally figured out (after reading the sendmail source code) that if I set the following in /etc/mail/

O PrivacyOption=authwarnings,needexpnhelo,needvrfyhelo

sendmail would be quiet if the Alteon changed the health check to doing the equivalent of this:

mconnect localhost
connecting to host localhost (, port 25
connection open
220 ESMTP Sendmail 8.13.8+Sun/8.13.8; Thu, 11 Dec 2008 13:58:48 +0100 (CET)
VRFY root
503 5.0.0 I demand that you introduce yourself first
221 2.0.0 closing connection

So we changed the health check from being smtp to a custom script (note that you need the double backslashes):

open 25,tcp
expect "ESMTP"
send "VRFY root\\n"
expect "503"
send "QUIT\\n"
expect "221"

And after pushing this change out, sendmail stopped filling the log with messages I don't want to see.
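
For anyone who wants to try the same dialogue outside the Alteon, here is a minimal Ruby sketch of it. It is an illustration under assumptions, not the Alteon's implementation: the function name smtp_health_check and the ScriptedConn stand-in are invented for this example, and the function works on any object that responds to gets and write (a TCPSocket from Ruby's socket library connected to port 25 would be one such object).

```ruby
# Runs the scripted dialogue over any IO-like object that responds to
# gets and write. Returns true only when every expected reply is seen.
def smtp_health_check(conn)
  steps = [
    [nil,           "ESMTP"], # wait for the greeting banner
    ["VRFY root\n", "503"],   # rejected, because no HELO was sent first
    ["QUIT\n",      "221"]    # sendmail closes the connection
  ]
  steps.all? do |command, expected|
    conn.write(command) if command
    reply = conn.gets
    !reply.nil? && reply.include?(expected)
  end
end

# In-memory stand-in for a socket, used only to demonstrate the function.
class ScriptedConn
  attr_reader :sent

  def initialize(replies)
    @replies = replies.dup
    @sent = []
  end

  def write(data)
    @sent << data
  end

  def gets
    @replies.shift
  end
end

conn = ScriptedConn.new([
  "220 ESMTP Sendmail\n",
  "503 5.0.0 I demand that you introduce yourself first\n",
  "221 2.0.0 closing connection\n"
])
puts smtp_health_check(conn)
```

The check stops at the first reply that does not match, just like the expect lines in the script above.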

Tuesday Jun 24, 2008

EU wants to control bloggers

I just read an article which scared the hell out of me! There is an EU proposal to require a government controlled registry of blogs. No more anonymous blogging!

"there is a need to clarify their status, and to create legal safeguards for use in the event of lawsuits as well as to establish a right to reply" - in other words each blog needs to have a publisher, just like a newspaper, and this will require you to register with the local authorities. No more whistle blowing through blogs...

I wonder how they plan to enforce this? Especially in a case like mine where the server resides outside the EU.

Thursday May 08, 2008

The decay of the Swedish model

I just read a good blog entry about the decay of the Swedish model which touched on many subjects discussed during dinner today.

People here (in Sweden) seem to think that someone else will take care of it. It is not my problem, but someone ought to do something about it. Why isn't the government doing anything about it?

Why not do it yourself?

Creating a user_attr puppet type

I've come a fair way in my puppet testing now, but one thing I lack is a user_attr type, i.e. a way to update the /etc/user_attr file using puppet.

This is what I have in mind for the syntax:

user_attr { "martin":
    type => normal,
    roles => [
    profiles => "Zone Management",
    auths => [

One thing I haven't figured out yet is whether the definitions should be absolute, i.e. if the entry must be exactly like the definition, or if it is enough that the listed values are present. In the above example, should the role list be exactly root,admin, or should it just make sure that those two roles are in the list, so that you can have the role audit too? Perhaps it would be good to be able to use the absent/present syntax on individual items?
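
The difference between the two interpretations is easy to show in a few lines of Ruby. This is only a sketch of the two comparison rules, not code from an actual puppet type, and the method names exact_match? and includes_all? are invented for the example.

```ruby
# Absolute: the entry must contain exactly the declared roles, no more.
def exact_match?(current, declared)
  current.sort == declared.sort
end

# Inclusive: the declared roles must be present; extras are tolerated.
def includes_all?(current, declared)
  (declared - current).empty?
end

current  = %w[root admin audit] # roles currently in /etc/user_attr
declared = %w[root admin]       # roles listed in the manifest

puts exact_match?(current, declared)
puts includes_all?(current, declared)
```

With the absolute rule the extra audit role counts as a deviation that puppet would have to remove, while with the inclusive rule the entry is already in the desired state.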

I haven't decided if I'm going to manage the other user attributes too, e.g. project, defaultpriv, limitpriv and lock_after_retries. I will probably leave that for a later release...


Friday Apr 18, 2008

Testing puppet configurations

I've set up a puppet environment which uses mercurial to store the configuration and manifests. Now I'm trying to build an environment to be able to test changes before I commit them to the repository, and they propagate to all our 400 servers - but I encountered a problem.

You can use a separate configuration directory with the --confdir option for both puppetd and puppetmasterd, and run everything on localhost, but the problem is the source parameter:

file { "/etc/profile":
    owner => root,
    group => root,
    mode => 644,
    source => "puppet://server/base/profile"
}

The above source parameter contains the hostname, so when I want to test it on my local mercurial repository, it still connects to the server instead of localhost when it fetches the files.

Luckily there is a solution! If you leave out the server part, i.e. write the source as puppet:///base/profile, puppetd will insert the name of the server it is connecting to.

Tuesday Apr 08, 2008

Trying out puppet

I'm looking for ways to better manage our servers, and right now I'm playing with puppet.

I immediately ran into a problem: it picked the wrong domain name. Internally at Sun we use NIS (yes, I know it is insecure and sucks in almost all aspects, but I'm not in a position to change it - and believe me I have tried) and our NIS domain name doesn't match the DNS domain name.

This is something puppet (facter to be exact) doesn't figure out, at least not on Solaris. Instead of picking the correct fqdn for a host, e.g., it picks, since that is what the domainname command returns.

They tried to fix this, but unfortunately it doesn't work for Solaris, as it relies on the dnsdomainname command, which we don't have.

I've worked around it by creating my own /usr/bin/dnsdomainname which gets called before domainname.

DOMAIN="`/usr/bin/domainname 2> /dev/null`"
if [ ! -z "$DOMAIN" ]; then
    # Strip the first label, leaving the DNS domain
    echo "$DOMAIN" | sed 's/^[^.]*\.//'
fi

So now I can continue to test my puppet configurations...

Monday Apr 07, 2008

Shopping list

I'm headed to California and Menlo Park on Friday, and my wife has as usual given me a shopping list :)

With the dollar as low as it is, I'm going to do some shopping myself. I'm going to buy a Time Capsule, not that I really need an extra 1 TB disk, but the rest of my family (whom I've converted to Mac) never remember to turn on the external disk I've attached to their computer - so Time Machine is useless!

And for myself, I'm going to get two 1 TB disks for my Drobo, which is 97% full at the moment. I'm squirreling away too much, but it is hard to throw away stuff... I've even got things stashed away on other external disks, but that data isn't mirrored which I don't like.

Since I got a digital video camera, I never seem to have enough disk space. I can't wait until our house is built and I can set up my U40 as a file server - 8 * 1 TB should last at least until the end of 2008 ;)



