Pass the buck (or a message) using motd & EEPROM...

Another nifty challenge on this project was juggling resources. We had about 8 machines in our development lab and about 30 engineers working on them at any given time. One machine was reserved as the Jumpstart/JET server, one as the N1SPS/N1SM server, and two for cluster development. One machine was mostly dedicated to "golden flar image" development, leaving three machines for everyone else to scramble for as target systems for deployment and functional/unit testing.

This qualifies as the hack of the day for me. We could all send emails around, but the volume and conflicting needs would be overwhelming. We could (and did) maintain a spreadsheet and whiteboard of who had what assets reserved for what periods of time. We could (and did) keep the assets in the project plan docs, though "issues" and "need a machine for debugging this new error that just popped up" often make those things out of date before they can be sent to the printer.

We ended up assigning a "group owner" to the machines as they were allocated. So the "Solaris" team might own a machine for a couple hours or a couple days. That person was responsible for knowing who was working on the machine, the state of the machine, and what tasks were to be completed before giving the machine back to the pool for reassignment.

Simple things like adding/deleting/cloning zones, messing with ndd settings, or playing with secondary network adapters to configure IPMP aren't conflicts. Re-deploying or JET installing a machine that people are counting on, however, could set teams back several hours or days. How can you pass a message without having people run down the hall and pop into the work rooms one at a time screaming "who just rebooted my machine, what's happening?"

You can place a message in the /etc/motd file (yeah, it is old school, but it works) letting everyone know what is happening on the machine. The "group owner" of the machine at any given time is responsible for placing notes there to keep incoming users informed as to what activities and instabilities might exist on the machine, as well as the contact info for the machine owner.

somehost# cat /etc/motd

Machine:  somehost
IP:       10.68.75.9
Owner:    Bill Walker (703.555.1234)
Purpose:  JET/flar work

Current State:  
        Al:  Doing ndd settings testing
        Jim:  Working on Packaging tools
        Bill:  Working on Issues 118 and 181, SSH key mismatch issue
               Working on Issue 58, root homedir changes halfway
                   through provisioning process.
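
If you get tired of hand-editing the file, the layout above is easy to generate. This is only a rough sketch, assuming a POSIX-style shell; render_motd is a made-up helper name, not something that ships with Solaris, and the status lines are simply whatever the group owner feeds it on standard input.

```shell
#!/bin/sh
# Hypothetical helper (not a Solaris tool): print a motd in the layout
# shown above.
# Usage: render_motd HOST IP OWNER PURPOSE < status-lines
render_motd() {
    printf 'Machine:  %s\n' "$1"
    printf 'IP:       %s\n' "$2"
    printf 'Owner:    %s\n' "$3"
    printf 'Purpose:  %s\n' "$4"
    printf '\nCurrent State:\n'
    # Indent each status line read from standard input.
    while IFS= read -r line; do
        printf '        %s\n' "$line"
    done
}
```

The group owner could then run something like render_motd somehost 10.68.75.9 "Bill Walker (703.555.1234)" "JET/flar work" < status.txt > /etc/motd as root, where status.txt is an assumed scratch file with one line per person.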

The other nifty item is the EEPROM's oem-banner variable. If the /etc/motd can inform folks coming in as to what is being done and by whom, the EEPROM oem-banner can alert folks who might be coming in through the ILOM and trying to re-provision or re-JET the machine out from under you. The last thing you want after working on a machine for 3-4 hours is to see "Connection closed", and find out that someone just typed "boot net - install" at the ok> prompt.

Historically, the OBP's oem-banner variables were created so that hardware manufacturers could "relabel" machines and resell them. Replacing the skin of the machine and putting a new logo on it was easy, but the console power-on message at POST time had a banner identifying the machine model and manufacturer, and on graphic (frame buffer) equipped systems, a graphic image depending on the machine and frame buffer type. The oem-banner and oem-logo variables allow machines to display different banners and graphic logos, essentially hiding the manufacturer of the machine for OEMs. We are going to appropriate these data elements for our own use in this case.

somehost# eeprom oem-banner="STOP!!!  System Group box for flar development
somehost> call before re-installing!  Bill Walker (703.555.1234)"
somehost# eeprom "oem-banner?=true"
somehost#
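
Since the owner changes every few hours or days, the two eeprom calls above can be wrapped so the banner gets rewritten at each handoff. Treat this as a sketch under assumptions: set_owner_banner and the EEPROM_CMD override are names I made up for illustration, and the override exists only so you can dry-run the function with echo on a box where you don't want to touch the real EEPROM.

```shell
#!/bin/sh
# Hypothetical wrapper around the two eeprom calls shown above.
# EEPROM_CMD is an assumed override for dry runs (e.g. EEPROM_CMD=echo);
# when it is unset, the real eeprom command is used.
# Usage: set_owner_banner PURPOSE OWNER_CONTACT
set_owner_banner() {
    # Build the two-line banner; single quotes keep the "!" characters literal.
    banner=$(printf 'STOP!!!  %s\ncall before re-installing!  %s' "$1" "$2")
    # Store the text, then turn the custom banner on.
    ${EEPROM_CMD:-eeprom} "oem-banner=$banner" &&
    ${EEPROM_CMD:-eeprom} 'oem-banner?=true'
}
```

With EEPROM_CMD=echo set, the function just prints the arguments it would have handed to eeprom. On a real machine, running eeprom oem-banner with no value afterwards should show what was stored.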

Now when someone goes to the console to re-install the box, they will see:

{0} ok 
{0} ok boot net - install



STOP!!!  System Group box for flar development
call before re-installing!  Bill Walker (703.555.1234)




Boot device: /pci@0/pci@0/pci@1/pci@0/pci@2/network@0  File and args: - install
1000 Mbps full duplex  Link up
Requesting Internet Address for 0:14:4f:d3:9a:0
Requesting Internet Address for 0:14:4f:d3:9a:0

Since the "Requesting Internet Address" stuff takes several minutes, they should have plenty of time to go back to the ILOM, issue a "break -y", and make the call before the disks get scribbled over.

Your mileage may vary, but I thought this was a good idea. :)


bill.

