Saturday Oct 03, 2009

A Dashboard Like No Other: The OpenDS Weather Station



With all the benchmarking, profiling and other performance-related work I do, I had to find a way to "keep an eye" on things while fetching emails, chatting on IM and the like. Having worked with microcontrollers on past projects, albeit on Windows, I figured I could put together a little gizmo to help me keep tabs on my Directory Server.

Bird's Eye View

This is basically a simple setup with a USB Bit Whacker controlled by a Python script, feeding it data crunched from various sources, mainly the Directory Server access log, the garbage collection log and kstats... the result is a useful dashboard where I can see things happen at a glance.

The Meat

Everything starts with the USB Bit Whacker. It's a long story, but to cut it short, a couple of years ago, Kohsuke Kawaguchi put together an orb that could be used to monitor the status of a build / unit tests in Hudson. Such devices are also known as eXtreme Feedback Devices or XFDs. Kohsuke chose the USB Bit Whacker (UBW) because it is a USB-aware microcontroller that also draws power from the bus, which makes it very versatile while remaining affordable ($25 soldered and tested from SparkFun, but you can easily assemble your own). A quick search will tell you that this is a widely popular platform for hobbyists.

On the software side, going all Java would have been quite easy except for the part where you need platform-specific libraries for serial communication. Sun's javacomm library and rxtx both have pros and cons, but in my case the cons were just too much of a hindrance. What's more, I am not one to inflict pain on myself unless it is absolutely necessary. For that reason, I chose to go with Python. While apparently not as strong as Java on cross-platform support, installing the Python libraries for serial communication with the UBW is trivial and has worked for me right off the bat on every platform I have tried, namely Mac OS, Linux and Solaris. For example, on OpenSolaris all there is to it is:

 $ pfexec easy_install-2.4 pySerial
Searching for pySerial
Best match: pyserial 2.4
Processing pyserial-2.4.tar.gz
Running pyserial-2.4/ -q bdist_egg --dist-dir /tmp/easy_install-Y8iJv9/pyserial-2.4/egg-dist-tmp-WYKpjg
zip_safe flag not set; analyzing archive contents...
Adding pyserial 2.4 to easy-install.pth file

Installed /usr/lib/python2.4/site-packages/pyserial-2.4-py2.4.egg
Processing dependencies for pySerial
Finished processing dependencies for pySerial

That's it! Of course, having easy_install is a prerequisite. If you don't have it, simply install setuptools for your Python distro, which is only about 400kB. You'll be glad you have it anyway.

Then, communicating with the UBW is mind-bogglingly easy. But let's not get ahead of ourselves; first things first:

Plugging In The USB Bit Whacker On OpenSolaris For The First Time

The controller will appear as a modem of the old days, and communicating with it equates to sending AT commands. For those of you who are used to accessing load balancers or other network equipment through the serial port, this is no big deal.

In the screenshot below, the first ls command output shows that nothing in /dev/term is an actual link. The second, which I issued after plugging the UBW into the USB port, shows that a new '0' link has been created by the operating system.

Remember which link your UBW appeared as for our next step: talking to the board.

Your First Python Script To Talk To The UBW

I will show below how to send the UBW the 'V' command which instructs it to return the firmware version, and we'll see how to grab the return value and display it. Once you have that down, the sky is the limit. Here is how:

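from serial import Serial

# Open the device link that appeared when the board was plugged in
ubw = Serial("/dev/term/0")
print "Requesting UBW Firmware Version"
# Send the 'V' command; the board replies with its firmware version
ubw.write("V\n")
print "Result=[" + ubw.readline().strip() + "]\n"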

Below is the output for my board:


That really is all there is to it: you are now one step away from your dream device. And it really is only a matter of imagination. Check out the documentation of the current firmware to see what commands the board supports and you will realize all the neat things you can use it for: driving LEDs, servos, LCD displays, acquiring data, ...
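To make the send-and-read pattern reusable, here is a minimal sketch of a small helper that sends one command and returns the first line of the reply. It is written for Python 3 and a recent pySerial rather than the Python 2.4 setup used above. The names ubw_command and read_firmware_version, the newline terminator and the one-second timeout are assumptions of this sketch, not something taken from the firmware documentation.

```python
def ubw_command(port, command, terminator=b"\n"):
    """Send one command to the board and return the first reply line as text.

    `port` is any object with write() and readline(), such as an open
    serial.Serial instance. The terminator is an assumption of this sketch.
    """
    port.write(command.encode("ascii") + terminator)
    reply = port.readline()  # pySerial returns b"" if the read times out
    return reply.decode("ascii", errors="replace").strip()


def read_firmware_version(device="/dev/term/0"):
    """Open the device link, ask for the firmware version and return it."""
    import serial  # pySerial, imported here so ubw_command has no dependency

    with serial.Serial(device, timeout=1) as ubw:
        return ubw_command(ubw, "V")
```

Because ubw_command only needs write() and readline(), it can be exercised against a stand-in object when no board is plugged in. With a board attached, read_firmware_version() should return the same string the script above prints.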

Concrete Example: The OpenDS Weather Station

As I said at the beginning of this post, my initial goal was to craft a monitoring device for OpenDS. Now you have a good idea of how I dealt with the hardware part, but an image is worth a thousand words so here is a snap...

On the software front, well, being a software engineer by trade, that was the easy part, so it's almost not fun and I won't go into as much detail, but here is a 10,000ft view:

  • data is collected in a matrix of hash tables
  • each hash table represents a population of data points for a sampling period
  • a dedicated timer thread pushes a fresh list of hash tables into the matrix so as to reset the counters for a new sampling period
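The three points above can be sketched roughly as follows. This is only an illustration under assumptions: the class name SampleMatrix, the helper start_rotation, the metric names and the number of periods kept are all made up for the example, and the actual script may be organized differently.

```python
import threading
from collections import Counter, deque


class SampleMatrix:
    """One row per sampling period; each row maps a metric name to a Counter."""

    def __init__(self, metrics, periods=60):
        self._metrics = tuple(metrics)
        self._lock = threading.Lock()
        self._rows = deque(maxlen=periods)  # oldest periods fall off the front
        self.rotate()

    def rotate(self):
        """Push a fresh row of empty hash tables, starting a new period."""
        with self._lock:
            self._rows.append({name: Counter() for name in self._metrics})

    def record(self, metric, value, count=1):
        """Count one more occurrence of `value` in the current period."""
        with self._lock:
            self._rows[-1][metric][value] += count

    def current(self):
        """Return a copy of the current period's hash tables."""
        with self._lock:
            return {name: Counter(c) for name, c in self._rows[-1].items()}


def start_rotation(matrix, interval_seconds, stop_event):
    """Call matrix.rotate() every interval_seconds until stop_event is set."""
    def loop():
        while not stop_event.wait(interval_seconds):
            matrix.rotate()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

The lock matters because the collectors call record() while the timer thread calls rotate(); without it, a data point could land in a row that is being swapped out.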

So for example, if we want to track CPU utilization, we only need to keep one metric. The hash table will only have one key/value pair. Easy. Slightly overkill, but easy. Now if you want to keep track of transaction response times, the hash table will keep the response time (in ms) as a key and the number of transactions that were processed in that particular response time as the associated value. Therefore, if within one sampling period you have 10,000 operations processed, with 6,000 in 0 ms, 3,999 in 1 ms and 1 in 15 ms, your hash table will only have 3 entries, as follows: [ 0 => 6000; 1 => 3999; 15 => 1 ]

This allows for a dramatic compression of the data compared to having a single line with etime for each operation, which would result in 10,000 lines of about 100 bytes.
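As an illustration of that compression, here is a minimal sketch that folds access log lines into such a hash table. It assumes each result line carries an etime=N field in milliseconds; the function name etime_histogram and the sample lines are made up for the example and are not taken from a real log.

```python
import re
from collections import Counter

# Assumption: the response time appears as "etime=<milliseconds>" on the line.
ETIME = re.compile(r"\betime=(\d+)\b")


def etime_histogram(lines):
    """Map each response time (ms) to the number of operations that took it."""
    histogram = Counter()
    for line in lines:
        match = ETIME.search(line)
        if match:  # lines without an etime field are simply skipped
            histogram[int(match.group(1))] += 1
    return histogram


# Hypothetical log lines, for illustration only
sample = [
    "SEARCH RES conn=1 op=1 result=0 nentries=1 etime=0",
    "SEARCH RES conn=1 op=2 result=0 nentries=1 etime=1",
    "SEARCH REQ conn=1 op=3 base=dc=example,dc=com",
    "SEARCH RES conn=1 op=3 result=0 nentries=1 etime=0",
]
print(dict(etime_histogram(sample)))
```

However many lines go in, the table only grows with the number of distinct response times, which is what keeps each sampling period small.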

What's more, this representation of the same information makes it easy to compute the average, extract the maximum value and calculate the standard deviation.
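Here is a minimal sketch of those three computations done directly on the hash table, using the example figures from above. The function name histogram_stats is made up for the example, and the standard deviation is the population one (dividing by the number of operations), which is an assumption about what is wanted here.

```python
import math


def histogram_stats(histogram):
    """Compute count, mean, max and population standard deviation.

    `histogram` maps a value (e.g. response time in ms) to its count.
    """
    total = sum(histogram.values())
    if total == 0:
        return {"count": 0, "mean": 0.0, "max": None, "stddev": 0.0}
    mean = sum(value * n for value, n in histogram.items()) / total
    variance = sum(n * (value - mean) ** 2 for value, n in histogram.items()) / total
    return {
        "count": total,
        "mean": mean,
        "max": max(histogram),  # largest key, i.e. the slowest response time
        "stddev": math.sqrt(variance),
    }


stats = histogram_stats({0: 6000, 1: 3999, 15: 1})
print(stats["count"], stats["max"], round(stats["mean"], 4))  # 10000 15 0.4014
```

Every sum runs over the 3 entries of the table rather than the 10,000 operations, so the cost of the statistics depends on how many distinct response times were seen, not on the traffic.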

All that said, the weather station is only sent the last of the samples, so it always shows the current state of the server. And as it turns out, it is very useful, I like it very much just the way it worked out.

 Well, I'm glad to close down the shop, it's 7:30pm .... another busy Saturday

Friday Oct 02, 2009

Note To Self: Things To Do On A Vanilla System


I just thought I'd make a note of the common things I do and, funnily enough, I think this blog might be the closest thing I have to a sticky note / persistent backup ... so here goes:

# enable power management
# (the redirect must run with privileges too, hence tee rather than >>)
echo "S3-support    enable" | pfexec tee -a /etc/power.conf > /dev/null
pfexec pmconfig
pfexec svcadm restart hal

# disable access time update on rpool to minimize disk writes
pfexec zfs set atime=off rpool

# get pkgutil to install community software
pfexec pkgadd -d`uname  -p`.pkg

# Add repositories to the package manager

pfexec pkg set-publisher -O dev
pfexec pkg set-publisher -O contrib
pfexec pkg set-publisher -O blastwave
pfexec pkg set-publisher -O sunfreeware

# download and install the flash plug-in for firefox
wget -O libfp.tar.bz2 --no-check-certificate
bunzip2 libfp.tar.bz2
tar xf libfp.tar
pfexec mv flash_player*/ /usr/lib/firefox/plugins
rm libfp.tar
rmdir flash_player*

# get perfbar
wget -O perfbar --no-check-certificate
chmod 755 perfbar
nohup ./perfbar &

# configure coreadm
coreadm -g /var/cores/%t-%f -e global

Quad Monitor With Rotation: Where There Is A Will, There Is A Way


Talking with a friend recently, he told me about his miserable experience trying to get his workstation to work with four monitors.

Now, I was surprised at first because there are lots (ok, maybe not lots, but a sizeable number) of people with quad-head workstations out there, so obviously that seems rather doable. The trick in his case seemed to be heterogeneity: 2 different dual-head cards, and 4 different monitors of different brands and sizes. Additionally, he wanted one of his widescreens tilted in portrait mode for his coding. Nice for browsing as well, but he wanted to be able to have a tall IDE to see more code at once without the need to scroll.

It took me a while not just to get the equipment but also to find some spare time to do this. I ended up with the following:

  1. a desktop that would lend itself to the experiment
  2. 4 dual-head video cards to test combinations
    1. nVidia GTX 280
    2. nVidia Quadro FX 380
    3. nVidia GeForce 9600 GT
    4. nVidia GTS 250
  3. 4 monitors
    1. Sun 24.1"
    2. Dell 22"
    3. Acer 24.3"
    4. Dell 20"
  4. a free Saturday (that was actually the most difficult component to find)

To cut it short, the result is ... drum roll ... it _can_ work once you know what to do and what not to. Here is the final result:

So how do we make that work? Well, the first thing is NOT to desperately cling to TwinView. You have to let go of that and fall back on good ol' Xinerama, which does a fine job anyway.

As I said in my previous post, rotating the monitor is only a matter of adding Option "Rotate" "left" in the relevant screen section.

For all the X options explained, I found this quite useful. Dig in there.

What you want to be careful about:

  • if at first both cards are not recognized, worry not. Go to a terminal and issue the following command:

pfexec nvidia-xconfig -a

This will force the nvidia config utility to look across all cards.

Note that if this still doesn't work, issue:

pfexec scanpci

and write down the PCI id for each card. It is the first number right after the pci bus 0x002. In this example, this would translate into

BusID "PCI:2:0:0"

in the device section in xorg.conf

  • look at your /var/log/Xorg.0.log for errors
    • you will see something like

(II) LoadModule: "xtsol"
(WW) Warning, couldn't open module xtsol
(II) UnloadModule: "xtsol"
(II) Failed to load module "xtsol" (module does not exist, 0)

Don't worry, that's a Trusted Solaris extension that is hardcoded to be loaded by X even when the OS running is not Trusted Solaris. This has yet to be fixed.

  • make sure to enable Composite
  • make sure to enable GLX with composite
  • make sure to enable RandRRotation
  • Check /var/adm/messages for IRQ collisions which could result in some funky discrepancies. If you find any, tweak your BIOS to force each PCI slot to a distinct IRQ. The message would look similar to:

unix: [ID 954099] NOTICE: IRQ16 is being shared by drivers with different interrupt levels
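For the BusID step in the checklist above, the catch is that scanpci prints hexadecimal numbers while the BusID entry in xorg.conf expects decimal ones. Here is a minimal sketch of the conversion. It assumes scanpci lines of the form "pci bus 0x.... cardnum 0x.. function 0x..:", which may differ on your system; the function name and the sample line are made up for the example.

```python
import re

# Assumed scanpci line format; adjust the pattern if your output differs.
SCANPCI_LINE = re.compile(
    r"pci bus (0x[0-9a-fA-F]+) cardnum (0x[0-9a-fA-F]+) function (0x[0-9a-fA-F]+)"
)


def bus_id_from_scanpci(line):
    """Turn one scanpci header line into an xorg.conf BusID value, or None."""
    match = SCANPCI_LINE.search(line)
    if match is None:
        return None
    # scanpci prints hexadecimal; BusID wants decimal bus:device:function
    bus, card, function = (int(field, 16) for field in match.groups())
    return "PCI:%d:%d:%d" % (bus, card, function)


# Hypothetical scanpci line, for illustration only
line = "pci bus 0x0002 cardnum 0x00 function 0x00: vendor 0x10de device 0x05e1"
print(bus_id_from_scanpci(line))  # PCI:2:0:0
```

The difference only shows once a number reaches 0x0a or above: a card on bus 0x0a needs BusID "PCI:10:0:0", not "PCI:0a:0:0".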

All that said, here is an example of xorg.conf with a single monitor tilted, and everything working pretty well considering that nothing is matched. It does work, but it doesn't come for free, as you can see. There is one drawback, however: I have not been able to make Compiz work, apparently because the cards would need an SLI link between them, but I haven't confirmed that. That's it for today folks!

OpenSolaris 2010.02 on EeePC 1000: Out-Of-The-Box!!!


As usual, I try to give as much away in my titles as I can. This one is no different: it just works....

With 2009.06, you needed to build your own drivers for ethernet and wifi. Pretty much a non-starter for 99% of users, understandably so: when it just works for Linux and Windows, why sweat it on OpenSolaris?

Now that dilemma is behind us: I installed an early access of 2010.02 (OpenSolaris b124) and when the installation was done, everything worked: a whole new experience for me on OpenSolaris. I almost EXPECT to have to fiddle with a driver, a config file, an SMF service that doesn't start, ..., something.

In this case: nothing! Simultaneously gratifying and almost disappointing. I mean, even on my desktop OpenSolaris required some elbow grease to work the way I wanted, but in this case, the coveted prize of a functional system would be handed to me without even the hint of a fight? ... unusual, to say the least.

And that's good. I used to say that Solaris is certainly the best server OS and just as certainly the worst desktop OS, but this one shot has me wondering... maybe the Sun engineers have covered some of the ground that separates OpenSolaris from Linux. Granted, there's still a way to go! Yes, the embedded 1.3-megapixel webcam works, but the picture quality could be better and I don't think it is the hardware... To be fair, Sun has to write their own drivers for everything, so I'm surprised it worked at all. So that's pretty good!

Now there is one rather big bummer: it does suspend but doesn't resume. That's a pretty big issue for a laptop which is, because of its form factor, bound to be used on the go. If I can make it work, I will post here. If you have had success making resume work, drop me a line!

Lenovo W700ds dual monitor laptop: works! Another 2010.02 success


OpenSolaris 2010.02 early access build 124 is really faring pretty well so far. It isn't free of issues, granted, but at the same time, it has improved leaps and bounds on laptop support, especially for netbooks, thanks to a passionate and dedicated team writing up a bunch of device drivers for wifi and network cards found in these little laptops.

Today, I installed build 124 on a Lenovo W700ds.

You have probably never heard of that beast because they probably only sold half a dozen of them, one of which landed on my desk yesterday. The main reason for this success is probably that it weighs a ton (11 lbs or 5 kg!!!) due in part to its main 17" monitor, doubled by a netbook-like 10" monitor that slides out from behind the main one.... here are the specs. Notice they call it "portable power". Transportable would be more accurate. After using this laptop for about an hour now (I'm writing this post on it), I do have to say that it is quite fantastically comfortable, just about as much as a desktop would be... not really surprising if you consider it has a full-size keyboard + numeric keypad.

So, OpenSolaris installs without a glitch; once again the installer just does its job without whining. If you run the device driver utility, it will notify you that two devices do not have a driver for Solaris, one being the integrated Bluetooth card and the other being the fingerprint reader. Not a big deal. Once OpenSolaris is installed, it will boot into GNOME just as on any other machine, but what you really want is the second monitor to work... and there's a trick to that.

 First, the second monitor won't be recognized if you don't pull it all the way out at boot time. Took me a while to figure this one out. To save some mW, the Lenovo folks don't power it unless it's out and that makes it undetectable at first.

Second, once recognized by X, it will actually display sideways. This "companion" display is actually a widescreen 10" netbook display tilted right so that its width resolution (1280x768) almost matches the height resolution of the main display (1920x1200). So all we have to do is tilt it "left" to compensate for the hardware arrangement. To do so, simply enable the Rotate and Resize option on the graphics card and then tell X to rotate the appropriate screen left. Here's how:

Section "Monitor"
    # HorizSync source: edid, VertRefresh source: edid
    Identifier     "SlideOut"
    VendorName     "Lenovo"
    ModelName      "LEN 2nd Display"
    HorizSync       30.0 - 75.0
    VertRefresh     60.0
    Option         "DPMS"
    Option         "Rotate" "left"
EndSection

Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "Quadro FX 2700M"
    BusID          "PCI:1:0:0"
    Screen          1
    Option         "RandRRotation" "on"
EndSection

Note that TwinView must be disabled because TwinView aggregates both displays into a single block. Rotation with TwinView on will rotate both displays. So you need to make them two separate X screens and enable Xinerama.

Here is the final xorg.conf in case you're interested...

Additional notes:

Suspend/Resume works great with this laptop most of the time. However, it seems that sometimes you will lose the second display upon resume. I'm not sure why.


Directory Services Tutorials, Utilities, Tips and Tricks

