Sunday Mar 15, 2009

One and a Half Million Seconds Off the Grid

My previous post described an initial experiment on collecting data from a solar-powered Sun SPOT. That experiment helped uncover and fix several bugs that caused disruptions in data collection. I repeated the experiment last month using the red-090113 release of the Sun SPOT SDK to test those fixes.

This time around, the SPOT stayed off the power grid from Feb 12 until Mar 3 -- slightly over 19 days (that's more than 1.6 million seconds) -- reporting light, temperature and other readings every 10 minutes. Unlike last time, there were no occasions when the SPOT failed to enter deep sleep. There was just one disruption, caused by a MySQL table crash (I'm still not sure what triggered it), but the SPOT itself stayed up throughout. Its reported uptime, as well as the time spent in deep and shallow sleep, kept rising at a steady clip throughout the experiment (see plot below). Overall, the SPOT spent nearly 95% of the total time in deep sleep, around 4.5% in shallow sleep and about 0.5% in active computation. The large gap from late Feb 22 to noon Feb 23 marks the disruption due to the table crash.

It rained frequently during this experiment (click here for the local weather and look in the "Events" column under "Observations"). The SPOT did not have any special weather-proof casing, so I kept it indoors -- by the window in my office -- for the most part and only occasionally took it outdoors when it was dry. Direct sunlight recharged the SPOT's battery more efficiently. The second plot shows variations in the USB voltage supplied by the solar panel (in light blue), the output voltage measured at the SPOT's battery (in dark blue) and the estimated remaining battery level (in red). Spikes in the light blue plot indicate periods when the solar panel circuitry charged the SPOT's battery and caused an increase in its remaining capacity (red).

The normal discharge rate for the SPOT's built-in battery (rated at 700mAh) was nearly 12% per day and a full charge would have lasted about 8 days. However, the solar panel was able to replenish about 15-18% of the battery whenever it was sunny and I suspect that this figure could be bumped up to 25-30% with some simple optimization of its orientation.
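The arithmetic behind those figures can be sketched in a few lines of Python. This is only a back-of-the-envelope model, not something from the experiment's code: it assumes the 12% drain applies every day, that the quoted recharge figure is a per-sunny-day gain, and the function names are made up for illustration.

```python
# Back-of-the-envelope battery budget using the figures quoted in the post.
# Assumptions (not from the original experiment code): drain is a constant
# 12% of capacity per day, and the recharge figure is gained once per sunny day.

DISCHARGE_PCT_PER_DAY = 12.0  # "nearly 12% per day"


def days_on_full_charge(discharge_pct_per_day: float = DISCHARGE_PCT_PER_DAY) -> float:
    """Days a full battery lasts with no recharging at all."""
    return 100.0 / discharge_pct_per_day


def break_even_sunny_fraction(
    recharge_pct_per_sunny_day: float,
    discharge_pct_per_day: float = DISCHARGE_PCT_PER_DAY,
) -> float:
    """Fraction of days that must be sunny for recharge to offset the drain."""
    return discharge_pct_per_day / recharge_pct_per_sunny_day


if __name__ == "__main__":
    print(f"Full charge lasts about {days_on_full_charge():.1f} days")
    for recharge in (15.0, 18.0, 25.0, 30.0):
        frac = break_even_sunny_fraction(recharge)
        print(f"{recharge:.0f}% per sunny day -> break even if {frac:.0%} of days are sunny")
```

Under these assumptions, a 15% gain per sunny day only breaks even if about 80% of days are sunny, whereas a 30% gain would need roughly 40%. That is why the panel's orientation matters so much in a rainy month.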

The next plots show changes in light and temperature readings as measured by the built-in sensors on the eDemoBoard. The significant dip in temperature on the early morning of Feb 20 marks the only night the SPOT was left outdoors.

We collected 2559 samples during the experiment when we should have collected 2747; 82 were lost due to the table crash and the rest (106, or about 4% of the total) can be attributed to the unreliable nature of the SPOT's UDP-like radiogram communication mechanism. The distribution of lost samples is shown below and does not include losses due to the table crash. It would be interesting to study whether packet loss is correlated with humidity.
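For anyone who wants to check the sample accounting, here is a small Python sketch. The counts are the ones quoted above; the function name and the returned keys are illustrative only.

```python
# Sample accounting for the experiment, using the counts quoted in the post.
# The function name and dictionary keys are illustrative, not from the
# original data-collection code.

SAMPLE_INTERVAL_MIN = 10  # one report every 10 minutes


def account_for_losses(expected: int, collected: int, lost_to_crash: int) -> dict:
    """Split the missing samples into database-crash losses and radio losses."""
    lost_total = expected - collected
    lost_to_radio = lost_total - lost_to_crash
    return {
        "lost_total": lost_total,
        "lost_to_radio": lost_to_radio,
        "radio_loss_pct": 100.0 * lost_to_radio / expected,
        # Duration implied by the expected sample count at the fixed interval.
        "duration_days": expected * SAMPLE_INTERVAL_MIN / (60 * 24),
    }


if __name__ == "__main__":
    stats = account_for_losses(expected=2747, collected=2559, lost_to_crash=82)
    for key, value in stats.items():
        print(f"{key}: {value:.2f}" if isinstance(value, float) else f"{key}: {value}")
```

The numbers are self-consistent: 2747 expected samples at one every 10 minutes corresponds to a little over 19 days, and 188 missing samples split into 82 from the crash and 106 from the radio.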

The SPOT still had about 10% of its battery remaining when the experiment ended. So what terminated the experiment? Accidental exposure to Dihydrogen Monoxide aka water! On the evening of Mar 3, the SPOT was outdoors when it started to drizzle. I was attending a meeting in a windowless conference room at that time. By the time I became aware of the change in weather outside, it was too late. The rain had caused the SPOT to stop working (see picture below).

Picture of a SPOT with rain drops.

I was able to revive the main board subsequently simply by letting it dry out. The eDemoBoard, however, appears to have been damaged. Nevertheless, it was gratifying to note that it was human error, rather than a software or hardware bug, that brought the experiment to an end this time around.

NOTE: The plots above were generated using Gnuplot. An interactive, and way cooler, version (based on Simile timeplot from MIT) can be accessed here. One can look at individual sample readings by moving the mouse over these "live" plots.

Saturday Feb 28, 2009

Experiments with a Solar-powered Sun SPOT

Environmental monitoring is proving to be a popular application for Sun SPOTs (see here, here and here). This and other similar applications require a Sun SPOT device to operate for long periods (months) using a combination of renewable energy sources (e.g. a solar panel) and duty cycling -- having the device wake up only occasionally to record and/or transmit sensor readings and sleeping for the most part.

A few months ago, I conducted an experiment that collected sensor readings from a solar-powered SPOT into a MySQL database for almost four weeks. A write-up describing the results is now available as a Sun Labs Technical Report and featured in this week's spotlight on the Labs' home page.

This experiment helped us uncover and fix several issues that caused disruptions in data collection -- the occasional inability of the device to enter deep sleep, the resulting clock reset due to premature battery exhaustion, and loss of connectivity to the database after long periods of inactivity. The report offers important lessons in the design of sensor data collection frameworks and lists both recommended best practices and potential pitfalls to avoid.

As I type this, another solar-powered SPOT running a new version of our software is collecting and reporting sensor readings. It has already been up for more than two weeks without any of the disruptions we saw previously, leading me to believe that the fixes we incorporated in response to lessons learnt are working well. Watch this space for a follow-on post describing the latest experiment.

Friday Dec 19, 2008

Epidemic Code Deployment on Sun SPOTs: Take One

Today, the only way to deploy the same code to multiple SPOTs is one at a time -- the user "connects" each SPOT to a host computer containing the application (either via USB or via a radiostream over-the-air) and invokes ant deploy. Epidemic code deployment is a more efficient mechanism that lets code propagate from SPOT to SPOT. Imagine several dozen SPOTs, spread over a large area in a battlefield, in need of a new software update. With epidemic code deployment, one could upgrade the software on a single SPOT (potentially, in a more peaceful, safer locale), air-drop it, and, after some time, "automagically" have the new code propagated to other SPOTs.

This sort of functionality is available in TinyOS but certain aspects of the SPOT software rule out a straight port of that code. Unlike more constrained devices, Sun SPOTs support multiple application slots and use digital signatures (based on elliptic curve public-key cryptography) to guarantee code authenticity. The private key used for code signing never leaves the host computer. The SPOTs only store the corresponding public key used for signature verification so, even after compromising a SPOT, an attacker does not gain the ability to run malicious code on other SPOTs. We'd like to retain these advantages while adding support for epidemic deployment.

In our existing suite deployment process, the host computer first queries the SPOT for an empty application slot and, based on the response, remaps all pointers in the suite. This remapping (also called suite relocation) is slot-specific -- a relocated suite cannot be sent from SPOT-to-SPOT unless the destination SPOT plans to put it in the same slot as the sender. So the first step towards supporting epidemic code deployment is moving the suite relocation process from the host to the SPOT.

Our use of standard digital signatures precludes pipelined forwarding of code from SPOT-to-SPOT -- a SPOT must receive a suite in its entirety before the signature can be verified. In fact, with the existing scheme, a suite isn't verified until the first time it is about to be executed by the Squawk virtual machine. The use of signed hash chains gets around this problem. Here's a brief description of how this would work:

  • The unrelocated suite is divided into fixed size chunks p_0, p_1, ... p_{n-1}.
  • Each chunk has an associated hash. The last chunk has no successor, so h_{n-1} = H(p_{n-1}); for 0 <= i < n-1, h_i = H(p_i | h_{i+1}). Here | is the concatenation operator and H is a cryptographically strong hash, e.g., SHA. For i >= 1, chunk p_{i-1} carries h_i with it, so each chunk vouches for the one that follows.
  • Meta information, m (e.g., suite size), is sent with h_0 and a signature computed over m and h_0.
    Image showing hash chain construction
  • The receiving SPOT verifies the signature on the meta-information and a successful check guarantees the authenticity of its sender.
  • As each subsequent chunk is received, the recipient computes its hash and compares it against the expected value received with the previous chunk. A successful match indicates that the new chunk was sent by the same sender that sent the previous chunk.
  • Since the authenticity of each chunk can be established as it arrives, not only can the chunk be relocated and written to flash immediately, it can also be forwarded to other SPOTs down the line with full assurance.
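
The steps above can be sketched in Python. This is a minimal illustration rather than the code in the patch: it assumes SHA-256 as the hash, assumes the last chunk's hash covers only the chunk itself, and leaves out the signature over m and h_0 entirely (the receiver is simply handed a trusted h_0). The names `build_chain` and `verify_stream` are made up for this example.

```python
import hashlib
from typing import Iterable, Iterator, List, Tuple

# A packet is (chunk p_i, hash h_{i+1} of the packet that follows).
# The last packet carries an empty "next hash". This layout is an assumption
# made for illustration, not the wire format used by the patch.
Packet = Tuple[bytes, bytes]


def _hash(data: bytes) -> bytes:
    """H in the description above; SHA-256 is an assumed choice."""
    return hashlib.sha256(data).digest()


def build_chain(suite: bytes, chunk_size: int) -> Tuple[bytes, List[Packet]]:
    """Host side: split the suite and compute hashes from the last chunk back."""
    if chunk_size <= 0 or not suite:
        raise ValueError("need a non-empty suite and a positive chunk size")
    chunks = [suite[i:i + chunk_size] for i in range(0, len(suite), chunk_size)]
    packets: List[Packet] = []
    next_hash = b""  # the last chunk has no successor
    for chunk in reversed(chunks):
        packets.append((chunk, next_hash))
        next_hash = _hash(chunk + next_hash)  # h_i = H(p_i | h_{i+1})
    packets.reverse()
    return next_hash, packets  # next_hash is now h_0, the value to be signed


def verify_stream(h0: bytes, packets: Iterable[Packet]) -> Iterator[bytes]:
    """Receiver side: yield each chunk as soon as it has been authenticated."""
    expected = h0  # assumed already authenticated via the signature on (m, h_0)
    for chunk, next_hash in packets:
        if _hash(chunk + next_hash) != expected:
            raise ValueError("chunk failed hash check; stream rejected")
        yield chunk  # safe to relocate, write to flash, or forward
        expected = next_hash


if __name__ == "__main__":
    suite = bytes(range(256)) * 4
    h0, stream = build_chain(suite, chunk_size=64)
    assert b"".join(verify_stream(h0, stream)) == suite
```

Because `verify_stream` is a generator, each chunk is released the moment its hash checks out, which is the property that makes pipelined forwarding possible. A real receiver would also compare the number of bytes received against the suite size in m, since the hash chain alone does not detect a stream that is cut short.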

Replacing standard signatures with signed hash chains is, therefore, the second critical step towards supporting epidemic deployment.

I'm pleased to announce the release of a patch that implements both these features. It is largely the work of Robert Taylor, a Ph.D. student at the University of Manchester who interned with us.

Much work still remains in order to support full epidemic code deployment. For example, this patch only takes care of verifying chunks as they come in but still waits until the entire suite is received before performing relocation. The unrelocated suite must therefore be held in memory -- a problem for large suites like the library. Hence, this patch applies the modified deployment scheme only to application suites, which are typically much smaller. The start of a suite contains all the information needed for relocation, so adding chunk-by-chunk relocation is simply a matter of writing additional code.

When another SPOT down the line requests new code, an already updated SPOT must be able to recreate and send the same stream that first originated at the host computer. This involves undoing the relocation operation and interleaving the unrelocated suite with the appropriate hashes. So the hash chain must also be stored in flash along with the suite -- something the current patch doesn't do. Lastly, this version of the patch still deploys code from a host computer to a SPOT (one SPOT at a time). Instead, what's needed is a protocol for propagating the code stream from SPOT to SPOT in a pipelined fashion (e.g. via a cascading broadcast).

Nevertheless, this patch represents a significant step towards full epidemic deployment. Several members of the Sun SPOT community have already started thinking about this issue (see here and here) so we decided to get the patch out now even though it isn't as clean as I'd like it to be. If you try out the patch (see instructions in the README.txt file), I'd love to get your feedback.


Vipul Gupta

