Epidemic Code Deployment on Sun SPOTs: Take One
By Vipul Gupta on Dec 19, 2008
Today, the only way to deploy the same code to multiple SPOTs is one-at-a- time -- the user "connects" each SPOT to a host computer containing the application (either via USB or via a radiostream over-the-air) and invokes ant deploy. Epidemic code deployment is a more efficient mechanism that lets code propagate from SPOT to SPOT. Imagine several dozen SPOTs, spread over a large area in a battlefield, in need of a new software update. With epidemic code deployment, one could upgrade the software on a single SPOT (potentially, in a more peaceful, safer locale), air-drop it, and, after some time, "automagically" have the new code propagated to other SPOTs.
This sort of functionality is available in TinyOS but certain aspects of the SPOT software rule out a straight port of that code. Unlike more constrained devices, Sun SPOTs support multiple application slots and use digital signatures (based on elliptic curve public-key cryptography) to guarantee code authenticity. The private key used for code signing never leaves the host computer. The SPOTs only store the corresponding public key used for signature verification so, even after compromising a SPOT, an attacker does not gain the ability to run malicious code on other SPOTs. We'd like to retain these advantages while adding support for epidemic deployment.
In our existing suite deployment process, the host computer first queries the SPOT for an empty application slot and, based on the response, remaps all pointers in the suite. This remapping (also called suite relocation) is slot-specific -- a relocated suite cannot be sent from SPOT-to-SPOT unless the destination SPOT plans to put it in the same slot as the sender. So the first step towards supporting epidemic code deployment is moving the suite relocation process from the host to the SPOT.
Our use of standard digital signatures precludes pipelined forwarding of code from SPOT-to-SPOT -- a SPOT must receive a suite in its entirety before the signature can be verified. In fact, with the existing scheme, a suite isn't verified until the frst time it is about to be executed by the Squawk virtual machine. The use of signed hash chains gets around this problem. Here's a brief description of how this would work:
- The unrelocated suite is divided into fixed size chunks p0, p1, ... pn-1.
- For 0 <= i < n-1, chunk pi-1 carries with it a hash hi such that hi = Hash(pi | hi+1). Here | is the concatenation operator and H is a cryptographically strong hash, e.g., SHA.
Meta information, m (e.g., suite size), is sent with h0 and a signature computed over m and h0.
- The receiving SPOT verifies the signature on the meta-information and a successful check guarantees the authenticity of its sender.
- As each subsequent chunk is received, the recipient computes its hash and compares it against the expected value received with the previous chunk. A successful match indicates that the new chunk was sent by the same sender that sent the previous chunk.
- Since the authenticity of each chunk can be established as it arrives, not only can the chunk be relocated and written to flash immediately, it can also be forwarded to other SPOTs down the line with full assurance.
Replacing standard signatures with signed hash chains is, therefore, the second critical step towards supporting epidemic deployment.
I'm pleased to announce the release of a patch that implements both these features. It is largely the work of Robert Taylor, a Ph.D. student at the Univ of Manchester who interned with us. Much work still remains in order to support full epidemic code deployment. For example, this patch only takes care of verifying chunks as they come in but still waits until the entire suite is received before performing relocation. The unrelocated suite must be held in memory -- a problem for large suites like the library. Hence, this patch applies the modified deployment scheme only to application suites which are typically much smaller. The start of a suite contains all the information needed for relocation so incorporating this capability is simply a matter of writing additional code. When another SPOT down the line requests new code, an already updated SPOT must be able to recreate and send the same stream that first originated at the host computer. This involves undoing the relocation operation and interleaving the unrelocated suite with the appropriate hashes. So the hash chain must also be stored in Flash along with the suite -- something the current patch doesn't do. Lastly, this version of the patch still deploys code from a host computer to a SPOT (one-SPOT-at-a-time). Instead, what's needed is a protocol for propagating the code stream from SPOT-to-SPOT in a pipelined fashion (e.g. via a cascading broadcast).
Nevertheless, this patch represents a significant step towards full epidemic deployment. Several members of the Sun SPOT community have already started thinking about this issue (see here and here) so we decided to get the patch out now even though it isn't as clean as I'd like it to be. If you try out the patch (see instructions in the README.txt file), I'd love to get your feedback.