Sunday Mar 02, 2008

Flirtin' With Disaster

I spoke on a panel at a Marcus-Evans conference on business continuity and disaster recovery and found myself in the position of converging three themes: business continuity, security, and virtualization. Of course, I had my eyes (mostly) open while speaking, although it appears I was out of focus for much of the session so this shot of me escaping the bull by the horns has to suffice, or at least detract from my weak Molly Hatchet references (speakers on if you click on the link).

Historically, data center management saw disaster recovery and business continuity as reactions to physical events: force majeure of nature, building inaccessibility, threats or acts of terror, or major infrastructure failures like network outages or power blackouts. Increased stress on the power grid and near-continuous construction in major cities increase the risks of the last two, but they are still somewhat contained, physical events that prompt physical reactions: spin up the redundant data center, fail over the services, and get up and running again, ideally without having missed shipments, deliveries or customer interactions.

Business continuity today has those physical events as table stakes only. The larger, more difficult problems are network service failure (due to denial of service attacks or failure of a dependent service), geographic restriction (due to pandemic fears, public transportation failures, or risk management), data disclosure and privacy, and the overall effect on brand and customer retention. What if you can't get people into an office, or have to react to an application failure that results in customer, partner or supposedly anonymous visitor information being disclosed? Welcome to this decade's disasters.

Where does virtualization fit in? Quite well, actually. Virtualization is a form of abstraction; it "hides" what's under the layer addressed by the operating system (in the case of a hypervisor) or the language virtual machine (in the case of an interpreted language). But it's critical that virtualization be used as a tool to truly drive location and network transparency, not just spin up more operating system copies. I never worry about the actual data center containing my mail server, because I only see it through an IMAP abstraction. It could move, fail over, or even switch servers, operating systems and IMAP implementations, and I'd never know. Virtualization gains in importance for business continuity because it drives the discussion of abstraction: what services are seen by what users, where and how on which networks?
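The IMAP point can be sketched in a few lines: clients bind to a logical service name, and only the indirection layer knows (or cares) which physical server currently answers. This is a minimal illustration, not a real service directory; every hostname and registry entry below is hypothetical.

```python
# Sketch: clients address a logical service name, never a physical host.
# The registry, hostnames, and ports here are hypothetical illustrations.

SERVICE_REGISTRY = {
    # logical name       -> current physical endpoint (can change at any time)
    "mail.example.com": ("imap-host-03.dc-east.example.com", 993),
}

def resolve(logical_name):
    """Return the current physical endpoint for a logical service name."""
    return SERVICE_REGISTRY[logical_name]

def failover(logical_name, new_host, new_port):
    """Repoint the logical name at a new server; clients never notice."""
    SERVICE_REGISTRY[logical_name] = (new_host, new_port)

# A client only ever sees the abstraction:
host, port = resolve("mail.example.com")

# The service fails over to another data center...
failover("mail.example.com", "imap-host-11.dc-west.example.com", 993)

# ...and the next connection transparently lands on the new server.
host, port = resolve("mail.example.com")
```

In practice the registry is DNS, a load balancer, or a service directory rather than a dictionary, but the design point is the same: the name the user sees never encodes where the service runs.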

Bottom line: Business continuity planning shares several common themes with systemic security design. There's self-preservation, the notion that a system should survive after one or more failures, even if they are coincident or nested. The least privilege design philosophy ensures that each process, administrator or user is given the minimum span of control to perform a task; in security this limits root privileges, while in BC planning it ensures that you don't give conflicting directions regarding alternate data center locations. Compartmentalization drives isolation of systems that may fail in dependent ways and helps prevent cascading failures. Proportionality helps guide investment into areas where there is perceived risk; the short form of proportionality is to not spend money on rapid recovery from risks that would have other, far-reaching effects on your business anyway. My co-author Evan Marcus used to joke that it was silly to build a data center recovery plan for a potential Godzilla attack, because if that happens we have other, larger issues to deal with. On the other hand, if you saw Cloverfield, there's a lot of infrastructure that people depend upon even when monsters are eating Manhattan.
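Least privilege, in either the security or the BC context, amounts to an allow-list: each role carries only the operations it needs, and everything else is denied by default. A minimal sketch, with entirely hypothetical roles and operation names:

```python
# Sketch of least privilege as an allow-list. Each role gets the minimum
# span of control for its task; roles and operations are hypothetical.

ROLE_PERMISSIONS = {
    "backup-operator": {"read_data", "write_backup"},
    "failover-admin":  {"start_standby", "redirect_traffic"},
}

def authorize(role, operation):
    """Permit an operation only if it is in the role's minimal set;
    unknown roles get the empty set, so everything is denied."""
    return operation in ROLE_PERMISSIONS.get(role, set())

# The backup operator can write backups but cannot redirect traffic --
# so two roles can't issue conflicting failover directions.
authorize("backup-operator", "write_backup")      # allowed
authorize("backup-operator", "redirect_traffic")  # denied
```

The same shape applies whether the "operations" are root-level system calls or the authority to declare a data center failover: the deny-by-default posture is what prevents both privilege escalation and conflicting directions.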

The best planning is to write out a narrative of what would happen should your business continuity plan go into effect: script out the disaster or event that causes your company to act, write up press releases, decision-making scenarios, and some plausible risk-adjusted actions, and follow the actions out to their conclusion. If you don't have a prescribed meeting place for a building evacuation, and there's no system for employees to check in and validate their safety, then your business continuity plan may suffer when you have to scramble to find a critical employee. When disasters happen, the entire electronic and physical infrastructures are unusually stressed, and normal chains of communication break down. Without a narrative to put issues into perspective, your disaster planning document becomes a write-only memory, attracting little interest and too little inspection from key stakeholders and contributors. Start naming names, and putting brand, product and individual risks into black and white, and you'll see how your carbon-based networks hold up when the fiber and copper ones are under duress.

Thursday May 17, 2007

ACM Queue Interview with Cory Doctorow online

The ACM Queue interview with Cory Doctorow in which we covered privacy, security, trust, telemetry and the seedier side of content filtering is now available online.

There are also a nice lead-in and some good words from Cory Doctorow on his personal (and frequently funny) website.

Saturday May 05, 2007

ACM Queue interview with Cory Doctorow

Cory Doctorow and I got to sit down back in February for a long chat about privacy, security, device telemetry, emulation, stimulation and who inflicts (or tries to) policy, control and editorial stance on what you see online. It was a fun romp through the strange intersection of science, science fiction and digital rights. The article is in the current print edition of the magazine and available online to ACM subscribers. (Don't ask me what Cory said that had me doubled over in laughter in one of the pictures; I was smiling most of the time anyway).

Cory makes a reference to Natalie Jeremijenko, who gets a cover shot and a one-pager in the May/June issue of Good magazine. Add green thinking about security and privacy to that mix.

Friday Jan 05, 2007

Security as a Thing, Not a Place

Part number next in our continuing Innovating@Sun audio-fest features Distinguished Engineer and Jersey guy Glenn Brunette talking about systemic security.

Typically we think about securing things -- a system, a network, our homes, sometimes even our personal content like email or files on a laptop. That's security as a place, a well-drawn boundary around what is secured and what lies at risk. Security, however, is itself a thing, something that evolves as risks, components and the relationships of those components change, and as new risks are discovered and tolerances for those risks tested and cost-adjusted. Glenn's systemic security patterns reflect outstanding systems engineering -- thinking about security in a dynamic context, not a set-it-and-forget-it mentality.


Hal Stern's thoughts on software, services, cloud computing, security, privacy, and data management
