"Password-protected, but not encrypted"

While I'm happy that my own details haven't been leaked as part of the HMRC data leak (not having children is good for my privacy as well as my bank balance and my carbon footprint, it would seem), I'm following the news closely, as more information about the leak is disclosed.

The Chancellor of the Exchequer was interviewed on Radio 4's "Today" programme on Tuesday morning last week, and said something which particularly surprised me. Specifically, he referred to the way in which the data was stored on the missing discs as "password-protected, but not encrypted".

I conjecture that you can't actually have password protection, without encryption.

Consider one of these missing disks. If it were to turn up and you put it in your DVD-ROM drive, you could dd the blocks off it, to get yourself a file of anything up to 4-and-a-bit GB in size. If you then grep through it for known cleartext (such as the names of folk you know who are parents), you'll get matches unless the data is either compressed or encrypted. It's fair to assume that the files on the disks will have been generated by fresh extraction from a database of some sort, so you're going to be looking at a reasonably sequential set of blocks, without much fragmentation or indirection.

This neatly bypasses any application-layer password system.
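As a rough sketch of that dd-then-grep idea, here's how the search might look in Python. The function name, the chunk size and the idea of passing in a list of names are all my own illustrative assumptions; the image file is assumed to be whatever dd produced from the raw device.

```python
def find_cleartext(image_path, needles, chunk_size=1 << 20):
    """Return sorted (offset, needle) pairs for each needle found in a raw image.

    The file is read in chunks, keeping a small overlap between chunks so
    that a match straddling a chunk boundary is not missed.
    """
    overlap = max(len(n) for n in needles) - 1
    hits = set()  # a set, because the overlap can report the same hit twice
    with open(image_path, "rb") as f:
        offset = 0   # file offset of the start of the current chunk
        tail = b""   # last few bytes of the previous buffer
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            base = offset - len(tail)  # file offset of buf[0]
            for needle in needles:
                pos = buf.find(needle)
                while pos != -1:
                    hits.add((base + pos, needle))
                    pos = buf.find(needle, pos + 1)
            tail = buf[-overlap:] if overlap else b""
            offset += len(chunk)
    return sorted(hits)
```

If the data on the disc is stored uncompressed and unencrypted, a call such as `find_cleartext("disc.img", [b"Some Name"])` would report byte offsets regardless of what any application would ask for before opening the file.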

If the files on the disks are simply compressed, you could either reconstruct the compressed data sets from the dd'ed blocks using forensic tools, or simply mount the disks, copy the files to scratch space and decompress them.

Here's where you're likely to hit password protection - at the application layer.

Thinking about what is likely to have been done when marshalling the files to burn onto the disks, it's rather probable that whatever raw data was required was put into a password-protected zip archive (in fact, http://news.bbc.co.uk/1/hi/uk_politics/7106987.stm suggests this is the case).

The zip compression standard indicates that, where password protection is applied, the password is used to unlock a soft keystore from which a symmetric key is extracted, and that key then decrypts the main body of the archive before the usual decompression takes place.

Please note the use of the word "decrypts", Chancellor :-).

Apparently, WinZip 9.x introduced AES encryption, so depending on what version of what zipping app is in use at HMRC, it may even be using a US-formally-approved algorithm.

Granted, the soft keystore needs to be bound up with the data in the file (and it's usually advisable to keep your keys somewhere where your data isn't), but encryption is still encryption. However, for earlier versions of Zip, I'm reliably informed that the legacy encryption algorithm (the traditional PKWARE stream cipher, rather than anything AES-based) is rather straightforward to break, given a little known plaintext.
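For the curious, here's a minimal sketch of the traditional PKWARE stream cipher used by older Zip tools, as I understand it from the published Zip format notes. The class and method names are my own, and the sketch leaves out the 12-byte encryption header that a real archive puts in front of each encrypted file, so treat it as an illustration rather than something interoperable. The point is simply that the password seeds three 32-bit keys, and those keys generate a keystream which is XORed with the data: weak, but encryption all the same.

```python
def _make_crc_table():
    """Standard CRC-32 lookup table (reflected polynomial 0xEDB88320)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


_CRC_TABLE = _make_crc_table()


def _crc32_step(crc, byte):
    """Raw one-byte CRC-32 update, without the usual pre/post inversion."""
    return (crc >> 8) ^ _CRC_TABLE[(crc ^ byte) & 0xFF]


class TraditionalZipCipher:
    """Sketch of the legacy Zip cipher; hypothetical class name."""

    def __init__(self, password: bytes):
        # Fixed initial key values, then mix in each password byte.
        self.k0, self.k1, self.k2 = 0x12345678, 0x23456789, 0x34567890
        for b in password:
            self._update(b)

    def _update(self, plain_byte):
        self.k0 = _crc32_step(self.k0, plain_byte)
        self.k1 = (self.k1 + (self.k0 & 0xFF)) & 0xFFFFFFFF
        self.k1 = (self.k1 * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = _crc32_step(self.k2, self.k1 >> 24)

    def _keystream_byte(self):
        t = (self.k2 | 2) & 0xFFFF
        return ((t * (t ^ 1)) >> 8) & 0xFF

    def encrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for p in data:
            out.append(p ^ self._keystream_byte())
            self._update(p)  # keys are always updated with the plaintext byte
        return bytes(out)

    def decrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for c in data:
            p = c ^ self._keystream_byte()
            out.append(p)
            self._update(p)
        return bytes(out)
```

Because the key state is only 96 bits and is updated from the plaintext in such a simple way, it is easy to see why a known-plaintext attack is practical against it.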

It's also possible that, rather than password-protect a zip archive, HMRC sent the data in some password-protected spreadsheet form; let's look at what happens with StarOffice Spreadsheet and Microsoft Excel, in this regard...

From the OASIS standard for ODF 1.0...

The encryption process takes place in the following multiple stages:

1. A 20-byte SHA1 digest of the user entered password is created and passed to the package component.

2. The package component initializes a random number generator with the current time.

3. The random number generator is used to generate a random 8-byte initialization vector and 16-byte salt for each file.

4. This salt is used together with the 20-byte SHA1 digest of the password to derive a unique 128-bit key for each file. The algorithm used to derive the key is PBKDF2 using HMAC-SHA-1 (see [RFC2898]) with an iteration count of 1024.

5. The derived key is used together with the initialization vector to encrypt the file using the Blowfish algorithm in cipher-feedback (CFB) mode.

...nice :-).
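The key-derivation steps quoted above can be sketched with nothing more than the Python standard library. Two assumptions of mine to flag: I've encoded the password as UTF-8 before hashing, and I've used os.urandom for the salt and initialization vector rather than a time-seeded generator as the quoted text describes. The final Blowfish-CFB step is left out, as Blowfish isn't in the standard library.

```python
import hashlib
import os


def derive_odf_key(password: str, salt: bytes) -> bytes:
    """Derive a 128-bit per-file key along the lines of the quoted steps."""
    # Step 1: 20-byte SHA1 digest of the user-entered password.
    # (UTF-8 encoding is an assumption made for this sketch.)
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    # Step 4: PBKDF2 with HMAC-SHA-1, 1024 iterations, 16-byte output.
    return hashlib.pbkdf2_hmac("sha1", digest, salt, 1024, dklen=16)


# Steps 2 and 3: per-file random values (os.urandom is my substitution).
salt = os.urandom(16)
iv = os.urandom(8)
key = derive_odf_key("example password", salt)
```

The resulting key and iv are what would then be handed to a Blowfish implementation in CFB mode to encrypt each file in the package.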

For Excel, here's the appropriate quote directly from Microsoft's support site:

"You can use a strong password with the Password to Open feature in conjunction with RC4 level advanced encryption to require a user to enter a password to open an Office file."

Not as explicitly defined as the ODF standard, but then, that's Microsoft for you.

Nonetheless, RC4, if correctly implemented, is Plenty Good Enough to count as "encryption".
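To show how little there is to RC4, here's a minimal sketch of the textbook algorithm in Python. It says nothing about how Office derives its RC4 key from the password or how it frames the data, which I haven't attempted to reproduce; the function name is my own.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: the same call both encrypts and decrypts."""
    # Key-scheduling algorithm: permute S under control of the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation: XOR each data byte with a keystream byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


ciphertext = rc4(b"a password-derived key", b"some spreadsheet bytes")
recovered = rc4(b"a password-derived key", ciphertext)
```

Since RC4 is a stream cipher, applying it twice with the same key gets the original bytes back, which is why a "correct implementation" caveat matters: reuse the same key for two files and the keystream cancels out.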

Of course, if the HMRC infrastructure had been built on top of Trusted Extensions, the "junior employee" (noting the rumours forming, that more senior staff may have been complicit) would probably not have had the label at which all this data was stored within his clearance range, or the clearance range of a role that he was allowed to assume without passing through a two-person rule; he certainly wouldn't have had the privilege to mount or burn media at that label...


Actually, it looks like "password protection without encryption" has been implemented as a feature, as "Password to modify" in Microsoft Office - but, as you might expect, it doesn't work...


Robin's Law:

"Those who understand encryption and those who make policy decisions about its use are two sets which do not intersect".


1 - if you line up the right people in between the two sets, you can just about make them orthogonal;

2 - in doing so, you probably introduce "Chinese Whispers" to a fatal degree.

Posted by Robin Wilton on November 26, 2007 at 06:25 AM GMT #

I happen to know it was Winzip 8 that was used

Posted by guest on December 11, 2007 at 04:52 AM GMT #

To the poster of the last comment, about Winzip 8: that's certainly interesting. If there are any further technical details you believe you can discuss, I'd be happy to talk to you. Non-disclosure agreements can potentially be arranged.

Posted by Dave Walker on December 11, 2007 at 05:58 AM GMT #

It's no big secret. It's mentioned in the Treasury Select Committee evidence on 5 December. But it's not on Hansard yet so no-one's noticed.

Posted by guest on December 11, 2007 at 08:53 AM GMT #

Oh my mistake. It's q 389 on the link provided.


Posted by wigwam on December 12, 2007 at 06:30 AM GMT #
