The Importance of Archiving (For the Rest of Us)

A few weeks ago I was on the road with Dave Cavena. He works as an SE in Hollywood and helps our customers there understand the importance of digitally archiving their movies. The issue here is a simple one: Today, movies are being archived by storing rolls of film on shelves in gigantic warehouses and hoping they'll survive for a few years to come. "Few" could be tens or maybe hundreds of years, but nobody really knows how long they'll survive or how good the movies will look after a couple of decades of archiving. Will the colours look natural? Will there be scratches? Will the film material degrade so that the movie rips right in the middle of the most important scene? Or will it spontaneously decompose into a heap of dust when someone opens the door after 150 years to see what the heck people were keeping in that warehouse anyway?

Digital data, on the other hand, can be kept indefinitely and can stay perfect through eternity, if people store it right. "Right" means things like keeping redundant copies in geographically distant places (so that the movie survives a warehouse fire or an earthquake), periodic integrity checking and fault healing based on those redundant copies (so silent data corruption can be detected and corrected) and periodic copy and conversion cycles so the data can survive format and media evolution. Try playing "Dragon's Lair", a classic arcade game from the '80s which was originally produced for Laserdisc-based arcade machines. I loved the game back then and I was glad I found it on DVD. Now look it up on Amazon: You'll find Blu-Ray and HD-DVD versions as well!
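To make the "integrity checking and fault healing" idea a bit more concrete, here is a minimal sketch of it in Python. It assumes you recorded a SHA-256 checksum for each file when you first archived it and that the redundant copies are plain files you can reach by path. The function names are made up for this illustration; this is not how any particular archive product does it.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_heal(copies: list[Path], expected: str) -> list[Path]:
    """Check every copy against the recorded checksum and repair bad ones.

    Returns the copies that had to be repaired. Raises ValueError if no
    copy matches the checksum, because then there is nothing to heal from.
    """
    good = [c for c in copies if c.exists() and sha256_of(c) == expected]
    if not good:
        raise ValueError("no intact copy left; cannot heal")
    repaired = []
    for copy in copies:
        if copy not in good:
            # Overwrite the missing or corrupted copy with a known-good one.
            copy.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(good[0], copy)
            repaired.append(copy)
    return repaired
```

Run something like this periodically over every archived file and its copies. The important design point is that the checksum is stored separately from the data, so a silently corrupted copy cannot vouch for itself.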

Dave and some other very bright people have written an interesting white paper on "Archiving Movies in a Digital World". It is a great read: It shows why archiving movies the digital way is so important (so they can't get lost), how to do it and why this is actually cheaper than keeping rolls of film in warehouses (Hint: Archiving bits takes up less real estate and looks a lot cooler if you use one of these. Your archives may even become smart by using one of these, too!).

That got me thinking: What will happen to all those photos that people are taking using their private digital camera? Parties, Vacations, Babies, Families, etc.? Yes, if people don't start thinking about a good archiving strategy soon, they will all be lost in the next couple of decades. By the time my little daughter gets married, I might have lost all of her baby pictures if I don't do something real quick (as in: This decade).

Storing them on a file server running a serious, enterprise-class OS with ZFS on a set of redundant storage media (could be disks, could be more disks, iSCSI devices, could be USB sticks, it really doesn't matter) is a good start, because it can provide 100% data integrity and self-heal any damage before it becomes permanent. But this is still not enough. They need to be stored in multiple locations and they need to be periodically copied onto more recent media.

I'll figure out the multiple locations part someday (probably a second server that replicates the first file server's ZFS file systems through a couple of zfs send/receive scripts) and the periodic copying means I'll still have an interesting hobby when my daughter has children of her own. Meanwhile, just to be sure, I've started to copy all of my photos to a popular photo-sharing service called Flickr on the net. Yes, all of them. This means I can still decide who can look at my pictures and who can't, I get access to all of my pictures from anywhere near a web browser and I can store as many photos as I want for just a small annual fee. And they'll still be there should my basement catch fire or should all of my disks die for some strange statistical reason or when I start taking 3D photographs of my grandchildren. What more could anybody want?
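For the curious, the zfs send/receive idea could look roughly like the sketch below. It only builds the command lines and prints them, so nothing is changed on any pool. The dataset names, snapshot names and the host name "backuphost" are placeholders I made up, and I'm assuming ssh access to the second server. Note that "zfs receive -F" rolls the target back to its latest snapshot before receiving, which is only what you want if the replica is never modified by hand.

```python
import shlex
from typing import Optional, Tuple


def replication_commands(
    dataset: str,
    previous_snapshot: Optional[str],
    new_snapshot: str,
    remote_host: str,
    remote_dataset: str,
) -> Tuple[str, str]:
    """Build the shell commands for one replication round.

    Returns (snapshot command, send/receive pipeline) as strings.
    With previous_snapshot=None a full stream is sent, otherwise an
    incremental stream relative to that earlier snapshot.
    """
    snap_new = f"{dataset}@{new_snapshot}"
    snapshot_cmd = ["zfs", "snapshot", snap_new]

    send_cmd = ["zfs", "send"]
    if previous_snapshot is not None:
        send_cmd += ["-i", f"{dataset}@{previous_snapshot}"]
    send_cmd.append(snap_new)

    # Assumption: the replica is received over ssh on the second server.
    receive_cmd = ["ssh", remote_host, "zfs", "receive", "-F", remote_dataset]

    pipeline = f"{shlex.join(send_cmd)} | {shlex.join(receive_cmd)}"
    return shlex.join(snapshot_cmd), pipeline


if __name__ == "__main__":
    snap, pipe = replication_commands(
        "tank/photos", "2007-09-17", "2007-09-18", "backuphost", "backup/photos"
    )
    print(snap)
    print(pipe)
```

The first round would pass None as the previous snapshot to send everything once; after that, each round only ships the blocks that changed since the last snapshot, which is what makes this practical over a slow link.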

Now what if the movie industry found out about this and started archiving their movies on Flickr as well, image after image, all of them, for just the same small annual fee?

Update (Sep. 18th, 2007): Thanks to Jesse's comment, I've now tried SmugMug and I love it! They offer a 50% discount for Flickr refugees during the first year and there's a nice tool called SmuggLr that makes migration a snap. Thank you Jesse!


I do the same thing for my photos: I have an OpenSolaris/ZFS server and upload my photos to SmugMug (which I much prefer over Flickr).

Video, on the other hand, is a different story. I have just started archiving my video, and the raw DV at 25 Mbps takes up a LOT of space. This is fine for my ZFS server (5x500 GB raidz) but uploading and storing it off site is a major problem.

Posted by Jesse DeFer on September 18, 2007 at 01:44 AM CEST #

Hi Jesse! Yep, you're right. My wife has a MacBook and she started editing DV video in iMovie. I do all of our backups onto my ZFS file server and I urgently need to back up her machine, but first I should check how much space I have left in my pool :).

Posted by Constantin on September 18, 2007 at 02:50 AM CEST #

The content itself is very valuable, but of greater importance for my personal content (family pictures, miniDV videos, etc) is the METAdata.

A directory (even with spiffy thumbnail views) with thousands of DSCN1054.jpg files, with limited access to the EXIF data in them, is just as useless for history as the best-preserved pile of physical photos in a shoebox.

But tie in metadata (who's in the picture, when/where was it taken, etc) and this digital archive can become meaningful.

Are there any tools for managing/integrating this metadata?


Posted by Joe Moore on September 19, 2007 at 02:05 PM CEST #

Hi Joe,

the standard for metadata in photos is EXIF. Your digital camera today uses EXIF to store thumbnails, when (and sometimes where) the photo was taken and some technical data (exposure etc.). Ideally, a photo management application would use only EXIF to store all of a photo's metadata. This would have the advantage of the metadata being stuck with the photo and therefore hard to lose.

However, this may or may not be the case with your particular brand of photo management software (I use iPhoto and I don't know what it uses...).

When storing photos with a web-based photo service, the problem becomes theirs and not yours :). These services typically come with an API that lets you access photos and metadata, so it will always be accessible in a standardized fashion (assuming that the photo service does their homework). I just migrated my photos from Flickr to SmugMug using a Firefox plugin; it went smoothly and it preserved all the metadata.

Maybe not a 100% answer, but keeping an eye on EXIF when managing photos seems to be a safe bet for now.


Posted by Constantin on September 20, 2007 at 02:31 AM CEST #

For anyone interested in SAM-FS (which sits at the heart of the Movie Archiving solution): I blog about it. I recently wrote an overview of how SAM-FS works and I just completed a two-month project working with SAM-FS on the Sun Fire X4500, which I will be blogging about soon. There is also a Sun internal paper about this work which I hope will be published outside the firewall sometime in the future.

Posted by Tim Thomas on September 26, 2007 at 04:14 AM CEST #
