Metadata, CAS and the future of Digital Photos

I have my new Adobe Photoshop Elements 6.0 up in the background and I am currently working away at tagging all of my pictures. Get this, I have 8,477 digital photos since I started taking them around 1997 or so with my Kodak DC120.

Here is a picture from the Kodak, I was trying to figure out how well the camera processed different color bands.

I have a few scanned photos from before that and, in fact, I went on a scanning binge this Summer with old pictures my Mom had sent me...of course, I have lost those already and have to re-scan them.

My newer Canon Rebel XTi is very cool. I can't say enough about it. Finally a camera that I feel replaced my 35mm (Canon Rebel as well). I have a few lenses for it and am still practicing like crazy. My friend bought me a kick-butt lens for it, 67mm front opening and a fixed f/2.8 aperture. Here is one of my favorite "technical" pictures so far (not to say I don't have a growing library of pictures that I love too...more on that in a minute).

Back when I used my Kodak, vendors hadn't realized the importance of a non-proprietary format. The Kodak recorded images in a ".mix" format. These days it's hard to find software that recognizes those images; .mix has not stood the test of time. Generally, vendors have learned their lesson. Oh, wait...more advanced cameras allow you to take pictures in a RAW image format, an unprocessed, uncompressed image that is basically a digital negative. RAW formats generally suffer the same limitations as those early picture formats: you are often tied to the manufacturer's software to edit the RAW images and then convert them to something like a JPG.

Today was the first real sled day here in Colorado (at least in the Denver-metro area). I tromped out with my camera and kid in tow and snapped a few shots. When I say a few shots, I mean about 150. Here is one from the zoom lens.

Now, let's talk storage. I wouldn't even rate myself an "amateur photographer"; I am more like an "every few weekends" photographer. Still, I have racked up 1,000s of pictures, and the flood-gates have opened with my new camera. I have about 40GB of photos today and the challenge of maintaining them is growing. Here is my current set of challenges:

  • Backup

  • Search and Retrieve

  • Picture Format Compatibility

  • Printing

There are more challenges, like taking a good picture...but let's stick to the above. Reading a book like Photographing the Landscape: The Art of Seeing by John Fielder makes you realize what an art photography really is.

So, here is how I am tackling my challenges with some technology tips for storage and folks like me...please feel free to let me know if you have additional tips, I can always use pointers.

Backup
Simply put, DVDs don't work. I just don't like fiddling around with everything. So here is my current home-grown process.

  • Download pictures to my laptop
  • Take a pass at processing the photos (initial edits, ratings, etc...)
  • Convert favorites from RAW to JPG (to share)
  • Copy all of the pictures to a second computer

Then, every couple of weeks I copy the photos again using the Areca Backup Software out on SourceForge.
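The "copy all of the pictures to a second computer" step above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Areca or Photoshop Elements does it; the function name and the two folder paths in the usage note are made up, and it assumes the second computer's drive is reachable as an ordinary folder.

```python
import shutil
from pathlib import Path


def copy_new_photos(source: Path, destination: Path) -> list[Path]:
    """Copy files that are missing or changed at the source; return what was copied."""
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        dst_file = destination / src_file.relative_to(source)
        if dst_file.exists():
            src_stat, dst_stat = src_file.stat(), dst_file.stat()
            same_size = dst_stat.st_size == src_stat.st_size
            # Allow 2 seconds of slack for drives with coarse timestamps.
            not_older = dst_stat.st_mtime >= src_stat.st_mtime - 2
            if same_size and not_older:
                continue  # the backup copy is already current
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 keeps the file's timestamps
        copied.append(dst_file)
    return copied
```

Something like copy_new_photos(Path("C:/Photos"), Path("//mediabox/Photos")) would then move only what changed since the last run (both paths are hypothetical), which is the part drag and drop can't do for you.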

Now, I haven't experimented with it yet, but Adobe Photoshop Elements 6.0 finally has an online backup facility built in. Trust me, we are all going here sooner or later. It's crazy having all of your memories on a hard drive. And, further, if you think you will have to leave your house in a hurry some day (just ask folks in San Diego about this), you had better have them on your laptop or on one of the new external drives. Semi-professional and professional photographers even build their own storage networks with JBODs and systems as storage heads. The DAM Book: Digital Asset Management for Photographers is a great reference if you want to see an outline of putting together your own system for archiving.

Search and Retrieve
Search and retrieve is becoming my biggest headache. Thankfully, I was reading The DAM Book... as I started encountering this headache. Basically, your memory will fail and you will start losing track of your most treasured photos.

Here is the key: metadata, metadata, metadata.

Adobe Photoshop Elements has been worth every penny I spent. I rate the pictures and add tags to all of my pictures that will help me locate them in the future. I use top-level classes of People, Places, Events, Sports and Other. The Places are cool as there is built-in geo-tagging so that all of your pictures for a particular place can be found on a map (as long as you are Internet connected).

The DAM Book had additional guidance for tagging that is very useful (like how to effectively use the star ratings). Now, rather than searching around folders that I think were around the date that I took a picture, I simply click on the tags, put in date limits, and pick the quality of picture I'm looking for, and a nice spread of pictures is displayed.
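To make the tag, date and rating search concrete, here is a rough Python sketch. It has nothing to do with how Photoshop Elements stores its catalog; the Photo record, its field names and find_photos are all invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Photo:
    filename: str
    taken: date
    rating: int  # 0 to 5 stars
    tags: set[str] = field(default_factory=set)


def find_photos(photos, tags, start, end, min_rating=0):
    """Return photos that carry every requested tag, fall inside the
    date limits (inclusive), and have at least the given star rating."""
    wanted = set(tags)
    return [
        p for p in photos
        if wanted <= p.tags            # all requested tags are present
        and start <= p.taken <= end    # within the date limits
        and p.rating >= min_rating     # good enough quality
    ]
```

The point is that the folder a picture lives in never enters the query; everything is answered from the metadata.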

The investment in tagging my old pictures has been heavy and I'm still not done, but metadata is the ONLY way to manage your pictures. In my opinion, the XAM initiative is going to be huge; it basically makes this type of metadata search and retrieve the "top-level" storage primitive. Today, my metadata is heavily tied to Adobe Photoshop and the application. With XAM, applications use standard APIs for querying objects rather than files. My prediction...XAM or something like it will be a part of every major operating system within 5 years, making the storage primitive that of a Content Addressable Storage (CAS) device. With CAS/XAM type APIs, applications should be able to share the metadata and content rather than having it tied to proprietary "sidecar" files. Further, a good XAM implementation should let you use the information as standard files if you are really tied to your file explorer.
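If content addressable storage is new to you, here is a toy in-memory version in Python. To be clear, this is NOT the XAM API; the class and method names are invented, and a real device would persist the data. It only shows the two ideas: an object's address is derived from its content, and you find objects by querying metadata rather than by file path.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: objects are keyed by a hash of their bytes."""

    def __init__(self):
        self._objects = {}   # address -> content bytes
        self._metadata = {}  # address -> dict of metadata fields

    def put(self, content: bytes, **metadata) -> str:
        # The address comes from the content itself, so identical
        # pictures always land at the same address.
        address = hashlib.sha256(content).hexdigest()
        self._objects[address] = content
        self._metadata.setdefault(address, {}).update(metadata)
        return address

    def get(self, address: str) -> bytes:
        return self._objects[address]

    def query(self, **criteria) -> list[str]:
        # Return the addresses whose metadata matches every criterion.
        return [
            address for address, meta in self._metadata.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]
```

Because the metadata lives with the store instead of inside one application's catalog, any program that can call query() sees the same tags.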

Picture Format Compatibility
My Canon's RAW format (.cr2) is thankfully compatible with Adobe Photoshop Elements, so I can edit my RAW files directly from the application. There is only one set of filters I'm missing which would be nice to have, but I'll survive.

But here's the thing: there is an open format from Adobe that is extremely interesting, the Digital Negative (DNG) format. It is sort of like a TIFF file, but open and with much more information packed in. The files end up about double the size of your original RAW file, but for a storage archive, DNG files make a lot of sense. The contents are still "RAW" but have been moved to a standard format that can outlive a particular digital camera (RAW files have no guarantee to live past the life of a particular camera or brand, thus giving you a potential headache when you switch cameras). DNG is not ubiquitous yet, so oftentimes you cannot read a DNG file from software other than Adobe's.

Still, my $$$ are on DNG to be ubiquitous in the coming years. I have not converted all of my files yet, but the day is coming...once I have them all tagged ;-)

Printing
Finally, let me give you a printing tip. Use your photo printer sparingly; it is simply not worth it. When you add up the cost of the printer, ink, and the headaches and combine it with the limited options of your home photo printer...just forget it. I can get $0.10 prints from my local grocery store now. I literally upload them, tell them to print them, and I can go pick them up in an hour. It's amazing. The quality at the grocery store was "OK", not exceptional...but your local photography store allows you to do the same thing. I print at a photo store by the mall for some even better prints...they are about $0.20 apiece for a 4x6. Kodak's online site kicks butt too; those come in the mail in a couple of days and the quality is incredible.

But here's a tip I learned the other week. The image sensors on the Canons (and other dSLRs) blow up at a different ratio than those in a pocket camera. The dSLRs have a 3:2 sensor that fills an 8x12 with no crop, whereas most pocket cameras have a 4:3 sensor that is much closer to an 8x10. I have only found one site that does 8x12s, SmugMug. The pictures arrived after a couple of days professionally packaged and looking exceptional. The price was pretty good too...there is a fee to be a member, though, whereas Kodak is pretty much free if you order something once a year.
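Here is the arithmetic behind the 8x12 versus 8x10 issue, as a small Python sketch. The function name is made up, and it assumes a 3:2 image for a dSLR and a 4:3 image for a typical pocket camera, with the print filled edge to edge and the overflow trimmed.

```python
from fractions import Fraction


def cropped_fraction(image_w, image_h, print_w, print_h):
    """Fraction of the image area trimmed when it fills the print edge to edge."""
    # Compare long-side / short-side ratios so orientation doesn't matter.
    image_ratio = Fraction(max(image_w, image_h), min(image_w, image_h))
    print_ratio = Fraction(max(print_w, print_h), min(print_w, print_h))
    if image_ratio >= print_ratio:
        kept = print_ratio / image_ratio  # image is too long: trim its long side
    else:
        kept = image_ratio / print_ratio  # image is too squat: trim its short side
    return 1 - kept
```

With these assumptions, cropped_fraction(3, 2, 12, 8) is 0, so an 8x12 keeps the whole dSLR frame, while cropped_fraction(3, 2, 10, 8) is 1/6, so an 8x10 throws away a sixth of the picture. A 4:3 pocket camera image loses only 1/16 on an 8x10.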

And that, my friends, is a blog post that is way too long but has been eating me up for a long time.



Hi Paul - yeah it was kinda long, but it went down easy. Good info on backup, I've been looking for a good backup solution. I wasn't clear though, are you backing up to an external drive? Or exactly what do you dump to?

Sharing standard metadata would make a lot of things nicer. I'm posting and tagging on Flickr but it would be nice if I could then download that metadata to my own PC to duplicate. And then of course a peer-to-peer notification mechanism so that all such metadata "nodes" stay in sync.

Meanwhile I'll watch what happens with XAM...

Posted by Gary on December 09, 2007 at 11:36 PM MST #

Keep in mind, I am a complete nerd. I have my laptop that I use as my photography "base station"; everything is on it. It's not a great laptop, it's the Gateway I mention here on the blog, but the display is nice and it gives me a "separation of concerns" (it's not my kid's laptop and it has only photo software...oh, and NetBeans for testing out parsers for DNG).

I have an older, slower computer in my house that is "always on". It is a 1 GHz Gateway with dual drives. It has ALL of our MP3s on it (about another 34 GB). It serves as a base station for streaming content, iPod dock, etc... We also have a Sound Bridge in the house that streams from iTunes on the slower computer. I drag/drop files onto this computer so I have control over what gets moved (I am going to try the Adobe Photoshop Elements backup since I am sold on this application now...perhaps it would be faster than drag and drop since it knows what I changed).

Finally, I have another desktop that I use for the third copy of pictures and a second copy of MP3s, the backup uses Areca on this third computer with source directories on the 1 GHz machine. The 1 GHz is the most stable and having a separate media drive keeps the fragmentation low (I'm only adding information to it, not changing it).

I am predicting the EOL of my 1 GHz though, I will replace it with one of the low cost NAS devices on the market. Some of them give you an open platform to put streaming software on (or so I've heard) so I can increase the number of Sound Bridges in my house.

The DAM book I mention discusses configuration and setup of JBODs and such as your own digital archive.

Tagging pictures multiple times drives me nuts. I did just accidentally upload some RAW photos to SmugMug and it understood them! I was pretty shocked actually. DNG embeds the metadata IN the file rather than in a "sidecar" file. If there is a standard metadata model (not just a standard file format), you should be able to take a DNG file, upload it to SmugMug (or another tool) and have the metadata retained (I have to test this some time this month). I'm not sure we're there yet, but we shall see. The ideal stack, I think, would be: camera shoots and stores in DNG, application takes it in and stores it locally (using the OS content addressable storage API so other applications can use the power of the OS to store / retrieve), I tag and process the files I want and push them to SmugMug (still in DNG and via a network XAM protocol) where the tags remain in place. SmugMug uses a StorageTek 5800 Content Addressable Storage device to archive off their fast, tried and true RAID systems and also backs up to Amazon, where they also have a StorageTek 5800 using XAM for the digital archives.

Posted by Paul Monday on December 09, 2007 at 11:58 PM MST #
