For my 9-5, I wrangle file formats, making it possible for people to read images into MATLAB and then do something useful with them. I’m always on the lookout for what’s new, and there’s always something new. (I only joke about employing a legion of developers in Bangalore to develop new formats so that I can stay employed.) Lately I’ve been looking at Microsoft’s HD Photo format (a.k.a. Windows Media Photo, or WMP, format).

Once upon a time you could count on Microsoft’s image file formats to suck. Just take BMP as an example: a more-or-less unpublished format that changed between revisions of Windows and which you were supposed to access primarily through Windows API calls. File I/O people (like me) had to keep handy whatever bits of information we could cobble together, just to figure out what datatype a given piece of data should have, because the file didn’t carry that information in it. Nor was there a data dictionary for looking it up programmatically. They couldn’t have designed it worse if they had tried; but they clearly didn’t do any (good) design on the format. My professional opinion: totally sucked.

But perhaps they’ve learned the error of their ways with their new HD Photo format, which they have submitted for standardization. It appears that they’ve talked to a lot of digital image users about their needs and given a lot of thought to the format. I’m about halfway through the WinHEC 2007: HD Photo Implementation Guidelines presentation, which certainly says all the right things.

My only worry is that they’re saying so many of the right things that they’re possibly heading toward a couple of bad outcomes. Let’s call these the unintended consequences of designing a “good” format:

  1. The “Kitchen Sink Problem” (a.k.a. DICOM): Everybody gets what they want, including things of questionable utility like CMYK + alpha and 5-6-5 or 10-10-10 RGB encoding. The problem is that format readers and writers have to decide what to support or use somebody else’s (possibly license-encumbered) code to read a simple file. The imperfect solution to this, which is what DICOM, NITF, and TIFF have done, is to create a mechanism for specifying compliance level.
  2. The “Infinite Configurability Problem” (a.k.a. JPEG-2000): As a file I/O developer, you either have to take everything that comes your way or you have to pick what you think is important and hope that you pick the right set of features. HD Photo supports settings for chroma subsampling, overlap processing, frequency or spatial decoding priority, alpha interleaved by pixel or as a separate component, spatial transformations, gamut-management as part of the format, etc. These all have their place, but it might be a bit much. And JPEG-2000 has shown us that when you make a format too smart, it takes a long time to get adopted.
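To make the 5-6-5 case concrete, here is a minimal Python sketch of what a reader has to do for just one of those pixel encodings. It assumes the common packing with red in the top five bits, green in the middle six, and blue in the bottom five; I haven't checked that this is the bit order HD Photo uses, and the function name `unpack_rgb565` is mine, not anything from a real library.

```python
def unpack_rgb565(pixel):
    """Expand one 16-bit 5-6-5 pixel into an (r, g, b) tuple of 8-bit values.

    Assumed layout (not confirmed for HD Photo): RRRRRGGG GGGBBBBB.
    """
    if not 0 <= pixel <= 0xFFFF:
        raise ValueError("expected a 16-bit value")

    # Pull the three fields out of the packed word.
    r5 = (pixel >> 11) & 0x1F
    g6 = (pixel >> 5) & 0x3F
    b5 = pixel & 0x1F

    # Replicate the high bits into the low bits so that full scale
    # (31 or 63) maps to 255 instead of 248 or 252.
    r8 = (r5 << 3) | (r5 >> 2)
    g8 = (g6 << 2) | (g6 >> 4)
    b8 = (b5 << 3) | (b5 >> 2)
    return (r8, g8, b8)
```

That's one small function for one encoding. Multiply it by every pixel format, bit depth, and channel arrangement the format allows and you can see where the kitchen sink fills up.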

This brings me ’round to a thought I’ve had a lot recently. Image file formats have become platforms for working with data rather than containers for communicating images. The important thing is to get the data down on disk as quickly as possible and then change the interpretation of the pixels later. It looks like HD Photo might be self-contained, but some formats (like DNG or various HDR formats, including HD Photo) don’t include all of the information about how the image should look. If you add Adobe’s special sauce while reading the image in the DNG file, you’ll get a beautifully rendered image. If you use another vendor’s tools or provide other raw conversion settings, you will likely get an image that looks rather different. Similarly, there is no one correct rendering of an HDR image, and I don’t know of any formats that specify a preferred way of tone mapping.
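A tiny sketch of the "no one correct rendering" point, in Python. It pushes the same made-up HDR luminance values through two tone-mapping choices: a hard clip to [0, 1] and the simple global Reinhard curve L / (1 + L). The sample values, the 2.2 display gamma, and the function names are all assumptions for illustration; no file format I know of prescribes either operator.

```python
def tonemap_clip(luminance):
    """Throw away everything above 1.0 (and below 0.0)."""
    return min(max(luminance, 0.0), 1.0)


def tonemap_reinhard(luminance):
    """Global Reinhard curve: compresses highlights instead of clipping."""
    luminance = max(luminance, 0.0)
    return luminance / (1.0 + luminance)


def to_8bit(value, gamma=2.2):
    """Gamma-encode a [0, 1] value and scale it to 0..255."""
    return round(255 * value ** (1.0 / gamma))


# Hypothetical linear scene luminances, some well above "white".
hdr_samples = [0.05, 0.5, 1.0, 4.0, 16.0]

clipped = [to_8bit(tonemap_clip(v)) for v in hdr_samples]
reinhard = [to_8bit(tonemap_reinhard(v)) for v in hdr_samples]

print("clip:    ", clipped)
print("reinhard:", reinhard)
```

The clipped version maps both of the brightest samples to pure white, while the Reinhard version keeps them distinct but renders the midtones darker. Same pixels on disk, two different pictures, and neither one is wrong.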

It’s a brave new world of image file formats. We file format developers live in interesting times.