Microsoft HD Photo

For my 9-5, I wrangle file formats, making it possible for people to read images into MATLAB and then do something useful with them. I’m always on the lookout for what’s new, and there’s always something new. (I only joke about employing a legion of developers in Bangalore to develop new formats so that I can stay employed.) Lately I’ve been looking at Microsoft’s HD Photo format (a.k.a. the Windows Media Photo, or WMP, format).

Once upon a time you could count on Microsoft’s image file formats to suck. Just take BMP as an example: a more-or-less unpublished format that changed between revisions of Windows and that you were supposed to access primarily through Windows API calls. File I/O people (like me) had to keep handy whatever bits of information we could crib together in order to figure out what datatype a given piece of data should have, because the file didn’t carry that information and there was no data dictionary to look it up programmatically. They couldn’t have designed it worse if they had tried; but then, they clearly didn’t do any (good) design on the format. My professional opinion: totally sucked.

But perhaps they’ve learned the error of their ways with their new HD Photo format, which they have submitted for standardization. It appears that they’ve talked to a lot of digital image users about their needs and given a lot of thought to the format. I’m about halfway through the WinHEC 2007: HD Photo Implementation Guidelines presentation, which certainly says all the right things.

My only worry is that they’re saying so many of the right things that they’re possibly heading toward a couple of bad outcomes. Let’s call these the unintended consequences of designing a “good” format:

  1. The “Kitchen Sink Problem” (a.k.a. DICOM) — Everybody gets what they want, including things of questionable utility like CMYK + alpha and 5-6-5 or 10-10-10 RGB encoding. The problem is that format readers and writers have to decide what subset to support, or else pull in somebody else’s (possibly license-encumbered) code just to read a simple file. The imperfect solution — the one that DICOM, NITF, and TIFF have adopted — is to create a mechanism for specifying compliance levels.
  2. The “Infinite Configurability Problem” (a.k.a. JPEG-2000) — As a file I/O developer, you either have to handle everything that comes your way, or you have to pick what you think is important and hope that you picked the right set of features. HD Photo supports settings for chroma subsampling, overlap processing, frequency or spatial decoding priority, alpha interleaved by pixel or stored as a separate component, spatial transformations, gamut management as part of the format, etc. These all have their place, but it might be a bit much. And JPEG-2000 has shown us that when you make a format too smart, it takes a long time to get adopted.
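
Either way, every reader ends up drawing a support line somewhere. Here’s a rough sketch of what that decision looks like in practice; the function, the header dictionary, and the pixel-format names are all invented for illustration and aren’t taken from the actual HD Photo spec.

```python
# Hypothetical reader logic: which of a format's many optional pixel layouts
# do we actually handle? (All names here are invented for illustration;
# this is not the real HD Photo specification.)

SUPPORTED_PIXEL_FORMATS = {
    "RGB24", "RGB48", "GRAY8", "GRAY16",     # the common cases this reader handles
    # "CMYK_ALPHA", "RGB565", "RGB101010",   # legal per the spec, but worth the effort?
}

def check_pixel_format(header: dict) -> str:
    """Return the pixel format if this reader can decode it; otherwise fail loudly."""
    fmt = header["pixel_format"]
    if fmt not in SUPPORTED_PIXEL_FORMATS:
        # Every reader draws this line somewhere. A compliance-level mechanism
        # (a la DICOM, NITF, and TIFF) at least lets you document which line you drew.
        raise NotImplementedError(f"pixel format {fmt!r} is not supported by this reader")
    return fmt

# A file advertising one of the exotic layouts gets rejected up front:
check_pixel_format({"pixel_format": "RGB48"})        # fine
# check_pixel_format({"pixel_format": "RGB101010"})  # raises NotImplementedError
```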

This brings me ’round to a thought I’ve had a lot recently. Image file formats have become platforms for working with data rather than containers for communicating images. The important thing is to get the data down on disk as quickly as possible and then change the interpretation of the pixels later. HD Photo looks like it might be mostly self-contained, but some formats (DNG, various HDR formats, and to some extent HD Photo itself) don’t include all of the information about how the image should look. If you add Adobe’s special sauce while reading a DNG file, you’ll get a beautifully rendered image; if you use another vendor’s tools or different raw-conversion settings, you’ll likely get an image that looks rather different. Similarly, there is no one correct rendering of an HDR image, and I don’t know of any format that specifies a preferred way of doing the tone mapping.
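
To make that last point concrete, here’s a minimal sketch of one common global tone-mapping operator (Reinhard’s), assuming a linear floating-point RGB array already in memory. Nothing about it is dictated by any file format, and changing the single key parameter gives you a different, but equally defensible, rendering.

```python
import numpy as np

def reinhard_tonemap(hdr: np.ndarray, key: float = 0.18) -> np.ndarray:
    """Map a linear, floating-point HDR RGB image into [0, 1].

    One of many defensible renderings; the 'key' parameter alone changes
    the overall brightness of the result.
    """
    # Relative luminance (Rec. 709 weights).
    lum = 0.2126 * hdr[..., 0] + 0.7152 * hdr[..., 1] + 0.0722 * hdr[..., 2]
    log_avg = np.exp(np.mean(np.log(lum + 1e-6)))    # log-average scene luminance
    scaled = (key / log_avg) * lum                   # scale the scene to the chosen key
    mapped = scaled / (1.0 + scaled)                 # compress highlights toward 1.0
    ratio = mapped / np.maximum(lum, 1e-6)           # per-pixel luminance scaling factor
    return np.clip(hdr * ratio[..., np.newaxis], 0.0, 1.0)

# Two renderings of the same pixels, both "correct":
hdr = np.random.rand(4, 4, 3).astype(np.float64) * 100.0   # stand-in HDR data
darker = reinhard_tonemap(hdr, key=0.09)
brighter = reinhard_tonemap(hdr, key=0.36)
```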

It’s a brave new world of image file formats. We file format developers live in interesting times.


5 Responses to Microsoft HD Photo

  1. Peter Murray says:

    Just for clarification, when you speak of JPEG2000 in your posting, do you mean JPEG2000 as it was known before Microsoft HD Photo came along? Or do you mean JPEG-XR, the Microsoft-contributed specification that was recently moved to ballot status as a committee draft?

  2. Jeff Mather says:

    Hi Peter,

    I mean the real JPEG-2000 standard (ISO/IEC 15444-1 and 15444-2) — the one that preceded HD Photo. JPEG-2000 is a very good compression standard and format, but it has the most complicated interaction between compression and image storage that I have ever seen.

    I have only recently started looking at JPEG-XR, but some JPEG folks don’t think much of it. I have no opinion yet.

  3. Peter Murray says:

    Thanks for the clarification, Jeff. I’m particularly interested in JPEG2000 as a preservation format for cultural heritage materials. Its nature as an open standard, its lossless compression, and its infinitely flexible metadata boxes make it ideally suited as a replacement for my community’s current TIFF practice. As such, I have some concerns about some aspects of Microsoft HD Photo.

  4. Jeff Mather says:

    I have some experience implementing JPEG-2000 and think that it would make a good archival format. The pros of using JPEG-2000:

    • All of the things you noted: open standards, lossless compression, infinitely flexible metadata boxes.
    • It’s not patent-bound and has been implemented by several vendors on many platforms (unlike HD Photo).
    • It supports high bit-depth images (such as 16 bits/channel).
    • The lossy modes use wavelets that produce virtually no artifacts if you use the right quality settings, and the artifacts that show up at lower bit-rates look much better than classic JPEG. (Medical image archivists prefer lossy JPEG-2000 to lossy JPEG.)
    • Image storage can be optimized in a variety of ways for different disciplines. It’s possible to transcode the images between these optimizations without recompressing the data.
    • Different parts of the image can have different quality settings.
    • You can store an image at a very high quality level — producing a rather large archive copy — and then decode it at lower quality settings, which might be useful when streaming data to the web or generating previews, for example.
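
    As a concrete illustration of that last point, here is roughly what the archive-big, decode-small workflow looks like with glymur, one of the open-source Python wrappers around OpenJPEG. This is just one possible toolchain, and the file name below is made up.

```python
# Sketch of decoding a lossless JPEG-2000 master at reduced resolution.
# Requires the open-source glymur package (a wrapper around OpenJPEG);
# "archive_master.jp2" is a hypothetical file.
import glymur

jp2 = glymur.Jp2k("archive_master.jp2")

full_res = jp2[:]            # the full-quality archival pixels
web_preview = jp2[::4, ::4]  # decoded two resolution levels down, no separate derivative file needed
```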

    There are a couple of cons:

    • It’s a very complicated format, which means that you’ll need to license a commercial product to develop a new application that uses many of its features. Jasper, an open-source solution, is just too slow. I recommend Kakadu.
    • It hasn’t been widely adopted in the consumer realm. This means that you’ll likely need to provide JPEG 2000 viewers to your end users or convert the archive images to another format if you’re serving them over the web.
  5. Jeff Mather says:

    UPDATE: After reading more of that thread about the new DCT-based JPEG proposal (T.851) put forward by some members of the IJG, I’m far less inclined to lend credence to their complaints about JPEG-XR or to the prospects for a “new” JPEG that isn’t JPEG-XR.

    In particular, JPEG-XR has better file-size performance, better PSNR performance, and (probably) better computational performance; is less blocky than the DCT-based “new” JPEG; supports more bit depths; and actually exists in an implemented and tested form.
