This dispatch is a bit of a valediction for me. Since early 2000, I’ve been one of the software engineers on the Image and Scientific Data Formats team at The MathWorks. I’ve learned a lot about an area of technical computing that rarely gets the limelight, which is too bad since file format support is the sine qua non for modern computing. As with any real-world discipline, communication and sharing data are the bases of getting anything done. Along the way I’ve also gained a lot of skills creating code, designing systems, and managing projects. And I’ve worked with some wonderful people. It’s been a really great experience, but an offer that was too good to refuse came along.
So now I’m a Senior Software Engineer in the Image Processing Group at The MathWorks. I still work in the same group with the same great people; only the projects have changed. Instead of programming file format interfaces, I’ll be working on software architecture and optimization. It’s definitely a growth opportunity for me, and I get to keep using a lot of the skills that I’ve gained over the last eight years.
But when you change jobs without changing offices, sometimes there’s a bit of overlap. And my interests haven’t changed radically; I just have more. So perhaps it’s not surprising that I’m still writing about JPEG here. Anyway . . . on with the show.
The most popular page on this web site covers the JPEG landscape. Believe it . . . or not. I’m not complaining. I just find it amusing that on a web site that touches on travel, photography, and (sometimes) software engineering, the most popular pages are about either the technical aspects of JPEG file formats or high dynamic range imaging. I guess that’s the price I pay for writing about the dozens of things that interest me.
Well, the JPEG family article has been gathering some really good comments. Most recently “pixpush” commented on a proprietary extension to classic JPEG that supports high dynamic range (HDR) and wide gamut imagery. And then he/she mused that JPEG would be even better if it could somehow support RAW data.
It would be great, but it’s never going to happen.*
Classic JPEG — the original JPEG that makes up all of our images — is what we might call “venerable.” There’s nothing really wrong with it. In fact, it’s very, very capable. But it’s an old dog with only so many tricks left in it. Unfortunately, the following things needed for RAW support are not part of its bag of tricks**:
- Lossless compression, which you absolutely need for so-called RAW imagery
- More than 8 bits per color component, since most cameras’ A2D converters use 10+ bits
- Wider gamuts than sRGB, which would require some combination of the following: converting to and from something other than YCbCr, using signed data, or somehow specifying the colorspace
It’s possible to put classic JPEG through its paces to do this, probably using the ill-supported lossless codec and extensions in new JPEG markers. (JPEG is a stream-oriented format — unlike TIFF — so you have to parse the stream for “markers” to find where new parts begin, making it hard to jump to “interesting” parts of the file.) But once you start making classic JPEG jump through those flaming hoops, you might as well go with one of the newer versions.
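To make the "parse the stream for markers" point concrete, here is a minimal Python sketch of walking a classic JPEG byte stream. The function name `list_markers` and the tiny synthetic stream are my own illustrations, not part of any library, and the sketch assumes a well-formed baseline-style stream: it does no error recovery and doesn't interpret any segment contents.

```python
import struct

# Markers that stand alone, with no two-byte length field after them:
# TEM (0x01), RST0-RST7 (0xD0-0xD7), SOI (0xD8), EOI (0xD9).
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))


def list_markers(data: bytes):
    """Return (offset, marker_code) pairs found by walking a JPEG stream."""
    markers = []
    pos = 0
    n = len(data)
    while pos + 1 < n:
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        code = data[pos + 1]
        if code == 0xFF:          # fill byte before a marker; skip it
            pos += 1
            continue
        markers.append((pos, code))
        pos += 2
        if code == 0xD9:          # EOI: end of image
            break
        if code in STANDALONE:
            continue
        # Every other segment starts with a big-endian length that
        # counts the two length bytes themselves.
        (length,) = struct.unpack_from(">H", data, pos)
        pos += length
        if code == 0xDA:          # SOS: entropy-coded data follows
            # No length is given for the compressed data, so scan byte by
            # byte, ignoring stuffed 0xFF00 pairs and restart markers.
            while pos + 1 < n:
                nxt = data[pos + 1]
                if data[pos] == 0xFF and nxt != 0x00 and not (0xD0 <= nxt <= 0xD7):
                    break
                pos += 1
    return markers


# A made-up stream: SOI, a 2-byte APP0 payload, SOS, some fake
# entropy-coded bytes (with a stuffed 0xFF00 and an RST0), then EOI.
sample = (b"\xFF\xD8" + b"\xFF\xE0\x00\x04AB" + b"\xFF\xDA\x00\x02"
          + b"\x12\xFF\x00\x34\xFF\xD0\x56" + b"\xFF\xD9")
for offset, code in list_markers(sample):
    print(f"offset {offset:3d}: marker 0xFF{code:02X}")
```

Notice that the only way to get past the compressed data is to read every byte of it. That is the price of a stream-oriented format.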
It’s unlikely that JPEG will “die out” in my lifetime. As long as there is data in a format it’s never really dead. (Unless, of course, no one knows what it means or the media dies.) But what format would I choose to replace it?
First, I’ll answer the question of what formats I like:
- TIFF. As long as there are file systems that look like the ones we have today (files as sequential collections of bytes), the almost infinite extensibility of the Tagged Image File Format will be useful. You can put almost any kind of metadata into it now, and it’s user extensible (more or less). It supports a limitless number of samples per pixel, any bit depth you’d like, many colorspaces, ICC profiles, and a flotilla of compression modes. It’s also the basis of some very capable formats such as DNG, and its data layout is used in EXIF, HD Photo/JPEG-XR, and other formats. TIFF and cockroaches will inherit the earth.
- DNG. Okay, so it’s more of a TIFF-based platform for describing RAW imagery than a traditional file format. You need to know how to interpret the format contents in order to get a viewable image, perhaps using a program like Adobe Camera RAW. Consequently, it’s possible for two applications to render the image quite differently. This is a very un-JPEG-like idea, but it brings back the flexibility and creativity of real-world negatives.
- JPEG-XR/HD Photo. I’ve written about this format before. It’s the heir-apparent to classic JPEG. And that’s not just because it’s from Microsoft.
- DICOM. Okay, okay. It has a lot of flaws. I mean, it can change byte order (endianness) in the same file . . . more than once. That’s messed up. To truly understand why it’s a good format, you’d have to be a trained professional, like me or federation president Barry Fife.
- HDF5. If you absolutely must store gigabytes of data using datatypes that you define, arbitrary metadata, and multiple datasets organized in a hierarchical file structure, this is your format. Of course, you’ll need to use an API to access your data, but you’re payin’ the cost to be the boss.
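For a feel of what "tagged" means in the TIFF entry above, here is a minimal Python sketch that reads the header and first directory of a classic TIFF. The function name `read_first_ifd` is hypothetical, and the sketch makes simplifying assumptions: it handles only classic TIFF (not BigTIFF), looks only at the first image file directory, and decodes only single SHORT and LONG values stored inline in the entry.

```python
import struct


def read_first_ifd(data: bytes):
    """Return (endian, {tag: value}) for the first directory of a classic TIFF.

    Values that are not a single inline SHORT or LONG come back as None;
    following their offsets is left out of this sketch.
    """
    order = data[:2]
    if order == b"II":        # "Intel" order: little-endian
        endian = "<"
    elif order == b"MM":      # "Motorola" order: big-endian
        endian = ">"
    else:
        raise ValueError("missing TIFF byte-order mark")

    # Bytes 2-3 hold the magic number 42; bytes 4-7 hold the offset of
    # the first image file directory (IFD).
    magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    tags = {}
    for i in range(entry_count):
        entry = ifd_offset + 2 + 12 * i   # each entry is 12 bytes
        tag, field_type, count = struct.unpack_from(endian + "HHI", data, entry)
        if count == 1 and field_type == 3:      # SHORT, stored inline
            value = struct.unpack_from(endian + "H", data, entry + 8)[0]
        elif count == 1 and field_type == 4:    # LONG, stored inline
            value = struct.unpack_from(endian + "I", data, entry + 8)[0]
        else:
            value = None
        tags[tag] = value
    return endian, tags


# A hand-built, little-endian example with two tags:
# 256 (ImageWidth) as a SHORT and 257 (ImageLength) as a LONG.
ifd = (struct.pack("<H", 2)
       + struct.pack("<HHIHH", 256, 3, 1, 640, 0)
       + struct.pack("<HHII", 257, 4, 1, 480)
       + struct.pack("<I", 0))             # offset of next IFD: none
tiny = b"II" + struct.pack("<HI", 42, 8) + ifd
print(read_first_ifd(tiny))
```

Because the header points straight at a directory of tags, a reader can jump to the parts it cares about and ignore tags it doesn't recognize, which is a big part of why the format extends so easily.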
No one format will replace JPEG, but I fully expect that a small collection of semi-standardized formats (JPEG-XR, DNG, TIFF) is going to fill the ever-growing image space that it doesn’t support well. And we haven’t even touched on HDR. There isn’t a standard HDR format yet, and there’s a lot of work left to do. (I’m really curious to see whether the “standard” HDR image format will include a preferred tone mapping method or whether it will just be a platform for imagery like DNG.)
Let’s see where the future takes file formats and me. . . . Stay tuned.
* — Except maybe as a joke or programming assignment.
** — Can you mix dog and cat metaphors like that?