File formats come, and file formats go. Strike that last part. File formats never really go away. People just stop storing data in them, and vendors stop supporting the formats in their products. Eventually the data is just a bunch of bits that nobody really cares about. (At least that’s how I feel about most of the papers that I wrote in college.)
While formats never really retire*, there’s a steady stream of rookies. Sometimes a format totally destroys the competition: PDF, JPEG, GIF, etc. (Being first helps, as does being in the right place at the right time.) Other times a new file format arises because an existing family of widely used formats has a real deficiency for one community. Those formats (DNG and JPEG 2000, for example) have rather more difficulty overcoming the inertia of most data users’ workflows, despite their superior qualities.
For example, DNG never really took off the way I had hoped. My Nikon D300’s RAW file is still NEF. As are all Nikon RAW files. And I’m not convinced that there are enough applications that support DNG in my workflow (beyond the obvious Adobe applications) for me to consider converting my .nef files to DNG on import. It’s a funny chicken-and-egg problem.
Add to this menagerie two new video file formats.
I don’t have a lot of video experience. Still photography was always more accessible and interesting to me, though I have to confess that I’ve been greatly enjoying editing the video from our trip to Australia. iMovie is surprisingly good at what it does, and the video coming out of my point-and-shoot camera is acceptable for reminiscing. I still like the story that a still photograph can tell, but video fits that niche that I always used to fill with babbling during my slide shows.
Anyway, I digress.
I don’t have a lot of video file format experience. Undoubtedly it’s more complicated than I know, but the sense I got was that there are a few widely used file formats (AVI, MPEG, QuickTime) with a variety of audio and video compression codecs, chroma subsampling settings, and bit depths thrown in to complicate what would otherwise be a very simple landscape.
Enter the consumer HD video revolution — partly thanks to a new generation of dSLR cameras — and it seems like we’re on the cusp of another explosion of proprietary file formats. Add in the demands of professional workflows, and you get two new file formats.
Storing, retrieving, and manipulating the RAW pixels in a video frame only goes so far. Eventually those frames are edited, cut and combined with audio tracks. Those frames and audio are mixed with other assets, such as subtitles, alternate audio tracks, time codes, and other metadata. Finally all of these assets are combined with a desired output intent to create a digital or film copy for cinema projection, a television broadcast, a DVD, streaming video, etc.
The Entertainment Technology Center at the University of Southern California (ETC) has worked with industry players to develop an Interoperable Master Format (IMF) that encapsulates audio, video, and effects assets together with metadata and output profiles into a single package. Basically, IMF is the file-level portion of a digital asset management (DAM) solution.
The details of this encapsulating master format are quite numerous, but the following might be of interest to people who need to contemplate support for reading and writing the imagery portions of IMF. The format is evolving, but as of version 0.82a these were true.
- IMF is pretty permissive with respect to image dimensions, audio sampling frequencies, bit depths, and so on. There are a lot of “shoulds” in the spec.
- “Essence files” contain the video and audio assets.
- Essence files must use ISO or SMPTE standard formats. That’s good news. I hate the reinvention of the wheel.
- Frame rates must be constant.
- There are some required standard and nonstandard resolutions and frame rates.
- Non-1:1 pixel aspect ratios are OK.
- 8- and 10-bit samples must be supported, and I/O drivers should support 12- and 16-bit imagery, too.
- 4:4:4 and 4:2:2 chroma sampling are allowed.
- RGB-709, YCbCr-709, YCbCr-601, and CIE XYZ are supported color spaces.
- 3-D/stereoscopic imagery must be supported.
- Compression is recommended, especially visually/perceptually lossless methods (but not necessarily mathematically reversible).
- Compression must be industry standard and open. In fact, it probably should look a lot like JPEG 2000.
- Uncompressed data will look a lot like DPX or SMPTE 384M.
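
If you’re contemplating reader or writer support, a few of the bullets above translate naturally into a sanity check. What follows is only a rough sketch of that idea in Python, not anything taken from the spec. The `ImageEssence` class, its field names, and `check_essence` are all hypothetical names of my own, and the allowed values are simply copied from the list above as I read draft 0.82a. Treating any other bit depth as a problem is my assumption; the draft only says which depths must or should be supported.

```python
from dataclasses import dataclass
from typing import List

# Values copied from the bullet list above (my reading of draft 0.82a).
REQUIRED_BIT_DEPTHS = {8, 10}      # "must be supported"
OPTIONAL_BIT_DEPTHS = {12, 16}     # "should" be supported by I/O drivers
ALLOWED_CHROMA = {"4:4:4", "4:2:2"}
ALLOWED_COLOR_SPACES = {"RGB-709", "YCbCr-709", "YCbCr-601", "CIE XYZ"}


@dataclass
class ImageEssence:
    """Hypothetical descriptor; these field names are not from the IMF spec."""
    width: int
    height: int
    bit_depth: int
    chroma: str
    color_space: str
    constant_frame_rate: bool = True


def check_essence(e: ImageEssence) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # The draft is permissive about dimensions, so only reject nonsense sizes.
    if e.width <= 0 or e.height <= 0:
        problems.append("image dimensions must be positive")
    # Assumption: anything outside the listed depths is flagged.
    if e.bit_depth not in REQUIRED_BIT_DEPTHS | OPTIONAL_BIT_DEPTHS:
        problems.append("unlisted bit depth: %d" % e.bit_depth)
    if e.chroma not in ALLOWED_CHROMA:
        problems.append("chroma sampling not allowed: %s" % e.chroma)
    if e.color_space not in ALLOWED_COLOR_SPACES:
        problems.append("unsupported color space: %s" % e.color_space)
    if not e.constant_frame_rate:
        problems.append("frame rate must be constant")
    return problems


if __name__ == "__main__":
    hd = ImageEssence(1920, 1080, 10, "4:2:2", "YCbCr-709")
    print(check_essence(hd))
```

A real implementation would also need the required resolutions and frame rates, which I haven’t reproduced here.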
Once again, this is just the tip of the iceberg; the rest of the details are in the draft document. If you like these, or don’t agree with them, or have other suggestions (such as specifying a particular set of options and metadata settings as a “baseline”), do download the spec yourself and comment.
* — For an example of a moribund format, consider PICT from Apple.