<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jeff Mather&#039;s Dispatches &#187; File Formats</title>
	<atom:link href="http://jeffmatherphotography.com/dispatches/category/file-formats/feed/" rel="self" type="application/rss+xml" />
	<link>http://jeffmatherphotography.com/dispatches</link>
	<description>The Post-9-to-5 Life of an International Playboy</description>
	<lastBuildDate>Wed, 23 May 2012 13:23:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.1</generator>
		<item>
		<title>My Insulin Pump Hacker Name is &#8220;Glux0se&#8221;</title>
		<link>http://jeffmatherphotography.com/dispatches/2011/08/my-insulin-pump-hacker-name-is-glux0se/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2011/08/my-insulin-pump-hacker-name-is-glux0se/#comments</comments>
		<pubDate>Thu, 11 Aug 2011 00:06:35 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Data-betes]]></category>
		<category><![CDATA[Diabetes]]></category>
		<category><![CDATA[File Formats]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=2507</guid>
		<description><![CDATA[Victoria wrote an excellent piece on her site about what&#8217;s become known as &#8220;pumphackingate.&#8221; In it, she gives a brief recap of the facts and some of the reactions that have appeared on other blogs. Here&#8217;s an even briefer recap, &#8230; <a href="http://jeffmatherphotography.com/dispatches/2011/08/my-insulin-pump-hacker-name-is-glux0se/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Victoria wrote <a href="http://victoriacumbow.com/2011/08/09/1133/">an excellent piece</a> on her site about what&#8217;s become known as &#8220;pumphackingate.&#8221;  In it, she gives a brief recap of the facts and some of the reactions that have appeared on other blogs.  Here&#8217;s an even briefer recap, in case you don&#8217;t know anything about it: Some hacker/builder dude created a device that can control some insulin pumps remotely along with gathering data from them.  Based on a comment I left over on Victoria&#8217;s site, here&#8217;s my take on the issue.</p>
<p>First off, I&#8217;m not surprised.  Like any device that transmits and receives wirelessly, the signals from pumps and CGMs are interceptable.  Furthermore, like any other device that communicates with limited access control&mdash;you just need to know (or sniff out or be able to guess) the six or seven digit code that&#8217;s used to connect with another device&mdash;they&#8217;re essentially open.  From there it&#8217;s all just figuring out the protocols and the format of the data as it&#8217;s passed around.  As someone who <a href="http://jeffmatherphotography.com/dispatches/2008/04/beyond-jpeg/">spent about ten years</a> working with and occasionally reverse-engineering formats, I can tell you, it&#8217;s all just a matter of trial and error and careful observation.  (If I were a hacker, my handle would be &#8220;gluX0se.&#8221;)</p>
<p>So, in a world where relatively few people have these medical devices&mdash;unlike, say, <a href="http://thetechjournal.com/electronics/mobile/new-antenna-can-eavesdrop-into-your-private-cellphone-calls.xhtml">mobile phones</a> or <a href="http://techblog.aasisvinayak.com/using-a-bluetooth-headset-beware-of-eavesdropping/">bluetooth devices</a>&mdash;the device manufacturers essentially did the easy thing, which was to assume we use our medical devices in a trustable world where people don&#8217;t mess with medical devices.  (BTW, who knew there was a free <i><a href="http://www.ntc-eg.com/PDFs/VM-for-Dummies.pdf">Vulnerability Management for Dummies</a></i> e-book?)</p>
<p>There&#8217;s been a lot of unease in the community about the way that the information was presented to the press and the way that some outlets sensationalized it (<i>e.g.</i>, <a href="http://blogs.computerworld.com/18744/black_hat_lethal_hack_and_wireless_attack_on_insulin_pumps_to_kill_people">&#8220;Black Hat: Lethal Hack and wireless attack on insulin pumps to kill people&#8221;</a>).  It&#8217;s hard not to agree with a lot of the criticism there.  But I can&#8217;t criticize looking for security holes in medical devices.  Nor can I fault the impulse to hack into one&#8217;s own medical device&mdash;even one that keeps people alive&mdash;or to help other people hack their devices.  Not all hacking is scary villainy, but this incident certainly exposes some problems.</p>
<p>Using the AP to share this information leaves a bad taste in my mouth, but presenting the findings at the <a href="http://www.blackhat.com/">Black Hat Conference</a> seems like the most appropriate way to publicly disclose this research. (And it is, in my mind, legitimate personal security research that should be shared openly.) I would have preferred that Radcliffe work more closely with the device manufacturers leading up to the announcement. (I’m assuming that he did not.)</p>
<p>On the other hand, just presenting the findings to the device manufacturers&mdash;as some would have liked&mdash;violates the hacker ethos, both the black hat and white hat versions. Part of hacking&mdash;the part that I can get down with&mdash;is when <a href="http://bildr.org/">motivated hobbyists exploit technology</a> to solve a problem (real or imagined). I have thought many times how great it would be to sniff the unprotected data that’s transmitted by my pump/CGM and skip the middleman of uploading data to a web site. I’ve even gone so far as to seek out the information that Radcliffe presented, but it wasn’t available at the time.</p>
<p>Device manufacturers limit our access to our own medical data and tightly control the way that we can interact with our devices. It’s understandable given the limitations put on them by the FDA, their own desire to help (not harm) customers/patients, and their lawyers’ desire to limit risk exposure. It does mean, though, that the enormous potential for third-party, patient-focused tools goes untapped.  Those tools could benefit so much from being able to present data the way that their users want to see it: A dashboard light in a car, a desktop computer widget that displays CGM values, <a href="http://jeffmatherphotography.com/dispatches/2011/02/total-diabetes-awareness-the-app/">a mobile app</a> that records all of the data for later use, a device that calls parents of children with diabetes when something happens, an awesome mood ring displaying BG, etc.</p>
<p>I suspect (and once again I’m assuming here) that Radcliffe was intrigued by the rather obvious possibilities of unprotected communication, and that’s getting lost in the whole “malicious people ruining diabetics’ lives” reporting. I fear the notoriety this incident is garnering is going to scare manufacturers into closing exploitable security holes without providing a secure, replacement method for getting at all of that data. And that’s a shame.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2011/08/my-insulin-pump-hacker-name-is-glux0se/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>JPEGmini v. The Man</title>
		<link>http://jeffmatherphotography.com/dispatches/2011/01/jpegmini-v-the-man/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2011/01/jpegmini-v-the-man/#comments</comments>
		<pubDate>Mon, 24 Jan 2011 04:22:48 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[File Formats]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=1662</guid>
		<description><![CDATA[UPDATE: Be sure to read the comments. There&#8217;s no new file format. Real low on the world&#8217;s priority list is a patent-pending image compression algorithm and format that attempts to replace JPEG. (I&#8217;ve written why before, more than once.) But &#8230; <a href="http://jeffmatherphotography.com/dispatches/2011/01/jpegmini-v-the-man/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><b>UPDATE:</b> <i>Be sure to read the comments.  There&#8217;s no new file format.</i></p>
<p>Real low on the world&#8217;s priority list is a patent-pending image compression algorithm and format that attempts to replace JPEG.  (I&#8217;ve <a href="http://jeffmatherphotography.com/dispatches_wp/wp-admin/post.php?action=edit&#038;post=380">written why</a> before, <a href="http://jeffmatherphotography.com/dispatches/2008/04/beyond-jpeg/">more than once</a>.)  But just in case I&#8217;m wrong &mdash; after all, JPEG-XR hasn&#8217;t really taken off like I thought it would three years ago &mdash; here&#8217;s a link to <a href="http://jpegmini.com/">JPEGmini</a>.</p>
<p>The makers say <a href="http://twitter.com/jpegmini">via Twitter</a> that they&#8217;ll be at the same conference that I&#8217;m at right now.  I will report more when I have details.  (I sure hope these aren&#8217;t the same folks that I tried to talk out of adding a new JPEG format some years ago.  Gosh, that would be awkward.)</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2011/01/jpegmini-v-the-man/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>New Video File Formats</title>
		<link>http://jeffmatherphotography.com/dispatches/2010/08/new-video-file-formats/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2010/08/new-video-file-formats/#comments</comments>
		<pubDate>Thu, 26 Aug 2010 19:46:23 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[File Formats]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[Photography]]></category>
		<category><![CDATA[Video]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=1051</guid>
		<description><![CDATA[File formats come, and file formats go. Strike that last part. File formats never really go away. People just stop storing data in them, and vendors stop supporting the formats in their products. Eventually the data is just a bunch &#8230; <a href="http://jeffmatherphotography.com/dispatches/2010/08/new-video-file-formats/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>File formats come, and file formats go.  Strike that last part.  File formats never really go away.  People just stop storing data in them, and vendors stop supporting the formats in their products.  Eventually the data is just a bunch of bits that nobody really cares about.  (At least that&#8217;s how I feel about most of the papers that I wrote in college.)</p>
<p>While formats never really retire*, there&#8217;s a steady stream of rookies.  Sometimes a format totally destroys the competition: PDF, JPEG, GIF, etc.  (Being first helps, as does being in the right place at the right time.)  Other times a new file format results from an actual deficiency for one community in an existing family of widely-used formats.  Those formats &mdash; such as DNG, JPEG 2000, etc. &mdash; have rather more difficulty overcoming the inertia of the majority of data users&#8217; workflows despite their superior qualities.</p>
<p>For example, DNG never really took off the way I had hoped.  My Nikon D300&#8217;s RAW file is still NEF.  As are all Nikon RAW files.  And I&#8217;m not convinced that there are enough applications that support DNG in my workflow (beyond the obvious Adobe applications) for me to consider converting my .nef files to DNG on import.  It&#8217;s a funny chicken and egg problem.</p>
<p>Add to this menagerie two new video file formats.</p>
<p>I don&#8217;t have a lot of video experience.  Still photography was always more accessible and interesting to me, though I have to confess that I&#8217;ve been greatly enjoying editing the video from our trip to Australia.  iMovie is surprisingly good at what it does, and the video coming out of my point-and-shoot camera is acceptable for reminiscing.  I still like the story that a still photograph can tell, but video fits that niche that I always used to fill with babbling during my slide shows.</p>
<p>Anyway, I digress.</p>
<p>I don&#8217;t have a lot of video <i>file format</i> experience.  Undoubtedly it&#8217;s more complicated than I know, but the sense I got was that there are a few widely used file formats &mdash; AVI, MPEG, QuickTime &mdash; with a variety of audio and video compression codecs, chroma subsampling settings, and bit depths thrown in to complicate what would otherwise be a very simple landscape.</p>
<p>Enter the consumer HD video revolution &mdash; partly thanks to a new generation of dSLR cameras &mdash; and it seems like we&#8217;re on the cusp of another explosion of proprietary file formats.  Add in the demands of professional workflows, and you get two new file formats.</p>
<p>Just as it did with <a href="http://www.adobe.com/products/dng/">DNG</a> for still cameras, Adobe is proposing <a href="http://labs.adobe.com/technologies/cinemadng/">CinemaDNG</a> as an open file format for storing RAW files from digital video cameras.</p>
<p>Storing, retrieving, and manipulating the RAW pixels in a video frame only goes so far.  Eventually those frames are edited, cut and combined with audio tracks.  Those frames and audio are mixed with other assets, such as subtitles, alternate audio tracks, time codes, and other metadata.  Finally all of these assets are combined with a desired output intent to create a digital or film copy for cinema projection, a television broadcast, a DVD, streaming video, etc.</p>
<p>The <a href="http://www.etcenter.org/imf-spec/">Entertainment Technology Center</a> at the University of Southern California (ETC) has worked with industry players to develop an <a href="http://createasphere.com/En/insider-view/1991-interoperable-master-format-aims-to-take-industry-into-a-file-based-world.html">interoperable master format</a> (IMF) that encapsulates audio, video, and effects assets together with metadata and output profiles into a package.  Basically IMF is the file-level portion of a digital asset management (DAM) solution.</p>
<p>The details of this encapsulating master format are quite numerous, but the following might be of interest to people who need to contemplate support for reading and writing the imagery portions of IMF.  The format is evolving, but as of version 0.82a these were true.</p>
<ul>
<li>IMF is pretty permissive with respect to image dimensions, audio sampling frequencies, bit depths, and so on.  There are a lot of &#8220;shoulds&#8221; in the spec.</li>
<li>&#8220;Essence files&#8221; contain the video and audio assets.</li>
<li>Essence files must use ISO or SMPTE standard formats.  That&#8217;s good news.  I hate the reinvention of the wheel.</li>
<li>Frame rates must be constant.</li>
<li>There are some required standard and nonstandard resolutions and frame rates.</li>
<li>Non-1:1 pixel aspect ratios are OK.</li>
<li>8- and 10-bit samples must be supported, and I/O drivers should support 12- and 16-bit imagery, too.</li>
<li>4:4:4 and 4:2:2 chroma sampling is allowed.</li>
<li>RGB-709, YCbCr-709, YCbCr-601, and CIE XYZ are supported color spaces.</li>
<li>3-D/stereoscopic imagery must be supported.</li>
<li>Compression is recommended, especially visually/perceptually lossless methods (but not necessarily mathematically reversible).</li>
<li>Compression must be industry standard and open.  In fact, it probably should look a lot like JPEG-2000.</li>
<li>Uncompressed data will look a lot like <a href='http://www.mathworks.com/matlabcentral/fileexchange/9683'>DPX</a> or <a href="http://en.wikipedia.org/wiki/Material_Exchange_Format">SMPTE 384M</a>.</li>
</ul>
<p>Once again, this is just the tip of the iceberg; the rest of the details are in the draft document.  If you like these or don&#8217;t agree with them or if you have other suggestions &mdash; such as specifying a particular set of options and metadata settings as a &#8220;baseline&#8221; &mdash; do <a href="http://www.etcenter.org/imf-spec/">download the spec</a> yourself and comment.</p>
<p><br clear="all" />* &mdash; For an example of a moribund format, consider <a href='http://www.prepressure.com/library/file-formats/pict'>PICT</a> from  Apple.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2010/08/new-video-file-formats/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The &#8220;Blow Stuff Up&#8221; Conference</title>
		<link>http://jeffmatherphotography.com/dispatches/2010/05/the-blow-shit-up-conference/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2010/05/the-blow-shit-up-conference/#comments</comments>
		<pubDate>Wed, 19 May 2010 00:32:43 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[File Formats]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[This is who we are]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=792</guid>
		<description><![CDATA[In case you wondered what that post from earlier today was all about, perhaps a picture will help: Click for larger&#160;.&#160;.&#160;. This envelope came in the mail yesterday. I don&#8217;t know who put me onto this mailing list, but I&#8217;m &#8230; <a href="http://jeffmatherphotography.com/dispatches/2010/05/the-blow-shit-up-conference/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>In case you wondered what that post from earlier today was all about, perhaps a picture will help:</p>
<p><a href="/images/IEDconf.jpg"><img src="/images/t_IEDconf.jpg" alt="'Blow Shit Up' conference announcement" /></a><br clear="all" /><a href="">Click for larger&nbsp;.&nbsp;.&nbsp;.</a></p>
<p>This envelope came in the mail yesterday.  I don&#8217;t know who put me onto this mailing list, but I&#8217;m pretty sure it&#8217;s related to the work I&#8217;ve done over the last few years supporting the NITF file format, whose users are an interesting lot.  They don&#8217;t really like to talk about what they do or what they keep in their files: secret stuff mostly.</p>
<p>I&#8217;m not one to judge.  I&#8217;ll just say that I&#8217;m very glad that I was also responsible for adding support for the DICOM medical imaging format to MATLAB.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2010/05/the-blow-shit-up-conference/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Menagerie of Image File Formats</title>
		<link>http://jeffmatherphotography.com/dispatches/2010/05/a-menagerie-of-image-file-formats/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2010/05/a-menagerie-of-image-file-formats/#comments</comments>
		<pubDate>Tue, 18 May 2010 21:02:58 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[File Formats]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[Life Lessons]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=778</guid>
		<description><![CDATA[This is a follow-up to my recent post on parsing NITF files that contain JPEG data. It&#8217;s basically a crash course into the organization of the guts of image file formats. If I were ever asked to be an expert &#8230; <a href="http://jeffmatherphotography.com/dispatches/2010/05/a-menagerie-of-image-file-formats/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><i>This is a follow-up to my recent post on <a href="http://jeffmatherphotography.com/dispatches/2010/05/nitf-jpeg/">parsing NITF files that contain JPEG data</a>.  It&#8217;s basically a crash course into the organization of the guts of image file formats.  If I were ever asked to be an expert witness in a trial, it would probably be about file formats.*  This is the area of my expertise.</i></p>
<p>You can divide the world of image file formats into different kingdoms based on their structure.  There is some overlap between these categories, but for the most part image formats are (1) tag/record-based, (2) structure-like, (3) marker/stream-based, (4) textual, (5) card-like, (6) raw, or (7) opaque.</p>
<p>TIFF, DNG, and DICOM are examples of tag/record-based formats.  A unique tag identifies the entity in the file and its meaning.  For example, a particular hexadecimal tag might indicate that this is the &#8220;photometric interpretation&#8221; record.  The datatype of this record either explicitly appears after the tag or appears in a data dictionary that&#8217;s known to the application developer.  Almost always, these records explicitly tell the length of their data, which makes it easy to skip to the tag location of the next record.</p>
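<p>To make the &#8220;skip to the next record&#8221; idea concrete, here&#8217;s a minimal Python sketch.  The layout is made up for illustration (a 2-byte tag, a 4-byte length, then the payload, all big-endian); it is not the actual TIFF or DICOM encoding, and <tt>walk_records</tt> and <tt>find_record</tt> are just names I picked.</p>
<pre>
```python
import struct

# Hypothetical record layout (NOT the real TIFF or DICOM encoding):
# 2-byte tag, 4-byte payload length, then the payload, all big-endian.
HEADER = struct.Struct("!HI")


def walk_records(buf):
    """Yield (tag, payload) for each record in a bytes object."""
    pos = 0
    while pos != len(buf):
        # Raises struct.error if the header itself is cut short.
        tag, length = HEADER.unpack_from(buf, pos)
        start = pos + HEADER.size
        payload = buf[start:start + length]
        if len(payload) != length:
            raise ValueError("truncated record for tag 0x%04X" % tag)
        yield tag, payload
        # The explicit length is what lets us jump straight to the next tag.
        pos = start + length


def find_record(buf, wanted_tag):
    """Return the payload of the first record with the given tag, or None."""
    for tag, payload in walk_records(buf):
        if tag == wanted_tag:
            return payload
    return None
```
</pre>
<p>Because every record states its own length, a reader can hop over payloads it doesn&#8217;t understand without decoding them.</p>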
<p>Microsoft was (for a time) very fond of making structure-like formats.  In these formats, the file looks a lot like the memory representation of a C/C++ data structure.  These formats are easy to describe and easy to read if you have the structure definition; simply <tt>fread()</tt> the data into a variable and reference the data members by name.  The problems should be pretty clear.  You need to be using a programming language that supports C structs.  And you need to know the layout of the struct.  And once you define the layout of the struct, it&#8217;s fixed.  (Well, not exactly.  Microsoft changed the data layout in its BMP family of formats with every release of Windows, and used a &#8220;magic&#8221; value to tell readers which struct to use.)  All told, it&#8217;s a very brittle kind of format.</p>
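<p>Here&#8217;s roughly what that looks like in practice, sketched in Python with <tt>ctypes</tt> standing in for the C struct.  The field layout is the well-known 14-byte BMP file header; the class and function names are mine, and I&#8217;m assuming the &#8220;magic&#8221; value is the size field that leads off the info header, which is how readers usually tell the variants apart.</p>
<pre>
```python
import ctypes


class BitmapFileHeader(ctypes.LittleEndianStructure):
    """Mirror of the 14-byte BMP file header as it sits on disk."""
    _pack_ = 1  # no padding between fields, matching the file layout
    _fields_ = [
        ("magic", ctypes.c_char * 2),       # b"BM"
        ("file_size", ctypes.c_uint32),
        ("reserved1", ctypes.c_uint16),
        ("reserved2", ctypes.c_uint16),
        ("pixel_offset", ctypes.c_uint32),  # where the pixel data starts
    ]


# The info header begins with its own size, which identifies the variant.
DIB_HEADER_NAMES = {
    12: "BITMAPCOREHEADER",
    40: "BITMAPINFOHEADER",
    108: "BITMAPV4HEADER",
    124: "BITMAPV5HEADER",
}


def describe_bmp(data):
    """Return (file_size, pixel_offset, info_header_name) for BMP bytes."""
    # The moral equivalent of fread()-ing straight into a struct.
    header = BitmapFileHeader.from_buffer_copy(data)
    if header.magic != b"BM":
        raise ValueError("not a BMP file")
    offset = ctypes.sizeof(BitmapFileHeader)
    dib_size = int.from_bytes(data[offset:offset + 4], "little")
    name = DIB_HEADER_NAMES.get(dib_size, "unknown")
    return header.file_size, header.pixel_offset, name
```
</pre>
<p>Note how much the reader has to know up front: field order, field widths, byte order, and packing.  Get any one of them wrong and every later field is garbage, which is the brittleness I mean.</p>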
<p>JPEG is the prototypical &mdash; but certainly not the only &mdash; marker-based format.  Markers are special combinations of bytes that, like a tag, tell what the data is that&#8217;s coming next in the stream.  But, very much like struct-based formats and very unlike tagged formats, the data that follows the marker can be heterogeneous.  In JPEG, the data that appears after the SOF (Start of Frame) marker is a record, while the data that follows an RST<i>n</i> marker is just a stream of compressed bytes.  The SOI and EOI (Start/End of image) markers don&#8217;t even have any bytes that follow them.  In marker-based formats, semantics and syntax are rather carelessly jumbled together.</p>
<p>It&#8217;s very difficult to quickly parse marker-based formats, because often markers don&#8217;t specify how much data appears before the next marker.  These are very much &#8220;streams&#8221; of bytes that you&#8217;re forced to read until you come to the next marker.  Consequently the number and appearance of markers are very limited, and this limitation ripples through to the data that they contain.  JPEG markers all begin with the <tt>0xFF</tt> byte followed by another byte, which taken together specify which marker it is.  Consequently, any <tt>0xFF</tt> byte that appears in the compressed data has to be followed by a stuffed zero byte (<tt>0x00</tt>) so that it&#8217;s not mistaken for the start of the next marker.</p>
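<p>A rough Python sketch of what a JPEG marker walker ends up doing is below.  It&#8217;s a simplification with names of my own choosing (<tt>list_markers</tt>, <tt>skip_entropy_data</tt>): it assumes a well-formed baseline stream and skips over restart markers inside the scan data instead of reporting them.</p>
<pre>
```python
# Markers with no length and no payload: RST0-RST7, SOI, EOI, and TEM.
STANDALONE = set(range(0xD0, 0xDA)) | {0x01}
SOS = 0xDA


def skip_entropy_data(data, pos):
    """Return the offset of the next real marker after the scan data."""
    while True:
        pos = data.find(b"\xff", pos)
        if pos == -1 or pos + 1 == len(data):
            return len(data)
        following = data[pos + 1]
        # 0xFF 0x00 is a stuffed data byte; RSTn belongs to the scan.
        if following == 0x00 or following in range(0xD0, 0xD8):
            pos += 2
            continue
        return pos


def list_markers(data):
    """Return the marker codes found in a JPEG byte stream, in order."""
    markers = []
    pos = 0
    while len(data) - pos >= 2:
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        code = data[pos + 1]
        if code == 0xFF:  # fill byte in front of a marker
            pos += 1
            continue
        markers.append(code)
        pos += 2
        if code in STANDALONE:
            continue
        # Other markers carry a 2-byte big-endian length that counts itself.
        pos += int.from_bytes(data[pos:pos + 2], "big")
        if code == SOS:
            # No length for the compressed bytes: scan until the next marker.
            pos = skip_entropy_data(data, pos)
    return markers
```
</pre>
<p>The header segments can be skipped by length, but the compressed data after the start-of-scan marker has to be examined byte by byte, and that is where all the time goes.</p>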
<p>Textual formats, such as XML, have the benefit of being self-describing and readable by both humans and machines.  Their main drawbacks are the inflated size of the data they contain (even when represented in a semi-binary <tt>CDATA</tt> hunk) and the inability to quickly skip through them with binary I/O routines.</p>
<p>FITS is a fairly prototypical &#8220;card-like&#8221; format.  As the name implies, these are fixed-length records like one might have encountered on a punch card.  For example, in a format with 120-character records, the first <i>n</i> characters are reserved for the &#8220;variable name&#8221; part of the equation, while the remaining 120-<i>n</i> characters are the textual representation of the value of the record.  They are frequently text-only for the descriptive part of the format with a binary payload at the end.  These are easy to read, but a pain to parse, since the &#8220;right hand side&#8221; values often have to be interpreted.</p>
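<p>As an illustration, here&#8217;s a small Python sketch of reading FITS-style cards.  FITS itself uses 80-character cards with the keyword in the first 8 characters, so those are the defaults.  The function names are mine, and the value handling is deliberately simplistic: it would trip over a slash inside a quoted string, for instance.</p>
<pre>
```python
def interpret(raw):
    """Turn the textual right-hand side of a card into a Python value."""
    if raw.startswith("'"):
        return raw.strip("'").rstrip()  # quoted string, padding removed
    if raw in ("T", "F"):
        return raw == "T"               # logical value
    try:
        return int(raw)
    except ValueError:
        return float(raw)


def parse_cards(header, card_len=80, name_len=8):
    """Parse fixed-length 'NAME    = value / comment' cards into a dict."""
    cards = {}
    for start in range(0, len(header), card_len):
        card = header[start:start + card_len]
        name = card[:name_len].strip()
        if name == "END":
            break
        if card[name_len:name_len + 2] != "= ":
            continue  # comment, history, or blank card: no value to read
        raw = card[name_len + 2:].split("/", 1)[0].strip()
        cards[name] = interpret(raw)
    return cards
```
</pre>
<p>Finding each card is trivial because of the fixed length.  All of the real work is in <tt>interpret</tt>, which has to guess the type of every value from its text.</p>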
<p>Raw and opaque formats aren&#8217;t very easy to describe because they&#8217;re so varied.  In a &#8220;raw&#8221; format (and there are dozens or hundreds&nbsp;.&nbsp;.&nbsp;. possibly more) all of the bytes are jumbled together in a payload-only file.  A separate file may have a header that describes the data and helps a reader/parser make sense of the payload.  Or not.  These are almost always completely free of any helpful description within the file.</p>
<p>This shouldn&#8217;t be confused with opaque files, such as HDF, CDF, or netCDF.  These formats are completely defined by their API, which for all intents and purposes, you have to use to access the data within the file.  This allows for a lot of richness in handling the data contents, which can be organized in highly optimized ways.  The downside is that you&#8217;re limited in how you can interact with your data to mechanisms someone else has defined.  And data permanence can suffer, since if the tool chain changes (or goes out of existence) you don&#8217;t really have a way to get at your data.</p>
<p>Practically, each format style has its pros and cons.  But tagged formats (which might incorporate features of the record style) are the most durable and easiest for third parties to work with.</p>
<p><br clear="all" />* &mdash; Cue awesome &#8220;CSI&#8221; + &#8220;Law and Order&#8221; + &#8220;House&#8221; mashup daydream.  *DOINK DOINK*</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2010/05/a-menagerie-of-image-file-formats/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NITF + JPEG</title>
		<link>http://jeffmatherphotography.com/dispatches/2010/05/nitf-jpeg/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2010/05/nitf-jpeg/#comments</comments>
		<pubDate>Tue, 18 May 2010 21:01:26 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[File Formats]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[Life Lessons]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=737</guid>
		<description><![CDATA[I&#8217;ve recently been working with streams of JPEG data inside of NITF files. Given my experience supporting I/O involving DICOM files that contain JPEG-compressed imagery, I was extremely surprised to learn how difficult it is to read JPEG from &#8220;National &#8230; <a href="http://jeffmatherphotography.com/dispatches/2010/05/nitf-jpeg/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve recently been working with streams of JPEG data inside of NITF files.  Given my experience supporting I/O involving DICOM files that contain JPEG-compressed imagery, I was extremely surprised to learn how difficult it is to read JPEG from &#8220;National Imagery Transmission Format&#8221; files.  This post exists to help the next person who needs to read JPEG data embedded in NITF or another file format.</p>
<p>My naïve idea was to copy the JPEG-encoded data to a temporary file and then read that file using the <a href="http://www.ijg.org/">Independent JPEG Group</a>&#8217;s libjpeg library.  That&#8217;s what I did with JPEG data encapsulated in DICOM.  This is far too simple an approach for NITF, resulting in incomplete images.  Here&#8217;s why:</p>
<ul>
<li>NITF breaks most images into multiple tiles.</li>
<li>Each tile is independently compressed into its own image stream.</li>
<li>NITF uses &#8220;block masking,&#8221; which prevents storing <i>unimportant</i> tiles.</li>
</ul>
<p>The idea makes sense on one level.  If you&#8217;re going to send an image over a low-bandwidth or low-fidelity channel, you want to limit the amount of data that you send, and you want to avoid an all-or-nothing situation during image transmission or reception.  But it&#8217;s a total pain in the ass for application developers.</p>
<p>Add to this the fact that JPEG is a marker-based format that isn&#8217;t very self-describing, and you have a tricky parsing situation.*</p>
<p>Here&#8217;s the basic idea behind getting imagery out of a NITF file if it&#8217;s been JPEG compressed.  (I assume that you already know how to parse a NITF file &mdash; see <a href="http://www.gwg.nga.mil/ntb/baseline/docs/188_198a/index.html">MIL-STD-188-198A</a> if you don&#8217;t &mdash; and that you have a JPEG codec that you can use to decode the data.)</p>
<ol>
<li>The first two bytes of the compressed stream should be the standard JPEG <tt>SOI</tt> marker (0xFF 0xD8).  This is your sanity check.</li>
<li>The next two bytes should be the <tt>APP6</tt> marker (0xFF 0xE6).  The payload of this marker contains a bunch of useful information about tile sizes and counts, bit depths, etc.  Some of this is redundant with what&#8217;s inside the NITF file.</li>
<li>The remainder of the NITF file should be a bunch of JPEG codestreams delimited by <tt>SOI</tt> and <tt>EOI</tt> (0xFF 0xD9) markers.  Each delimited stream is one tile in the image, and it&#8217;s a completely standalone JPEG stream.  It can be extracted to its own file (if necessary) and decompressed.  Tiles are stored across the image horizontally and then down.</li>
<li>If there&#8217;s no block masking, it suffices to read each tile in turn and store it in the appropriate region in the output image.</li>
<li>If the NITF file does use block masking, use the values in the <tt>BMRnBNDm</tt> attribute of the image subheader to find the locations of the blocks that contain actual image data.  The masked out blocks will have 0xFFFFFFFF values.  The other values &mdash; there&#8217;s one for each tile &mdash; are 0-based offsets pointing to the <tt>SOI</tt> marker that starts each tile, relative to the start of the JPEG compressed data.</li>
</ol>
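<p>The steps above can be sketched in Python along these lines.  I&#8217;m assuming that the JPEG-compressed image data has already been pulled out of the NITF file as a byte string and that the block mask offsets have already been read into a list; the function names are made up for this post.  The search for <tt>EOI</tt> is naive and assumes that byte pair never shows up inside a marker payload.</p>
<pre>
```python
SOI = b"\xff\xd8"
EOI = b"\xff\xd9"
MASKED = 0xFFFFFFFF  # mask table value for a tile with no stored data


def split_tiles(stream):
    """No block masking: return each SOI...EOI codestream in file order."""
    tiles = []
    pos = 0
    while True:
        start = stream.find(SOI, pos)
        if start == -1:
            break
        end = stream.find(EOI, start + 2)
        if end == -1:
            raise ValueError("tile at offset %d has no EOI marker" % start)
        tiles.append(stream[start:end + 2])
        pos = end + 2
    return tiles


def tiles_from_mask(stream, offsets):
    """Block masking: one offset per tile; masked tiles come back as None."""
    tiles = []
    for offset in offsets:
        if offset == MASKED:
            tiles.append(None)
            continue
        if stream[offset:offset + 2] != SOI:  # sanity check
            raise ValueError("no SOI marker at offset %d" % offset)
        end = stream.find(EOI, offset + 2)
        if end == -1:
            raise ValueError("tile at offset %d has no EOI marker" % offset)
        tiles.append(stream[offset:end + 2])
    return tiles


def tile_origin(index, tiles_per_row, tile_width, tile_height):
    """Top-left (row, column) of a tile; tiles run across, then down."""
    tile_row, tile_col = divmod(index, tiles_per_row)
    return tile_row * tile_height, tile_col * tile_width
```
</pre>
<p>Each entry that isn&#8217;t <tt>None</tt> is a standalone JPEG stream that can be handed to whatever decoder you have, and <tt>tile_origin</tt> tells you where the decoded pixels belong in the output image.</p>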
<p>And that&#8217;s it.  After coding, you should probably test out your parser on <a href="http://www.gwg.nga.mil/ntb/baseline/software/testfile/Nitfv2_1/scen_2_1.html">the sample NITF files</a> provided by the <a href="http://en.wikipedia.org/wiki/National_Geospatial-Intelligence_Agency" title="National Geospatial-Intelligence Agency">NGA</a>.</p>
<p><br clear="all" />* &#8211; You can divide the world of image file formats into different buckets based on their structure.  There is some overlap between these categories, but for the most part image formats are (1) tag/record-based, (2) structure-like, (3) marker/stream-based, (4) textual, (5) card-like, (6) raw, or (7) opaque.  I&#8217;m going to <a href='http://jeffmatherphotography.com/dispatches/2010/05/a-menagerie-of-image-file-formats/'>write more about this in the next post</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2010/05/nitf-jpeg/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Security Vulnerability in CDF, plus a MATLAB Fix</title>
		<link>http://jeffmatherphotography.com/dispatches/2008/05/security-vulnerability-in-cdf-plus-a-matlab-fix/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2008/05/security-vulnerability-in-cdf-plus-a-matlab-fix/#comments</comments>
		<pubDate>Tue, 06 May 2008 13:04:24 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[File Formats]]></category>
		<category><![CDATA[MATLAB]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/2008/05/security-vulnerability-in-cdf-plus-a-matlab-fix/</guid>
		<description><![CDATA[The CDF folks at Goddard Space Flight Center have identified a security vulnerability &#8212; a buffer overflow to be specific &#8212; that can enable the execution of arbitrary code on your machine if you open a particular malformed file. If &#8230; <a href="http://jeffmatherphotography.com/dispatches/2008/05/security-vulnerability-in-cdf-plus-a-matlab-fix/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The CDF folks at Goddard Space Flight Center have identified a security vulnerability &mdash; a buffer overflow to be specific &mdash; that can enable the execution of arbitrary code on your machine if you open a particular malformed file.  If you&#8217;re accessing CDF files via MATLAB, you can <a href="http://cdf.gsfc.nasa.gov/html/matlab_cdf_patch.html">download a security patch</a> from NASA GSFC.</p>
<p>Thank you.  That is all.</p>
<p><b>UPDATE:</b> You can also <a href="http://www.mathworks.com/support/bugreports/details.html?rp=463427">download an update directly from The MathWorks</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2008/05/security-vulnerability-in-cdf-plus-a-matlab-fix/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Beyond JPEG</title>
		<link>http://jeffmatherphotography.com/dispatches/2008/04/beyond-jpeg/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2008/04/beyond-jpeg/#comments</comments>
		<pubDate>Wed, 16 Apr 2008 02:21:20 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[File Formats]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/2008/04/beyond-jpeg/</guid>
		<description><![CDATA[This dispatch is a bit of a valediction for me. Since early 2000, I&#8217;ve been one of the software engineers on the Image and Scientific Data Formats team at The MathWorks. I&#8217;ve learned a lot about an area of technical &#8230; <a href="http://jeffmatherphotography.com/dispatches/2008/04/beyond-jpeg/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>This dispatch is a bit of a valediction for me.  Since early 2000, I&#8217;ve been one of the software engineers on the Image and Scientific Data Formats team at The MathWorks.  I&#8217;ve learned a lot about an area of technical computing that rarely gets the limelight, which is too bad since file format support is the <i>sine qua non</i> for modern computing. As with any real-world discipline, communication and sharing data are the bases of getting anything done.  Along the way I&#8217;ve also gained a lot of skills creating code, designing systems, and managing projects.  And I&#8217;ve worked with some wonderful people.  It&#8217;s been a really great experience, but an offer that was too good to refuse came along.</p>
<p>So now I&#8217;m a Senior Software Engineer in the Image Processing Group at The MathWorks.  I still work in the same group with the same great people; only the projects have changed.  Instead of programming file format interfaces, I&#8217;ll be working on software architecture and optimization.  It&#8217;s definitely a growth opportunity for me, and I get to keep using a lot of the skills that I&#8217;ve gained over the last eight years.</p>
<p>But when you change jobs without changing offices, sometimes there&#8217;s a bit of overlap.  And my interests haven&#8217;t changed radically; I just have more.  So perhaps it&#8217;s not surprising that I&#8217;m still writing about JPEG here.  Anyway .&nbsp;.&nbsp;. on with the show.</p>
<p></p>
<p>The most popular page on this web site covers the <a href="http://jeffmatherphotography.com/dispatches/2008/01/the-jpeg-family-circus/">JPEG landscape</a>.  Believe it .&nbsp;.&nbsp;. or not.  I&#8217;m not complaining.  I just find it amusing that on a web site that touches on travel, photography, and (sometimes) software engineering, the most popular pages are about either the technical aspects of JPEG file formats or high dynamic range imaging.  I guess that&#8217;s the price I pay for writing about the dozens of things that interest me.</p>
<p>Well, the JPEG family article has been gathering some <a href="http://jeffmatherphotography.com/dispatches/2008/01/the-jpeg-family-circus/#comments">really good comments</a>.  Most recently &#8220;pixpush&#8221; commented on a proprietary extension to classic JPEG that supports high dynamic range (HDR) and wide gamut imagery.  And then he/she mused that JPEG would be even better if it could somehow support RAW data.</p>
<p>It <i>would</i> be great, but it&#8217;s never going to happen.*</p>
<p>Classic JPEG &mdash; the original JPEG that makes up all of our images &mdash; is what we might call &#8220;venerable.&#8221;  There&#8217;s nothing really wrong with it.  In fact, it&#8217;s very, very capable.  But it&#8217;s an old dog with only so many tricks left in it.  Unfortunately, the following things needed for RAW support are not part of its bag of tricks**:</p>
<ul>
<li>Lossless compression, which you absolutely need for so-called RAW imagery</li>
<li>More than 8 bits per color component, since most cameras&#8217; A2D converters use 10+ bits</li>
<li>Wider gamuts than sRGB, which would require some combination of the following: converting to and from something other than YCbCr, using signed data, or somehow specifying the colorspace</li>
</ul>
<p>It&#8217;s possible to put classic JPEG through its paces to do this, probably using the ill-supported lossless codec and extensions in new JPEG markers.  (JPEG is a stream-oriented format &mdash; unlike TIFF &mdash; so you have to parse the stream for &#8220;markers&#8221; to find where new parts begin, making it hard to jump to &#8220;interesting&#8221; parts of the file.)  But once you start making classic JPEG jump through those flaming hoops, you might as well go with one of the newer versions.</p>
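<p>To show what &#8220;parsing the stream for markers&#8221; involves, here is a minimal Python sketch that walks the segments at the front of a classic JPEG stream.  The function name <tt>list_markers</tt> is made up for illustration.  It assumes a well-formed stream with no fill bytes between segments, and it stops at the start-of-scan marker because the entropy-coded data that follows has no length field to skip over.</p>

```python
def list_markers(data):
    """Return (marker, offset) pairs for the segments up to start-of-scan."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    markers = [(0xFFD8, 0)]
    pos = 2
    while len(data) >= pos + 4:
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        code = data[pos + 1]
        markers.append((0xFF00 | code, pos))
        if code == 0xDA:
            break  # SOS: compressed scan data follows, so stop here
        # The 2-byte big-endian length counts itself but not the marker.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        pos += 2 + length
    return markers
```

<p>Notice that there is no index to consult: to find the third segment you have to read the lengths of the first two.  A tag-based format like TIFF stores offsets in a directory instead, which is why jumping around in it is cheaper.</p>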
<p>It&#8217;s unlikely that JPEG will &#8220;die out&#8221; in my lifetime.  As long as there is data in a format it&#8217;s never really dead.  (Unless, of course, no one knows what it means or <a href="http://www.jetset.nl/lostformats/01.html">the media dies</a>.)  But what format would I choose to replace it?</p>
<p>First, I&#8217;ll answer the question of what formats I like:</p>
<ul>
<li><b>TIFF</b>. As long as there are file systems that look like the ones we have today &mdash; files as sequential collections of bytes &mdash; the almost infinite extensibility of the Tagged Image File Format will be useful.  You can put almost any kind of metadata into it now, and it&#8217;s user extensible (more or less).  It supports a limitless number of samples per pixel, any bit depth you&#8217;d like, many colorspaces, ICC profiles, and a flotilla of compression modes.  It&#8217;s also the basis of some very capable formats such as DNG, and its data layout is used in EXIF, HD Photo/JPEG-XR, and other formats.  TIFF and cockroaches will inherit the earth.</li>
<li><b>DNG</b>.  Okay, so it&#8217;s more of a TIFF-based platform for describing RAW imagery than a traditional file format.  You need to know how to interpret the format contents in order to get a viewable image, perhaps using a program like Adobe Camera RAW.  Consequently, it&#8217;s possible for two applications to render the image quite differently.  This is a very un-JPEG-like idea, but it brings back the flexibility and creativity of real-world negatives.</li>
<li><b>JPEG-XR/HD Photo</b>.  I&#8217;ve <a href="http://jeffmatherphotography.com/dispatches/2008/01/microsoft-hd-photo/">written about this format before</a>.  It&#8217;s the heir-apparent to classic JPEG.  And that&#8217;s not just because it&#8217;s from Microsoft.</li>
<li><b>DICOM</b>.  Okay, okay.  It has a lot of flaws.  I mean, it can change byte order (endianness) in the same file . . . more than once.  That&#8217;s messed up.  To truly understand why it&#8217;s a good format, you&#8217;d have to be a trained professional, like me <strike><a href="http://www.imdb.com/title/tt0105488/quotes">or federation president Barry Fife</a></strike>.</li>
<li><b>HDF5</b>. If you absolutely must store gigabytes of data using datatypes that you define, arbitrary metadata, and multiple datasets organized in a hierarchical file structure, this is your format.  Of course, you&#8217;ll need to use an API to access your data, but you&#8217;re payin&#8217; the cost to be the boss.</li>
</ul>
<p>No one format will replace JPEG, but I fully expect that a small collection of semi-standardized formats (JPEG-XR, DNG, TIFF) is going to fill the ever-growing image space that it doesn&#8217;t support well.  And we haven&#8217;t even touched on HDR.  There isn&#8217;t a standard HDR format yet, and there&#8217;s a lot of work left to do.  (I&#8217;m really curious to see whether the &#8220;standard&#8221; HDR image format will include a preferred tone mapping method or whether it will just be a platform for imagery like DNG.)</p>
<p>Let&#8217;s see where the future takes file formats and me.&nbsp;.&nbsp;.&nbsp;. Stay tuned.</p>
<p></p>
<p>* &mdash; Except maybe as a joke or programming assignment.</p>
<p>** &mdash; Can you mix dog and cat metaphors like that?</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2008/04/beyond-jpeg/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>The JPEG Family Circus</title>
		<link>http://jeffmatherphotography.com/dispatches/2008/01/the-jpeg-family-circus/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2008/01/the-jpeg-family-circus/#comments</comments>
		<pubDate>Mon, 07 Jan 2008 22:27:09 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[File Formats]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[Photography]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/2008/01/the-jpeg-family-circus/</guid>
		<description><![CDATA[The discussion in the comments of my recent article about HD Photo (a.k.a. JPEG-XR) got me thinking about all of the different beasts that go by the name &#8220;JPEG.&#8221; JPEG: What most of us consider to be &#8220;JPEG&#8221; is just &#8230; <a href="http://jeffmatherphotography.com/dispatches/2008/01/the-jpeg-family-circus/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The discussion in the comments of my <a href="http://jeffmatherphotography.com/dispatches/2008/01/microsoft-hd-photo/">recent article about HD Photo</a> (a.k.a. JPEG-XR) got me thinking about all of the different beasts that go by the name &#8220;JPEG.&#8221;</p>
<p><b>JPEG</b>: What most of us consider to be &#8220;JPEG&#8221; is just one of many processes for image encoding and decoding defined within the same specification. The process that makes up 99.99999% of all of the JPEGs ever created is &#8220;JPEG Baseline (Process 1)&#8221; for 8-bit lossy compression. (That&#8217;s just my estimate, which is probably low. It&#8217;s probably better to say &#8220;almost 100% of all JPEGs.&#8221;)</p>
<p>This process divides an image into a bunch of 8&#215;8 blocks, uses the discrete cosine transform (DCT) to move the data into the frequency domain, and compresses the data by (among other things) removing some of the high frequency data that the human visual system usually can&#8217;t detect. You can think about it as abridging a novel by taking out a few sentences per paragraph. Unfortunately, if the quality settings are too low, it&#8217;s really easy to notice that something has gone missing; or if a scene has a lot of information &mdash; such as one with lots of fine detail &mdash; there will be blocky artifacts where the detail should be.</p>
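<p>For the curious, here is a rough Python sketch of the idea for a single 8&#215;8 block.  It is deliberately naive: <tt>dct_2d</tt> is a textbook DCT-II, and <tt>drop_high_frequencies</tt> is a stand-in that just zeroes coefficients past a cutoff.  Both names are made up for illustration, and a real baseline encoder uses quantization tables and entropy coding rather than a hard cutoff.</p>

```python
import math

N = 8  # baseline JPEG works on 8x8 blocks


def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block, with orthonormal scaling."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def drop_high_frequencies(coeffs, keep):
    """Zero every coefficient whose index sum u + v is at least `keep`."""
    return [[0.0 if u + v >= keep else c for v, c in enumerate(row)]
            for u, row in enumerate(coeffs)]


# A perfectly flat block has all of its energy in the single DC term.
flat = [[100.0] * N for _ in range(N)]
coeffs = dct_2d(flat)
```

<p>A flat block loses nothing when the high frequencies go away, which is why smooth skies compress so well and why busy, detailed scenes are where the artifacts show up.</p>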
<p>While the removal of high-frequency detail is inherently lossy, even with a maximum quality setting, the original JPEG standard specified a separate lossless mode not based on the DCT. Images compressed this way can be completely retrieved from the compressed data. This is important when you need to preserve all of the data within an image or when adding artifacts can have devastating consequences. &#8220;Is that a nodule in the patient&#8217;s chest X-ray or a JPEG compression artifact? I guess we&#8217;d better do a biopsy just in case&#8230;.&#8221; In fact, the lossless modes for JPEG are really only used within DICOM files, the format used for digital imaging and communications in medicine.</p>
<p>Old school JPEG also supports 12 and 16 bits of data in each channel of a pixel. For color images, this is the difference between about 17 million colors for an 8-bit image, 68 billion colors for a 12-bit image, and 281 trillion colors when using 16 bits. Once again, only those medical imaging people use the extra bit depths, and they just use the gray colors.</p>
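<p>Those color counts are just powers of two: three channels at <i>b</i> bits each give 2<sup>3<i>b</i></sup> combinations.  A quick Python check (the helper name <tt>color_count</tt> is only for illustration):</p>

```python
def color_count(bits_per_channel, channels=3):
    """Number of distinct values a pixel can take at a given bit depth."""
    return 2 ** (bits_per_channel * channels)


for bits in (8, 12, 16):
    print(bits, format(color_count(bits), ","))
# 8 16,777,216
# 12 68,719,476,736
# 16 281,474,976,710,656
```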
<p><b>JPEG-LS</b> was supposed to be a better lossless format but never really got going. The promises of JPEG 2000 probably had a lot to do with this.</p>
<p><b>JPEG 2000</b> is (1) a wavelet-based compression method, (2) a scheme for encoding wavelet-compressed images into randomly accessible &#8220;codestreams&#8221;, and (3) a file format for encapsulating compressed codestreams. Because it uses a discrete wavelet transform (DWT) the results are generally better than the older JPEG format when comparing images with the same compression ratio.</p>
<p>Images in JPEG 2000 can have an arbitrary bit depth (1 &#8211; 32 bps), and different planes can have different bit depths. (For example the luminance channel of a YCbCr image can have a high bit depth to support HDR imagery.) Certain portions of an image can have higher spatial resolution or be encoded at a different compression level. JPEG 2000 has both lossy and lossless components as part of the baseline. Several colorspaces are supported, including bi-level, grayscale, sRGB, YCbCr, and indexed imagery. Hyperspectral and n-sample images are supported using a somewhat convoluted &#8220;multi-component&#8221; schema. Images can also include alpha channels for transparency. A really amazing thing about JPEG 2000 is that it&#8217;s possible to reorder the parts of the codestream to change how the data is accessed (e.g. access regions faster v. access different resolutions faster) without decompressing and recompressing the data, which can be expensive.</p>
<p>The JPEG 2000 file format uses about 20 hierarchical &#8220;boxes&#8221; to nest metadata about the compressed codestreams. While the file format is technically unnecessary to read and process a JPEG 2000 image, the extra formatting facilitates random data access, long-term cataloguing and IP management, and efficient transmission. JPEG 2000 files can also contain a limited subset of ICC color profiles. EXIF metadata support is not part of the JPEG 2000 standard, although it can appear as a private metadata field.</p>
<p>JPEG 2000 was touted as the format to replace the 1991 JPEG standard, but this didn&#8217;t happen for several reasons. Perhaps most important, the algorithms at the heart of JPEG 2000 require a lot of processing power, making it slower for desktop computers than rendering old-school JPEG and prohibitive for many embedded devices. As of 2007, few Web browsers have built-in support for it, and consumer-level digital cameras don&#8217;t produce imagery in the format. In 2007, Adobe Photoshop CS3 stopped including the JPEG 2000 export module in a typical installation.</p>
<p>But because of the smaller file size, flexibility, and more pleasing artifact appearance, the medical and remote sensing communities have adopted it. Both NITF and DICOM have incorporated JPEG 2000 data into their files. NITF is the friendly format used for &#8220;national imagery.&#8221; I will let you Google that so the NSA can start tracking you.</p>
<p><b>JPEG-XR</b> is the name that Microsoft&#8217;s HD Photo format might have if it&#8217;s standardized, which I sincerely hope it will be. JPEG-XR uses a <strike>principal components</strike> <i>photo core</i> transform (PCT) which I know absolutely nothing about but which promises equivalent performance to JPEG 2000 with lower computational complexity &mdash; which means you can put it on a consumer device more easily &mdash; and much better size-versus-quality performance compared to the original JPEG format. It also supports more bit depths, <a href="http://jeffmatherphotography.com/dispatches/2007/09/deconstructing-an-image/">high dynamic range imagery</a>, lossy and lossless encoding/decoding using the same algorithm, and wide gamut color; uses a linear light gamma making it possibly suitable to replace RAW formats or enable post-CRT workflows; and can store bucketloads of metadata including EXIF and XMP.</p>
<p><b>JPEG-Plus</b>. And then there&#8217;s JPEG+, which you might reasonably call JPEG &#8211; 20% because it&#8217;s essentially the same as the original DCT-based JPEG with a modest file-size performance improvement and some claims about better visual appearance. I&#8217;m not holding my breath for it; but given the 29+ processes that made up the original JPEG standard, what&#8217;s an extra one that no one will implement?</p>
<p><b>Update</b>: For posterity, the PCT stands for &#8220;Photo Core Transform&#8221; not &#8220;Principal Component Transform&#8221;.  Thomas Richter said this about it on <a href="http://groups.google.com/group/sci.image.processing/browse_thread/thread/b85b983920339d21#" title="sci.image.processing - Microsoft HD">sci.image.processing</a>:</p>
<blockquote><p>The transform is an overlapped 4&#215;4 block transform that is related to a traditional DCT scheme, or at least approximates it closely. The encoding is a simple adaptive huffman with a move-to-front list defining the scanning order, and an inter-block prediction for the DC and the lowest-frequency AC path of the transformation.</p>
<p>Some parts are really close to H264 I-frame compression, i.e. the idea to use a pyramidal transformation scheme and transform low-passes again (here with the same, in H264 with a simpler transformation).</p>
<p>The good part is that lossy and lossless use the same transformation.  The bad part is that the quantizer is the same for all frequencies, meaning there is no CSF adaption, and the entropy coder back-end is not state of the art.</p>
</blockquote>
<p><b>Update 3-February-2009:</b> JPEG-XR has advanced to &#8220;draft standard balloting,&#8221; which means it&#8217;s very likely it will become a standard (unless everyone hates it, of course). <a href="http://blogs.msdn.com/billcrow/archive/2009/01/28/jpeg-xr-for-digital-cameras-nears-completion.aspx" title="Bill Crow's Digital Imaging &#038; Photography Blog: JPEG XR for Digital Cameras Nears Completion">More info&#8230;</a></p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2008/01/the-jpeg-family-circus/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Microsoft HD Photo</title>
		<link>http://jeffmatherphotography.com/dispatches/2008/01/microsoft-hd-photo/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2008/01/microsoft-hd-photo/#comments</comments>
		<pubDate>Wed, 02 Jan 2008 21:47:34 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[File Formats]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[Photography]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/2008/01/microsoft-hd-photo/</guid>
		<description><![CDATA[For my 9-5, I wrangle file formats, making it possible for people to read images into MATLAB and then do something useful with them. I&#8217;m always on the look out for what&#8217;s new, and there&#8217;s always something new. (I only &#8230; <a href="http://jeffmatherphotography.com/dispatches/2008/01/microsoft-hd-photo/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>For my 9-5, I wrangle file formats, making it possible for people to read images into MATLAB and then do something useful with them. I&#8217;m always on the lookout for what&#8217;s new, and there&#8217;s always something new. (I only joke about employing a legion of developers in Bangalore to develop new formats so that I can stay employed.) Lately I&#8217;ve been looking at Microsoft&#8217;s <a href="http://blogs.msdn.com/billcrow/default.aspx">HD Photo</a> format (a.k.a. Windows Media Photo, or WMP, format).</p>
<p>Once upon a time you could count on Microsoft&#8217;s image file formats to suck. Just take BMP as an example: a more-or-less unpublished format that changed between revisions of Windows and which you were supposed to access primarily through Windows API calls. File I/O people (like me) needed to have the bits of information that we could crib together handy so that we could figure out what datatype a given piece of data should have because the file didn&#8217;t carry that information in it.  Nor was there a data dictionary to programmatically look it up. They couldn&#8217;t have designed it worse if they had tried; but they clearly didn&#8217;t do any (good) design on the format. My professional opinion: totally sucked.</p>
<p>But perhaps they&#8217;ve learned the error of their ways with their new HD Photo format, which they have submitted for standardization. It appears that they&#8217;ve talked to a lot of digital image users about their needs and given a lot of thought to the format. I&#8217;m about halfway through <a href="http://download.microsoft.com/download/a/f/d/afdfd50d-6eb9-425e-84e1-b4085a80e34e/CLN-T374_WH07.pptx">the WinHEC 2007: HD Photo Implementation Guidelines presentation</a>, which certainly says all the right things.</p>
<p>My only worry is that they&#8217;re saying <b>so many</b> of the right things that they&#8217;re possibly heading toward a couple of bad outcomes. Let&#8217;s call these the unintended consequences of designing a &#8220;good&#8221; format:</p>
<ol>
<li>The &#8220;Kitchen Sink Problem&#8221; (a.k.a. DICOM) &mdash; Everybody gets what they want, including things of questionable utility like CMYK + alpha and 5-6-5 or 10-10-10 RGB encoding. The problem is that format readers and writers have to decide what to support or use somebody else&#8217;s (possibly license-encumbered) code to read a simple file. The imperfect solution to this &mdash; which is what DICOM, NITF, and TIFF have done &mdash; is to create a mechanism for specifying compliance level.</li>
<li>The &#8220;Infinite Configurability Problem&#8221; (a.k.a. JPEG-2000) &mdash; As a file I/O developer, you either have to take everything that comes your way or you have to pick what you think is important and hope that you pick the right set of features. HD Photo supports settings for chroma subsampling, overlap processing, frequency or spatial decoding priority, alpha interleaved by pixel or as a separate component, spatial transformations, gamut-management as part of the format, etc. These all have their place, but it might be a bit much. And JPEG-2000 has shown us that when you make a format too smart, it takes a long time to get adopted.</li>
</ol>
<p>This brings me &#8217;round to a thought I&#8217;ve had a lot recently. Image file formats have become platforms for working with data rather than containers for communicating images. The important thing is to get the data down on disk as quickly as possible and then change the interpretation of the pixels later. It looks like HD Photo might be self-contained, but some formats (like DNG or various HDR formats, including HD Photo) don&#8217;t include all of the information about how the image should look. If you add Adobe&#8217;s special sauce while reading the image in the DNG file, you&#8217;ll get a beautifully rendered image. If you use another vendor&#8217;s tools or provide other raw conversion settings, you will likely get an image that looks rather different. Similarly, there is no one correct rendering of an HDR image, and I don&#8217;t know of any formats that specify a preferred way of tone mapping.</p>
<p>It&#8217;s a brave new world of image file formats. We file format developers live in interesting times.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2008/01/microsoft-hd-photo/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

