<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jeff Mather&#039;s Dispatches &#187; Fodder for Techno-weenies</title>
	<atom:link href="http://jeffmatherphotography.com/dispatches/category/fodder-for-techno-weenies/feed/" rel="self" type="application/rss+xml" />
	<link>http://jeffmatherphotography.com/dispatches</link>
	<description>The Post-9-to-5 Life of an International Playboy</description>
	<lastBuildDate>Wed, 23 May 2012 13:23:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.1</generator>
		<item>
		<title>Convolution</title>
		<link>http://jeffmatherphotography.com/dispatches/2012/02/office-conversation/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2012/02/office-conversation/#comments</comments>
		<pubDate>Mon, 27 Feb 2012 21:07:56 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[Life Lessons]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=4464</guid>
		<description><![CDATA[Coworker: So, what are you doing? [Eying my copy of Steve's Digital Image Processing Using MATLAB book.] Me: I thought it was about time for me to learn how filtering works. Coworker: Everybody has to walk through that convolution and &#8230; <a href="http://jeffmatherphotography.com/dispatches/2012/02/office-conversation/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><i><b>Coworker:</b></i> So, what are you doing? [Eying my copy of <a href="http://blogs.mathworks.com/steve/">Steve</a>'s <i><a href="http://www.imageprocessingplace.com/DIPUM-2E/dipum2e_main_page.htm">Digital Image Processing Using MATLAB</a></i> book.]</p>
<p><i><b>Me:</b></i> I thought it was about time for me to learn how filtering works.</p>
<p><i><b>Coworker:</b></i> Everybody has to walk through that convolution and correlation forest at some point and come out the other side as a man.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2012/02/office-conversation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Writing a File Reader in MATLAB</title>
		<link>http://jeffmatherphotography.com/dispatches/2012/02/writing-a-file-reader-in-matlab/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2012/02/writing-a-file-reader-in-matlab/#comments</comments>
		<pubDate>Wed, 22 Feb 2012 14:53:24 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[MATLAB]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=4437</guid>
		<description><![CDATA[A colleague recently asked me to help him read a file in MATLAB, which supports reading a whole bunch of image and scientific data formats right out-of-the-box but not NRRD. This format stores 3D volumes of radiology data and (like &#8230; <a href="http://jeffmatherphotography.com/dispatches/2012/02/writing-a-file-reader-in-matlab/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A colleague recently asked me to help him read a file in MATLAB, which supports reading a whole bunch of image and scientific data formats right out-of-the-box but not NRRD. This format stores 3D volumes of radiology data and (like FITS) consists of a text header of key-value pairs followed by a binary payload. Having written file parsers full-time for the better part of ten years, I didn&#8217;t need long to create <a href="http://www.mathworks.com/matlabcentral/fileexchange/34653-nrrd-format-file-reader/content/nrrdread.m">a .nrrd file reader for MATLAB</a>.</p>
<p>I&#8217;m kind of proud of this little feature for its simplicity, and it shows a lot of the power of MATLAB. In fewer than 200 lines of well-structured code, I was able to implement a robust file reader.  Here are a few features it uses that anyone creating their own file reader in MATLAB might also try to take advantage of:</p>
<p><br clear="all" /><b><tt>assert</tt></b> &mdash; Stop writing <tt>if</tt> blocks that only exist to check whether everything is okay and error if it isn&#8217;t.</p>
<pre class="brush: matlabkey; title: ; notranslate">fid = fopen(filename, 'rb');
assert(fid &gt; 0, 'Could not open file.');</pre>
<p>And&nbsp;.&nbsp;.&nbsp;.</p>
<pre class="brush: matlabkey; title: ; notranslate">assert(isfield(meta, 'sizes') &amp;&amp; ...
       isfield(meta, 'dimension') &amp;&amp; ...
       isfield(meta, 'encoding') &amp;&amp; ...
       isfield(meta, 'endian'), ...
       'Missing required metadata fields.')</pre>
<p><br clear="all" /><b><tt>onCleanup</tt></b> &mdash; Why worry about trying to remember to clean up resources? Let the <tt>onCleanup</tt> class take care of it for you. Construct one of these objects by giving it an anonymous function that closes your file handle when the object goes out of scope&mdash;whether from an error or at the end of the function.</p>
<pre class="brush: matlabkey; title: ; notranslate">cleaner = onCleanup(@() fclose(fid));</pre>
<p><br clear="all" /><b><tt>regexp</tt></b> &mdash; Use MATLAB&#8217;s regular expression engine to handle complicated text parsing for you.</p>
<pre class="brush: matlabkey; title: ; notranslate">theLine = fgetl(fid);

% &quot;fieldname:= value&quot; or &quot;fieldname: value&quot; or &quot;fieldname:value&quot;
parsedLine = regexp(theLine, ':=?\s*', 'split', 'once');</pre>
<p><br clear="all" /><b>Dynamic structure field indexing</b> &mdash; If you have a string that&#8217;s a legal MATLAB identifier, there&#8217;s no need to write complicated logic just to use it as a field name in a structure. Simply use the <tt>.(string)</tt> construct.</p>
<pre class="brush: matlabkey; title: ; notranslate">field = lower(parsedLine{1});
value = parsedLine{2};

field(isspace(field)) = '';  % Remove embedded spaces.
meta(1).(field) = value;</pre>
<p><br clear="all" /><b>Using temporary files to decompress data</b> &mdash; The NRRD format supports storing the image data as raw bytes, human readable text, or GZIP-compressed byte streams. When a file contains compressed or encapsulated data and MATLAB has a file reader capable of handling that, it&#8217;s easiest just to write the data to a temporary file and use the supported reader. Consider the <tt>readData()</tt> subfunction that recursively handles three different kinds of encoding:</p>
<pre class="brush: matlabkey; title: ; notranslate">function data = readData(fidIn, meta, datatype)

switch (meta.encoding)
 case {'raw'}

  data = fread(fidIn, inf, [datatype '=&gt;' datatype]);

 case {'gzip', 'gz'}

  tmpBase = tempname();
  tmpFile = [tmpBase '.gz'];
  fidTmp = fopen(tmpFile, 'wb');
  assert(fidTmp &gt; 0, 'Could not open temporary file for GZIP decompression')

  tmp = fread(fidIn, inf, 'uint8=&gt;uint8');
  fwrite(fidTmp, tmp, 'uint8');
  fclose(fidTmp);

  gunzip(tmpFile)

  fidTmp = fopen(tmpBase, 'rb');
  cleaner = onCleanup(@() fclose(fidTmp));

  meta.encoding = 'raw';
  data = readData(fidTmp, meta, datatype);

 case {'txt', 'text', 'ascii'}

  data = fscanf(fidIn, '%f');
  data = cast(data, datatype);

 otherwise
  assert(false, 'Unsupported encoding')
end</pre>
<p><br clear="all" /><b><tt>swapbytes</tt></b> &mdash; Like many formats, NRRD supports little-endian and big-endian byte ordering. The <tt>swapbytes</tt> function makes it dead simple to change endianness, and the <tt>computer</tt> function helps you determine whether swapping is necessary. Here&#8217;s the pattern, which uses the &#8220;endian&#8221; metadata value read from the .nrrd file:</p>
<pre class="brush: matlabkey; title: ; notranslate">function data = adjustEndian(data, meta)

[~,~,endian] = computer();

needToSwap = (isequal(endian, 'B') &amp;&amp; ...
               isequal(lower(meta.endian), 'little')) || ...
             (isequal(endian, 'L') &amp;&amp; ...
               isequal(lower(meta.endian), 'big'));

if (needToSwap)
    data = swapbytes(data);
end</pre>
<p><br clear="all" />Happy coding!</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2012/02/writing-a-file-reader-in-matlab/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>An Experiment with SSE</title>
		<link>http://jeffmatherphotography.com/dispatches/2012/02/an-experiment-with-sse/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2012/02/an-experiment-with-sse/#comments</comments>
		<pubDate>Sat, 04 Feb 2012 22:15:29 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=4368</guid>
		<description><![CDATA[Updated: 6 February 2012. Techie people, the good stuff&#8212;code, results, more info&#8212;is below the fold. Friends, I&#8217;ve been in my head a bit recently. That&#8217;s not necessarily bad&#8212;it&#8217;s a nice neighborhood, really&#8212;but one of the dark alleys I&#8217;ve had to &#8230; <a href="http://jeffmatherphotography.com/dispatches/2012/02/an-experiment-with-sse/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><i>Updated: 6 February 2012.</i></p>
<p><b>Techie people, the good stuff&mdash;code, results, more info&mdash;is below the fold.</b></p>
<p><br clear="all" />Friends, I&#8217;ve been in my head a bit recently. That&#8217;s not necessarily bad&mdash;it&#8217;s a nice neighborhood, really&mdash;but one of the dark alleys I&#8217;ve had to walk past a lot lately involves way too many <a href="http://www.nytimes.com/2008/02/05/health/05mind.html">imposter-type feelings</a>. As I previously mentioned, we&#8217;re hiring, and we&#8217;re looking for someone to do many of the same job tasks that I do. Being mostly self-taught at software engineering and computer science, I have to remind myself that I have more than a dozen years of experience, whereas most of the people whose résumés cross my desk do not.</p>
<p>It&#8217;s sometimes hard to silence those voices that say &#8220;everyone else is more accomplished than you.&#8221; Even though it&#8217;s not true, I&#8217;ve been meaning to pick up some more skills so that I can (a) try to feel like less of an imposter and (b) write more awesome code to make our product more awesome and help us fend off our competitors and make more loot.</p>
<p>So yesterday afternoon, rather than coming up with (yet another) daunting list of all of the places where I feel like I should learn more, I just picked one that I already knew about: Streaming SIMD Extensions, <i>a.k.a.</i> SSE. For me, the best way to learn is to write a program that does something (theoretically) useful, run into real-life obstacles, and work around the pitfalls I encounter. Practice, practice, practice.</p>
<p><br clear="all" /><b>My dear readers who don&#8217;t program, you can now safely navigate away for another day without guilt. The rest of the post is rather technical and describes how I took an example I found online, made it work with <tt>gcc</tt>, and got a 4x speedup by using SSE. A speedup and happier outlook. Not bad for a few hours on a Saturday!</b></p>
<p><span id="more-4368"></span></p>
<p><br clear="all" />The first real examples I found online were <a href="http://supercomputingblog.com/optimization/getting-started-with-sse-programming/">this post</a> from The Supercomputing Blog and <a href="http://www.liranuna.com/sse-intrinsics-optimizations-in-popular-compilers/">this one</a> from LiraNuna. Together they got me started.</p>
<p>The program computes <i><tt>sqrt(x)/x</tt></i> many times and compares the results and performance. On my 2006-era CoreDuo MacBook Pro, I got these results when evaluating the expression for x from 1 to 64,000 and repeating the whole computation 10,000 times:</p>
<pre>Without SSE: 26.404 seconds
With SSE, using division: 6.705 seconds (3.9x faster)
With SSE, using reciprocals: 4.204 seconds (6.3x faster)</pre>
<p>The example code below uses the following functions and datatypes that were new to me:</p>
<pre>_aligned_malloc
posix_memalign

__m128

_mm_add_ps
_mm_div_ps
_mm_mul_ps
_mm_rcp_ps
_mm_set_ps
_mm_set1_ps
_mm_sqrt_ps</pre>
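<p>One detail of <tt>_mm_set_ps</tt> is easy to trip over: its arguments run from the highest lane down to the lowest, so <tt>_mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f)</tt> yields &lt;1,2,3,4&gt; in memory order. Here is a minimal sketch of that behavior. The helper names <tt>makeRamp</tt> and <tt>unpackLanes</tt> are made up for illustration, and <tt>_mm_storeu_ps</tt> (an unaligned store) is used only so that the output array needs no special alignment.</p>
<pre style="font-size:85%">#include &lt;xmmintrin.h&gt;  // SSE intrinsics: __m128, _mm_set_ps, _mm_storeu_ps

// Hypothetical helper: build the vector &lt;first, first+1, first+2, first+3&gt;,
// listed here in memory order (lowest lane first).
__m128 makeRamp(float first)
{
  // _mm_set_ps takes the highest lane first and the lowest lane last.
  return _mm_set_ps(first + 3.0f, first + 2.0f, first + 1.0f, first);
}

// Hypothetical helper: copy the four lanes into out[0..3], lowest lane first.
void unpackLanes(__m128 v, float out[4])
{
  _mm_storeu_ps(out, v);  // Unaligned store; out may live anywhere.
}</pre>
<p>Calling <tt>unpackLanes(makeRamp(1.0f), out)</tt> should leave 1, 2, 3, 4 in <tt>out[0]</tt> through <tt>out[3]</tt>, while <tt>_mm_set1_ps(4.0f)</tt> fills all four lanes with the same value.</p>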
<p>I compiled, ran, and timed the code using these commands:</p>
<pre style="font-size:85%">g++ -O2 -msse -DUSE_SSE -DUSE_DIV sqrtxdivx.cpp -o sqrtdivx_ssediv
g++ -O2 -msse -DUSE_SSE sqrtxdivx.cpp -o sqrtdivx_sserecip
g++ -O2 -msse sqrtxdivx.cpp -o sqrtdivx_ssenone
time ./sqrtdivx_ssediv ; time ./sqrtdivx_sserecip ; time ./sqrtdivx_ssenone</pre>
<p>For more info on SSE:</p>
<ul>
<li><a href="http://webster.cs.ucr.edu/AoA/Windows/HTML/TheMMXInstructionSet.html">Art of Assembly Language book</a> &mdash; See chapter 11.</li>
<li><a href="http://download.intel.com/support/processors/pentiumii/sb/24319002.pdf">Intel Architecture Software Developer&#8217;s Manual (PDF)</a> &mdash; See chapter 9.</li>
<li><a href="http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions">Wikipedia&#8217;s SSE page</a></li>
</ul>
<p>Without more delay, the code!</p>
<pre style="font-size:70%">
<div class="codesnip-container" >
<div class="c codesnip" style="font-family:monospace;"><span class="co2">#include &lt;stdio.h&gt; &nbsp; &nbsp; &nbsp;// printf()</span>
<span class="co2">#include &lt;xmmintrin.h&gt; &nbsp;// SSE compiler intrinsics</span>
<span class="co2">#include &lt;stdlib.h&gt; &nbsp; &nbsp; // posix_memalign()</span>
<span class="co2">#include &lt;math.h&gt; &nbsp; &nbsp; &nbsp; // Non-SSE sqrt()</span>

<span class="co1">//#define USE_SSE &nbsp;// Define this if you want to run with SSE.</span>
<span class="co1">//#define USE_DIV &nbsp; // Define this if you want to use SSE division (slower, accurate).</span>


<span class="co1">// We will be calculating Y = sqrt(x) / x, for x = 1 to 64000</span>

<span class="kw4">int</span> main<span class="br0">&#40;</span><span class="kw4">int</span> argc<span class="sy0">,</span> <span class="kw4">char</span><span class="sy0">*</span> argv<span class="br0">&#91;</span><span class="br0">&#93;</span><span class="br0">&#41;</span>
<span class="br0">&#123;</span>
&nbsp; <span class="co1">// Compute sqrt(x)/x for the first 64,000 positive integers.</span>
&nbsp; <span class="kw4">const</span> <span class="kw4">int</span> length <span class="sy0">=</span> <span class="nu0">64000</span><span class="sy0">;</span>
&nbsp; 
&nbsp; <span class="co1">// float *pResult = (float*) _aligned_malloc(length * sizeof(float), 16); &nbsp;/* MSVC */</span>
&nbsp; <span class="kw4">float</span> <span class="sy0">*</span>pResult<span class="sy0">;</span>
&nbsp; posix_memalign<span class="br0">&#40;</span>reinterpret_cast<span class="sy0">&lt;</span>void <span class="sy0">**&gt;</span><span class="br0">&#40;</span><span class="sy0">&amp;</span>pResult<span class="br0">&#41;</span><span class="sy0">,</span> 16<span class="sy0">,</span> length <span class="sy0">*</span> <span class="kw4">sizeof</span><span class="br0">&#40;</span><span class="kw4">float</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span> &nbsp;<span class="coMULTI">/* gcc */</span>

&nbsp; <a href="http://www.opengroup.org/onlinepubs/009695399/functions/printf.html"><span class="kw3">printf</span></a><span class="br0">&#40;</span><span class="st0">&quot;Starting calculation...<span class="es1">\n</span>&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span>

&nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span><span class="kw4">int</span> stress <span class="sy0">=</span> <span class="nu0">0</span><span class="sy0">;</span> stress <span class="sy0">&lt;</span> <span class="nu0">10000</span><span class="sy0">;</span> <span class="sy0">++</span>stress<span class="br0">&#41;</span>
&nbsp; <span class="br0">&#123;</span>
&nbsp; &nbsp; <span class="co2">#ifdef USE_SSE</span>
&nbsp; &nbsp; &nbsp; __m128 <span class="sy0">*</span>pResultSSE <span class="sy0">=</span> reinterpret_cast<span class="sy0">&lt;</span>__m128<span class="sy0">*&gt;</span><span class="br0">&#40;</span>pResult<span class="br0">&#41;</span><span class="sy0">;</span> &nbsp;<span class="co1">// Intermediate pointer</span>

&nbsp; &nbsp; &nbsp; __m128 x <span class="sy0">=</span> _mm_set_ps<span class="br0">&#40;</span>4.0f<span class="sy0">,</span> 3.0f<span class="sy0">,</span> 2.0f<span class="sy0">,</span> 1.0f<span class="br0">&#41;</span><span class="sy0">;</span> &nbsp;<span class="co1">// Initialize x to &lt;1,2,3,4&gt;.</span>
&nbsp; &nbsp; &nbsp; __m128 xDelta <span class="sy0">=</span> _mm_set1_ps<span class="br0">&#40;</span>4.0f<span class="br0">&#41;</span><span class="sy0">;</span> &nbsp;<span class="co1">// Set the xDelta to &lt;4,4,4,4&gt;.</span>

&nbsp; &nbsp; &nbsp; <span class="kw4">const</span> <span class="kw4">int</span> sseLength <span class="sy0">=</span> length <span class="sy0">/</span> <span class="nu0">4</span><span class="sy0">;</span>
&nbsp; &nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span><span class="kw4">int</span> i <span class="sy0">=</span> <span class="nu0">0</span><span class="sy0">;</span> i <span class="sy0">&lt;</span> sseLength<span class="sy0">;</span> <span class="sy0">++</span>i<span class="br0">&#41;</span>
&nbsp; &nbsp; &nbsp; <span class="br0">&#123;</span>
&nbsp; &nbsp; &nbsp; &nbsp; __m128 xSqrt <span class="sy0">=</span> _mm_sqrt_ps<span class="br0">&#40;</span>x<span class="br0">&#41;</span><span class="sy0">;</span>

&nbsp; &nbsp; &nbsp; &nbsp; <span class="co2">#ifdef USE_DIV</span>
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">// Use slower, more accurate SSE division.</span>
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; pResultSSE<span class="br0">&#91;</span>i<span class="br0">&#93;</span> <span class="sy0">=</span> _mm_div_ps<span class="br0">&#40;</span>xSqrt<span class="sy0">,</span> x<span class="br0">&#41;</span><span class="sy0">;</span>
&nbsp; &nbsp; &nbsp; &nbsp; <span class="co2">#endif</span>
&nbsp; &nbsp; &nbsp; &nbsp; <span class="co2">#ifndef USE_DIV</span>
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">// Use faster, less accurate reciprocal and multiply.</span>
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; __m128 xRecip <span class="sy0">=</span> _mm_rcp_ps<span class="br0">&#40;</span>x<span class="br0">&#41;</span><span class="sy0">;</span>
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; pResultSSE<span class="br0">&#91;</span>i<span class="br0">&#93;</span> <span class="sy0">=</span> _mm_mul_ps<span class="br0">&#40;</span>xRecip<span class="sy0">,</span> xSqrt<span class="br0">&#41;</span><span class="sy0">;</span>
&nbsp; &nbsp; &nbsp; &nbsp; <span class="co2">#endif</span>

&nbsp; &nbsp; &nbsp; &nbsp; x <span class="sy0">=</span> _mm_add_ps<span class="br0">&#40;</span>x<span class="sy0">,</span> xDelta<span class="br0">&#41;</span><span class="sy0">;</span> &nbsp;<span class="co1">// Advance x to the next set of numbers.</span>
&nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span>
&nbsp; &nbsp; <span class="co2">#else &nbsp;// USE_SSE</span>
&nbsp; &nbsp; &nbsp; <span class="kw4">float</span> xFloat <span class="sy0">=</span> <span class="nu17">1.0f</span><span class="sy0">;</span>
&nbsp; &nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span><span class="kw4">int</span> i <span class="sy0">=</span> 0 <span class="sy0">;</span> i <span class="sy0">&lt;</span> length<span class="sy0">;</span> <span class="sy0">++</span>i<span class="br0">&#41;</span>
&nbsp; &nbsp; &nbsp; <span class="br0">&#123;</span>
&nbsp; &nbsp; &nbsp; &nbsp; pResult<span class="br0">&#91;</span>i<span class="br0">&#93;</span> <span class="sy0">=</span> sqrt<span class="br0">&#40;</span>xFloat<span class="br0">&#41;</span> <span class="sy0">/</span> xFloat<span class="sy0">;</span>
&nbsp; &nbsp; &nbsp; &nbsp; xFloat <span class="sy0">+=</span> <span class="nu17">1.0f</span><span class="sy0">;</span>
&nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span> &nbsp; &nbsp; &nbsp; &nbsp;
&nbsp; &nbsp; <span class="co2">#endif // USE_SSE</span>
&nbsp; <span class="br0">&#125;</span>

&nbsp; <span class="co1">// To prove that the program actually worked, look at the first 20.</span>
&nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span><span class="kw4">int</span> i <span class="sy0">=</span> <span class="nu0">0</span><span class="sy0">;</span> i <span class="sy0">&lt;</span> <span class="nu0">20</span><span class="sy0">;</span> <span class="sy0">++</span>i<span class="br0">&#41;</span>
&nbsp; <span class="br0">&#123;</span>
&nbsp; &nbsp; <a href="http://www.opengroup.org/onlinepubs/009695399/functions/printf.html"><span class="kw3">printf</span></a><span class="br0">&#40;</span><span class="st0">&quot;Result[%d] = %f<span class="es1">\n</span>&quot;</span><span class="sy0">,</span> i<span class="sy0">,</span> pResult<span class="br0">&#91;</span>i<span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span>
&nbsp; <span class="br0">&#125;</span>
<span class="br0">&#125;</span></div>
</div>
</pre>
<p>The raw output. Notice how the first (SSE with division) and the third (no SSE instructions) gave the same answer, while the second set of results (using SSE and a reciprocal calculation) gave decidedly less accurate computations.</p>
<pre style="font-size:85%">Starting calculation...
Result[0] = 1.000000
Result[1] = 0.707107
Result[2] = 0.577350
Result[3] = 0.500000
Result[4] = 0.447214
Result[5] = 0.408248
Result[6] = 0.377964
Result[7] = 0.353553
Result[8] = 0.333333
Result[9] = 0.316228
Result[10] = 0.301511
Result[11] = 0.288675
Result[12] = 0.277350
Result[13] = 0.267261
Result[14] = 0.258199
Result[15] = 0.250000
Result[16] = 0.242536
Result[17] = 0.235702
Result[18] = 0.229416
Result[19] = 0.223607

real	0m6.705s
user	0m6.678s
sys	0m0.008s

Starting calculation...
Result[0] = 0.999756
Result[1] = 0.706934
Result[2] = 0.577209
Result[3] = 0.499878
Result[4] = 0.447104
Result[5] = 0.408149
Result[6] = 0.377872
Result[7] = 0.353467
Result[8] = 0.333252
Result[9] = 0.316151
Result[10] = 0.301470
Result[11] = 0.288605
Result[12] = 0.277282
Result[13] = 0.267196
Result[14] = 0.258136
Result[15] = 0.249939
Result[16] = 0.242469
Result[17] = 0.235645
Result[18] = 0.229365
Result[19] = 0.223552

real	0m4.204s
user	0m4.196s
sys	0m0.007s

Starting calculation...
Result[0] = 1.000000
Result[1] = 0.707107
Result[2] = 0.577350
Result[3] = 0.500000
Result[4] = 0.447214
Result[5] = 0.408248
Result[6] = 0.377964
Result[7] = 0.353553
Result[8] = 0.333333
Result[9] = 0.316228
Result[10] = 0.301511
Result[11] = 0.288675
Result[12] = 0.277350
Result[13] = 0.267261
Result[14] = 0.258199
Result[15] = 0.250000
Result[16] = 0.242536
Result[17] = 0.235702
Result[18] = 0.229416
Result[19] = 0.223607

real	0m26.404s
user	0m26.360s
sys	0m0.033s</pre>
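<p>The small errors in the second run are what the hardware promises: Intel documents a maximum relative error of 1.5 &times; 2<sup>-12</sup> for the reciprocal instruction behind <tt>_mm_rcp_ps</tt>, and <tt>Result[0] = 0.999756</tt> is off from 1 by about 2<sup>-12</sup>. A common way to win back most of the accuracy is one Newton-Raphson step, r1 = r0 * (2 - x * r0). The sketch below is not part of the program above. <tt>refinedReciprocal</tt> and <tt>maxRelativeError</tt> are hypothetical helper names, and the refinement is untimed here, so whether it still beats <tt>_mm_div_ps</tt> is an open question.</p>
<pre style="font-size:85%">#include &lt;math.h&gt;       // fabsf()
#include &lt;xmmintrin.h&gt;  // SSE intrinsics

// Approximate 1/x in all four lanes, then apply one Newton-Raphson step:
// r1 = r0 * (2 - x * r0).
__m128 refinedReciprocal(__m128 x)
{
  const __m128 r0 = _mm_rcp_ps(x);  // Roughly 12 bits of precision.
  const __m128 two = _mm_set1_ps(2.0f);
  return _mm_mul_ps(r0, _mm_sub_ps(two, _mm_mul_ps(x, r0)));
}

// Largest relative error of approx against 1/x over the four lanes.
float maxRelativeError(__m128 x, __m128 approx)
{
  float xs[4];
  float ys[4];
  _mm_storeu_ps(xs, x);
  _mm_storeu_ps(ys, approx);

  float worst = 0.0f;
  for (int i = 0; i &lt; 4; ++i)
  {
    const float exact = 1.0f / xs[i];
    const float err = fabsf(ys[i] - exact) / exact;
    if (err &gt; worst)
    {
      worst = err;
    }
  }
  return worst;
}</pre>
<p>Comparing <tt>maxRelativeError(x, _mm_rcp_ps(x))</tt> with <tt>maxRelativeError(x, refinedReciprocal(x))</tt> for a vector such as &lt;1,2,3,4&gt; shows how much one refinement step recovers.</p>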
<p><b><i>Update &mdash; 6 February 2012:</i></b> On Sunday I wanted to see what would happen when I changed the code to work on 64-bit, double-precision floating-point numbers. While I saw the expected 4x speedup when operating on four 32-bit floats at once, I did not see the same 2x improvement with doubles. The SSE and no-SSE versions each took approximately 26 seconds. Why?</p>
<p>It&#8217;s either the processor or an OS issue. When I applied the same changes to the code at work, I saw not only a 2x speedup for doubles but also an almost 8x speedup for single precision. That was surprising.  When I looked at the differences between the Linux assembly of the &#8220;float&#8221; and &#8220;double&#8221; programs, the only things different were the obvious: minor array size and offset differences and the presence of the &#8220;_pd&#8221; versions of the routines instead of the &#8220;_ps&#8221; ones. This was similar to the differences I remember seeing when I compared assemblies built on my MacBook Pro.</p>
<p>The major differences between the two environments:</p>
<ul>
<li>32-bit MacOS 10.6 <i>versus</i> 64-bit Debian 6 Linux (running under VMware)</li>
<li>gcc 4.2.something <i>versus</i> gcc 4.4.5</li>
<li>An older Intel CoreDuo processor (ca. 2006) <i>versus</i> an 8-core 3.07 GHz Intel Xeon (late 2009)</li>
</ul>
<p>Clearly streaming SIMD instructions make a difference for some types, but the effect is limited (or helped) by the processor that executes those instructions. (Of course, I still have to check that GCC or the OS isn&#8217;t part of the difference.)</p>
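<p>For reference, the double-precision change described in the update might look roughly like the sketch below. It assumes SSE2 (<tt>emmintrin.h</tt>, compiled with <tt>-msse2</tt> where that is not the default), processes two doubles per iteration instead of four floats, and uses division because SSE2 has no double-precision counterpart to <tt>_mm_rcp_ps</tt>. The function name <tt>sqrtOverXDouble</tt> is made up for illustration, and this is not the code that produced the timings above.</p>
<pre style="font-size:85%">#include &lt;assert.h&gt;     // assert()
#include &lt;emmintrin.h&gt;  // SSE2 intrinsics: __m128d and the _pd routines

// Fill result[i] with sqrt(x)/x for x = 1, 2, ..., length.
// length must be even because each iteration handles two doubles.
void sqrtOverXDouble(double *result, int length)
{
  assert(length % 2 == 0);

  __m128d x = _mm_set_pd(2.0, 1.0);         // Initialize x to &lt;1,2&gt;.
  const __m128d xDelta = _mm_set1_pd(2.0);  // Step of &lt;2,2&gt;.

  for (int i = 0; i &lt; length; i += 2)
  {
    const __m128d xSqrt = _mm_sqrt_pd(x);

    // Unaligned store keeps the sketch simple; with a 16-byte-aligned
    // buffer, _mm_store_pd could be used instead.
    _mm_storeu_pd(result + i, _mm_div_pd(xSqrt, x));

    x = _mm_add_pd(x, xDelta);  // Advance x to the next pair.
  }
}</pre>
<p>With <tt>length = 4</tt>, the buffer should end up holding 1, sqrt(2)/2, sqrt(3)/3, and 0.5.</p>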
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2012/02/an-experiment-with-sse/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Welcome to Herb Sutter&#8217;s Jungle</title>
		<link>http://jeffmatherphotography.com/dispatches/2012/01/welcome-to-herb-sutters-jungle/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2012/01/welcome-to-herb-sutters-jungle/#comments</comments>
		<pubDate>Mon, 30 Jan 2012 15:41:03 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=4341</guid>
		<description><![CDATA[In an effort to keep posting something here until I&#8217;m in the right place mentally to write about things that probably interest you, my dear friends, family, and online diabetes peeps, here&#8217;s another computing performance excerpt and link. (Working on &#8230; <a href="http://jeffmatherphotography.com/dispatches/2012/01/welcome-to-herb-sutters-jungle/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>In an effort to keep posting something here until I&#8217;m in the right place mentally to write about things that probably interest you, my dear friends, family, and online diabetes peeps, here&#8217;s another computing performance excerpt and link. (Working on this stuff is the 9-5 part of your favorite international playboy&#8217;s life.)</p>
<p><br clear="all" />A half-decade after Herb Sutter wrote that <a href="http://drdobbs.com/architecture-and-design/184405990">the &#8220;free lunch&#8221; of Moore&#8217;s Law is over</a>, he&#8217;s back with his prophet&#8217;s wisdom about where we&#8217;re going in his January Dr. Dobb&#8217;s article, <a href="http://drdobbs.com/parallel/232400273">&#8220;Welcome to the Jungle&#8221;</a>. I&#8217;ll give you a moment to decide whether to get the Guns N&#8217; Roses song out of your head or use it as a backdrop for this juicy quotation:</p>
<blockquote><p>If hardware designers merely use Moore&#8217;s Law to deliver more big fat cores, on-device hardware parallelism will stay in double digits for the next decade, which is very roughly when Moore&#8217;s Law is due to sputter, give or take about a half decade. If hardware follows Niagara&#8217;s and MIC&#8217;s lead to go back to simpler cores, we&#8217;ll see a one-time jump and then stay in triple digits. If we all learn to leverage GPUs, we already have 1,500-way parallelism in modern graphics cards (I&#8217;ll say &#8220;cores&#8221; for convenience, though that word means something a little different on GPUs) and likely reach five digits in the decade timeframe.</p>
<p>But all of that is eclipsed by the scalability of the cloud, whose growth line is already steeper than Moore&#8217;s Law because we&#8217;re better at quickly deploying and using cost-effective networked machines than we&#8217;ve been at quickly jam-packing and harnessing cost-effective transistors. It&#8217;s hard to get data on the current largest cloud deployments because many projects are private, but the largest documented public cloud apps (which don&#8217;t use GPUs) are already harnessing over 30,000 cores for a single computation. I wouldn&#8217;t be surprised if some projects are exceeding 100,000 cores today. And that&#8217;s general-purpose cores; if you add GPU-capable nodes to the mix, add two more zeroes.</p>
</blockquote>
<p><a href="http://jeffmatherphotography.com/dispatches/2012/01/welcome-to-herb-sutters-jungle/herb13/" rel="attachment wp-att-4343"><img src="http://jeffmatherphotography.com/dispatches_wp/wp-content/uploads/2012/01/herb13-800x450.gif" alt="" title="Scalability of different architectures" width="640" height="360" class="alignleft size-large wp-image-4343" /></a></p>
<p>The big takeaway for software engineers like me is that we&#8217;d best be learning how to develop solutions using the emerging APIs so that we can harness all of those extra orders of magnitude of scalability. That involves figuring out how to&nbsp;.&nbsp;.&nbsp;.</p>
<ul>
<li>Deal with the processor axis&#8217; lower section [of Sutter's chart] by supporting compute cores with different performance (big/fast, slow/small).</li>
<li>Deal with the processor axis&#8217; upper section by supporting language subsets, to allow for cores with different capabilities including that not all fully support mainstream language features.</li>
<li>Deal with the memory axis for computation, by providing distributed algorithms that can scale not just locally but also across a compute cloud.</li>
<li>Deal with the memory axis for data, by providing distributed data containers, which can be spread across many nodes.</li>
<li>Enable a unified programming model that can handle the entire [memory/locality/processor] chart with the same source code.</li>
</ul>
<blockquote><p>Perhaps our most difficult mental adjustment, however, will be to learn to think of the cloud as part of the mainstream machine — to view all these local and non-local cores as being equally part of the target machine that executes our application, where the network is just another bus that connects us to more cores. That is, in a few years we will write code for mainstream machines assuming that they have million-way parallelism, of which only thousand-way parallelism is guaranteed to always be available (when out of WiFi range).&nbsp;.&nbsp;.&nbsp;.</p>
<p>If you haven&#8217;t done so already, now is the time to take a hard look at the design of your applications, determine what existing features — or better still, what potential and currently unimaginable demanding new features — are CPU-sensitive now or are likely to become so soon, and identify how those places could benefit from local and distributed parallelism. Now is also the time for you and your team to grok the requirements, pitfalls, styles, and idioms of hetero-parallel (e.g., GPGPU) and cloud programming (e.g., Amazon Web Services, Microsoft Azure, Google App Engine).</p>
</blockquote>
<p><br clear="all" />p.s.&nbsp;&mdash;&nbsp;I can&#8217;t believe that it&#8217;s been almost four years since I took a course with Herb out in Washington. That was some <a href="http://jeffmatherphotography.com/dispatches/2008/05/traveling-again/">hard-core learnin&#8217;</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2012/01/welcome-to-herb-sutters-jungle/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>We Need a New Mindset</title>
		<link>http://jeffmatherphotography.com/dispatches/2012/01/we-need-a-new-mindset/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2012/01/we-need-a-new-mindset/#comments</comments>
		<pubDate>Fri, 27 Jan 2012 18:53:19 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=4331</guid>
		<description><![CDATA[Guy Steele drops a truth bomb. (From How to Think about Parallel Programming: Not!)]]></description>
			<content:encoded><![CDATA[<p>Guy Steele drops a truth bomb.</p>
<p><a href="http://jeffmatherphotography.com/dispatches/2012/01/we-need-a-new-mindset/steele/" rel="attachment wp-att-4332"><img src="http://jeffmatherphotography.com/dispatches_wp/wp-content/uploads/2012/01/steele-660x500.png" alt="" title="Guy Steele - We Need a New Mindset" width="640" height="484" class="alignleft size-large wp-image-4332" /></a></p>
<p>(From <a href="http://www.infoq.com/presentations/Thinking-Parallel-Programming">How to Think about Parallel Programming: Not!</a>)</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2012/01/we-need-a-new-mindset/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thinking Differently about Software Optimization</title>
		<link>http://jeffmatherphotography.com/dispatches/2012/01/thinking-differently-about-software-optimization/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2012/01/thinking-differently-about-software-optimization/#comments</comments>
		<pubDate>Thu, 26 Jan 2012 21:11:22 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[From the Yellow Notepad]]></category>
		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=4302</guid>
		<description><![CDATA[Yesterday morning while eating my &#8220;Free Wednesday Breakfast&#8221; chocolate croissant and fresh fruit with yoghurt, I watched an interview with John Nolan entitled &#8220;The State of Hardware Acceleration with GPUs/FPGAs, Parallel Algorithm Design.&#8221; In the spirit of giving back, I&#8217;m &#8230; <a href="http://jeffmatherphotography.com/dispatches/2012/01/thinking-differently-about-software-optimization/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Yesterday morning while eating my &#8220;Free Wednesday Breakfast&#8221; chocolate croissant and fresh fruit with yoghurt, I watched an interview with John Nolan entitled &#8220;<a href="http://www.infoq.com/interviews/nolan-hardware-acceleration">The State of Hardware Acceleration with GPUs/FPGAs, Parallel Algorithm Design</a>.&#8221; In the spirit of giving back, I&#8217;m posting a few notes.</p>
<ul>
<li>When optimizing code for GPU, FPGA, or CPU, definitely focus on pipelining and overall throughput, not just local optimizations.</li>
<li>There&#8217;s a trade-off between &#8220;faster&#8221; and &#8220;sooner.&#8221; It&#8217;s not always worth saving a few seconds (or even a few minutes) if the kernels take hours or days to compile. (Then again, sometimes it is.)</li>
<li>Try to reduce dependence on the language/compiler &#8220;stack&#8221; that removes inefficiencies. The optimizer does good work, but you can do things to help it. Think about the hardware or architecture format. It&#8217;s not a sin to reduce the amount of abstraction in the service of performance. Pay attention to things that affect processor pipelining and cache movement.</li>
<li>BTW, some languages and technologies exist to provide higher level programming that&#8217;s close to the hardware, but they&#8217;re proprietary, secret, or still in R&#038;D.</li>
<li>Use algorithmic optimization techniques. Step back and find the shortest-time computation.</li>
<li>Avoid using <tt>if</tt> statements. The <tt>goto</tt> construct is considered harmful, but <tt>if</tt> is basically the same thing. Instead think about state machines and polymorphism. There&#8217;s no branch-prediction penalty to pay, since the system &#8220;just is&#8221; in the state it&#8217;s supposed to be in. The logic is clearer, because there are no switches, making it easier to test, too.</li>
<li>Don&#8217;t always assume that floating-point values are necessary. Integers can often be creatively used and are far faster for math than double-precision numbers.</li>
<li>Of course, there&#8217;s a compromise between speedy/efficient and readable/maintainable.</li>
<li>Aim to structure programs as &#8220;symbolic intent.&#8221; Mathematical descriptions are bad ways of expressing programs. Think about functional programming models instead of procedural.</li>
</ul>
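<p>Here is a minimal sketch of the state-machine idea from the notes above. It is an illustration under stated assumptions, not code from the interview: the traffic-light states, the <tt>TRANSITIONS</tt> table, and the <tt>step</tt> and <tt>run</tt> helpers are all hypothetical names. The next state comes from a table lookup rather than a chain of <tt>if</tt> tests. Python is used for brevity, so this shows the structure only, not any branch-prediction benefit.</p>
<pre>
```python
# Hypothetical example: a traffic light as a table-driven state machine.
# Each (state, event) pair maps directly to the next state, so stepping
# the machine is a dictionary lookup instead of an if/elif chain.
TRANSITIONS = {
    ("green", "timer"): "yellow",
    ("yellow", "timer"): "red",
    ("red", "timer"): "green",
}


def step(state, event):
    """Return the next state; unknown (state, event) pairs keep the state."""
    return TRANSITIONS.get((state, event), state)


def run(start, events):
    """Feed a sequence of events through the machine; return the final state."""
    state = start
    for event in events:
        state = step(state, event)
    return state


print(run("green", ["timer", "timer"]))  # prints "red"
```
</pre>
<p>Adding a state or an event means adding rows to the table, and each row can be tested on its own.</p>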
<p>If you want to know more, you should definitely watch the half-hour interview. And if your reaction was more along the lines of <i>&#8220;Yes, yes; that&#8217;s all true, and it&#8217;s how I design my image processing code,&#8221;</i> then I definitely hope you&#8217;ll consider <a href="http://jeffmatherphotography.com/dispatches/2012/01/now-hiring-image-processing-software-engineers/">applying for the GPU/multicore</a> engineering position we have open.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2012/01/thinking-differently-about-software-optimization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>QCon SF 2011 Software Engineering Conference Notes</title>
		<link>http://jeffmatherphotography.com/dispatches/2011/12/qcon-sf-2011-software-engineering-conference-notes/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2011/12/qcon-sf-2011-software-engineering-conference-notes/#comments</comments>
		<pubDate>Thu, 29 Dec 2011 14:21:32 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[From the Yellow Notepad]]></category>
		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=4235</guid>
		<description><![CDATA[It&#8217;s sometimes possible to forget when reading all of the posts here about travel, diabetes, triathlon, and photography that they&#8217;re just a small part of my life. I have a job to which I devote a whole lot more time. &#8230; <a href="http://jeffmatherphotography.com/dispatches/2011/12/qcon-sf-2011-software-engineering-conference-notes/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s sometimes possible to forget when reading all of the posts here about <a href="http://jeffmatherphotography.com/dispatches/category/travel/">travel</a>, <a href="http://jeffmatherphotography.com/dispatches/category/diabetes/">diabetes</a>, <a href="http://jeffmatherphotography.com/dispatches/category/reluctant-triathlete/">triathlon</a>, and <a href="http://jeffmatherphotography.com/dispatches/category/photography/">photography</a> that they&#8217;re just a small part of my life. I have a job to which I devote a whole lot more time. I don&#8217;t talk about it much because (a) discussing what I&#8217;m working on putting into the <a href="http://www.mathworks.com/products/image/">Image Processing Toolbox</a> isn&#8217;t appropriate or allowed, and even if it were (b) talking shop probably isn&#8217;t that interesting to most of the people here. But&mdash;believe it or not&mdash;the majority of traffic to my site lands on the pages that are technical, so I don&#8217;t feel so bad about posting the random <a href="http://jeffmatherphotography.com/dispatches/category/fodder-for-techno-weenies/">&#8220;fodder for techno-weenies&#8221;</a> post. (It&#8217;s a term of endearment, I promise! :^)</p>
<p>This is another one of those posts. Every year between Christmas and New Year&#8217;s Day, I try to use the quiet week to get stuff done and tie up loose ends. Last year, <a href="http://jeffmatherphotography.com/dispatches/2010/12/requirements-again/">I cleared out a bunch of notes</a>. This year, I&#8217;m looking at presentations and slides from the <a href="http://qconsf.com/sf2011/">QCon SF 2011 conference</a> (<a href="http://www.infoq.com/articles/QCon-San-Francisco-2011">wrap-up</a>). Its focus on software architecture and project management covers about 75% of my job, so many of the presentations seemed tailor-made for me. Here&#8217;s some of what I learned.</p>
<p><br clear="all" /><a href="http://qconsf.com/dl/qcon-sanfran-2011/slides/ErikDoernenburg_SoftwareQualityYouKnowItWhenYouSeeIt.pdf">Erik Doernenburg. &#8220;Software Quality: You Know It When You See It&#8221;</a> has a really good slide deck that got me thinking about some projects I might want to set up. It&#8217;s full of practical, usable suggestions:</p>
<ul>
<li>View the code from the <a href="http://97things.oreilly.com/wiki/index.php/Get_the_1000ft_view">1,000-foot view</a>, rather than from ground level or 30,000 feet.</li>
<li>Look at the test-to-code ratio, not just code coverage.</li>
<li>Graph the change of metrics between versions and revisions, compare across different parts of the code, and look at them relative to industry standards.</li>
<li><a href="http://erik.doernenburg.com/2008/11/how-toxic-is-your-code/">Measure the &#8220;toxicity&#8221; of code</a> by rolling up various quality metrics about a bunch of modules into stacked bar charts.</li>
</ul>
<p>We should pose these questions during design and code reviews:</p>
<ul>
<li>Is the software/change of value to its users?</li>
<li>How appropriate is the design?</li>
<li>How easy is the code/design to understand and extend?</li>
<li>How maintainable is the software?</li>
</ul>
<p>It was full of some really great links to things like <a href="http://erik.doernenburg.com/2010/05/metrics-tree-maps/">Metrics tree maps</a> (a.k.a., pretty heatmaps for source code) as well as a few tools: <a href="http://www.campwoodsw.com/sourcemonitor.html">SourceMonitor</a>, <a href="http://loose.upt.ro/reengineering/research/iplasma">iPlasma</a>, and <a href="http://erik.doernenburg.com/2009/07/moose-mse-for-java-and-cs/">using Moose to visualize quality</a>.</p>
<p><br clear="all" /><a href="http://qconsf.com/dl/qcon-sanfran-2011/slides/JoshuaKerievsky_RefactoringToPatterns.pdf">Joshua Kerievsky. &#8220;Refactoring to Patterns&#8221;</a> &mdash; some notes:</p>
<ul>
<li>Refactoring is like algebra&#8217;s equivalence-preserving manipulations. &#8220;Design patterns are the word problems of the programming world; refactoring is its algebra.&#8221;</li>
<li>Understanding the refactoring thought process is more important than remembering individual techniques or tool support.</li>
<li>Code smells have multiple refactoring options and often benefit from composite refactorings.</li>
<li>Look for automatable refactorings first. Consider changing the client of smelly code before the smelly code itself.</li>
</ul>
<p><br clear="all" /><a href="http://qconsf.com/dl/qcon-sanfran-2011/slides/GuilhermeSilveira_HowToStopWritingNextYearsUnsustainablePieceOfCode.pdf">Guilherme Silveira. &#8220;How To Stop Writing Next Year&#8217;s Unsustainable Piece Of Code&#8221;</a> was pithy and thought-provoking.</p>
<ul>
<li>There is no value for architecture or design without implementation. That&#8217;s just interpretation of the software.</li>
<li>&#8220;New language. New mindset. New idiomatic usage. Same mistakes.&#8221;</li>
<li>Complexity and composition are natural and good, but if they&#8217;re invisible, they&#8217;re evil.</li>
<li>Start with a mess and refactor right away. Starting &#8220;right&#8221; is hard (and <a href="http://jeffmatherphotography.com/dispatches/2008/09/all-good-writing-is-rewriting/">misguided thinking</a>). Refactor for <i>better</i>, not just prettier.</li>
<li>Make complexity easier to understand and see.</li>
<li>Hiding complexity in concision hurts testability, since no one knows the complexity is there. Furthermore, if it&#8217;s hard to test, it&#8217;s also hard to use correctly.</li>
<li>&#8220;Model rules. Do not model models.&#8221;</li>
</ul>
<p><br clear="all" /><a href="http://qconsf.com/dl/qcon-sanfran-2011/slides/MichaelFeathers_SoftwareNaturalismEmbracingTheRealBehindTheIdeal.pdf">Michael Feathers. &#8220;Software Naturalism: Embracing The Real Behind The Ideal&#8221;</a> is a presentation that I would like to see/hear, since the slides seemed full of information but weren&#8217;t self-explanatory. Here are two things I could glean: 80% of software defects in large projects were in 20% of the files. In general, the more churn in a file, the more complex it tends to be.</p>
<p><br clear="all" /><a href="http://www.infoq.com/presentations/Panel-Objects-On-Trial">Panel: &#8220;Objects on Trial&#8221;</a> was perhaps the most unusual presentation, since it was a mock trial. I use objects all the time&nbsp;.&nbsp;.&nbsp;. some of them are good&nbsp;.&nbsp;.&nbsp;. some <a href="http://jeffmatherphotography.com/dispatches/2008/11/surveying-quality-in-object-oriented-design/">demonstrably so</a>. Even so, I never latched onto the idea of object-oriented (OO) design versus objects as types. The four panelists, in one way or another, basically said, &#8220;That&#8217;s the problem.&#8221;</p>
<p>One of the panelists drew an extended analogy between the space program and OO. The space shuttle (which we all love) was fixated on reuse but basically was a waste of heavy lifting; people don&#8217;t reuse the right stuff. In software, object reuse is largely accomplished by cut-and-paste copying of boilerplate code that does close to what you want. Of course, the panelist acknowledged that we do reuse the ideas in OO via design patterns, and no one seems to have much of a problem with that. Ironically, having a rich pattern language means that software engineers are in a better place than ever before to use objects correctly.</p>
<p>A key problem with our approach to objects is that we&#8217;ve failed (generally in software engineering) to handle complexity well, which was supposed to be the point of OO design. A conflation of beauty and OO design makes things worse. Internally, software is ugly, and beauty shouldn&#8217;t be a goal. Making a fetish of beauty makes code inflexible because people don&#8217;t want to extend the beautiful thing that works.</p>
<p>For other panelists, objects weren&#8217;t the problem at all. For them it&#8217;s static typing in &#8220;OO languages,&#8221; such as C++, Java, and C#. We&#8217;re at a place now where all of the good things about OO have been lost in an attempt to make OO languages as fast as C. This runs counter to the goal of having &#8220;ordinary,&#8221; understandable code. Generic programming using strongly typed (possibly template-heavy) languages just makes everything complicated.</p>
<p>For me, it&#8217;s moot. C++ is what I use, and I don&#8217;t have a large proprietary object system that I can tap into for reuse. I&#8217;m in the camp that uses C++ objects to generate new types for data hiding and aggregation, as well as (to a lesser extent) reuse. But some of these types are generic, template classes that are hard to understand. I plead &#8220;no contest.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2011/12/qcon-sf-2011-software-engineering-conference-notes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Heart Rate Training?</title>
		<link>http://jeffmatherphotography.com/dispatches/2011/11/heart-rate-training/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2011/11/heart-rate-training/#comments</comments>
		<pubDate>Wed, 30 Nov 2011 03:44:27 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Cycling]]></category>
		<category><![CDATA[Data-betes]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[NaBloPoMo]]></category>
		<category><![CDATA[NaBloPoMo 2011]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=4084</guid>
		<description><![CDATA[How do you get faster at any endurance activity? Ironically, you get faster by doing it faster than usual. If you run every run at one pace or do every ride at the same tempo, then you&#8217;ll never progress. You &#8230; <a href="http://jeffmatherphotography.com/dispatches/2011/11/heart-rate-training/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>How do you get faster at any endurance activity? Ironically, you get faster by doing it faster than usual. If you run every run at one pace or do every ride at the same tempo, then you&#8217;ll never progress. You can only build up so much aerobic capacity, since you can only move so much blood and oxygen around. What you need to do is to work harder so that the muscles themselves are stronger and capable of giving more.</p>
<p>My running plan includes plenty of tempo running and interval sessions. And I&#8217;ve finally gotten to the point where there&#8217;s &#8220;normal swimming&#8221; and &#8220;harder swimming.&#8221; But how do I know how hard to work when cycling?</p>
<p>I think the answer is heart rate training, which is new to me. Have any of you had success doing this?</p>
<p>I&#8217;ve figured out several of the basic calculations based on my <i>computed maximum heart rate</i> (183 bpm) and <i>resting heart rate</i> (52 bpm). According to <a href="http://www.marathonguide.com/fitnesscalcs/heartrate1calc.cfm">an online calculator</a>, these are my <i>target heart rate zones</i>:</p>
<pre>Zone 1: 118-131
Zone 2: 131-144
Zone 3: 144-157 (Aerobic)
Zone 4: 157-170 (Anaerobic)
Zone 5: 170-183 (Maximal)</pre>
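<p>The zone edges above are consistent with the heart-rate-reserve (Karvonen) formula, target = resting + fraction &times; (max &minus; resting), which is presumably how the calculator arrives at them. Below is a small sketch that reproduces the numbers, assuming five bands of 10% each from 50% to 100% of the reserve and rounding to the nearest beat. The function name <tt>zone_bounds</tt> is hypothetical.</p>
<pre>
```python
def zone_bounds(max_hr, resting_hr):
    """Return (low, high) bpm pairs for five zones based on heart-rate reserve.

    Assumes each zone covers 10% of the reserve, from 50% up to 100%,
    with edges rounded half up to the nearest beat.
    """
    reserve = max_hr - resting_hr
    edges = []
    for pct in range(50, 101, 10):
        target = resting_hr + reserve * pct / 100
        edges.append(int(target + 0.5))  # round half up for positive values
    # Pair each edge with the next one to form the zone ranges.
    return list(zip(edges[:-1], edges[1:]))


for number, (low, high) in enumerate(zone_bounds(183, 52), start=1):
    print(f"Zone {number}: {low}-{high}")
```
</pre>
<p>With a maximum of 183 bpm and a resting rate of 52 bpm, the reserve is 131 bpm, so each zone is about 13 beats wide.</p>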
<p>Now, where do I go from here?</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2011/11/heart-rate-training/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Just Give Me the Dorky Helmet and I Will Be &#8220;That Guy&#8221;</title>
		<link>http://jeffmatherphotography.com/dispatches/2011/11/just-give-me-the-dorky-helmet-and-i-will-be-that-guy/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2011/11/just-give-me-the-dorky-helmet-and-i-will-be-that-guy/#comments</comments>
		<pubDate>Wed, 23 Nov 2011 04:00:17 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Cycling]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[NaBloPoMo]]></category>
		<category><![CDATA[NaBloPoMo 2011]]></category>
		<category><![CDATA[Reluctant Triathlete]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=3946</guid>
		<description><![CDATA[In order not to bury the lede, I&#8217;m going to start with the big news: I put down a deposit on a new bike today, which I will pick up (hopefully) on Monday after a final fitting session. Here it &#8230; <a href="http://jeffmatherphotography.com/dispatches/2011/11/just-give-me-the-dorky-helmet-and-i-will-be-that-guy/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>In order not to bury the lede, I&#8217;m going to start with the big news: I put down a deposit on a new bike today, which I will pick up (hopefully) on Monday after a final fitting session. Here it is:</p>
<p><a href="http://jeffmatherphotography.com/dispatches/2011/11/just-give-me-the-dorky-helmet-and-i-will-be-that-guy/my-new-cervelo-p2-tri-bike/" rel="attachment wp-att-3950"><img src="http://jeffmatherphotography.com/dispatches_wp/wp-content/uploads/2011/11/IMG_2824-666x500.jpg" alt="" title="My new Cervelo P2 tri-bike" width="640" height="480" class="alignnone size-large wp-image-3950" /></a><br clear="all" /></p>
<p>You should know a few things about my decision to get this bike:</p>
<ol>
<li>I have been saving money to buy a new bike at some point in the future ever since I bought my last bike in 2009.</li>
<li>I hadn&#8217;t expected to buy a tri-bike after just one season.</li>
<li>Lisa suggested, out of the blue, that I should get one. Merry Christmas from my sweetie (and my slush fund)!</li>
<li>I&#8217;m deeply, deeply ambivalent about this bike for so many reasons.</li>
</ol>
<p>A couple of my coworkers and I joke that &#8220;triathlon is a rich white person&#8217;s sport.&#8221; And while I&#8217;ve seen people riding all sorts of bikes and wearing all manner of kit during various triathlons, the joke is uncomfortably close to the truth. Even though swimmers can wear just about anything (although a wetsuit will give you some advantage) and running is running is running regardless of your income level, anything involving a bike&mdash;including triathlon&mdash;is going to start to seem like thoroughbred horse racing. Cycling is where all of the money is spent, often for diminishing returns.</p>
<p>&#8220;$2,000 for an aero wheelset to save a couple of minutes over 112 miles? $500 to save 30 grams by switching pedals? Yes, those are just what I need to get me to Kona.&#8221; Um, right. It would be like most of us thinking we can buy our way to a Boston Qualifying time by getting better shoes. I&#8217;m not saying the advantages aren&#8217;t real or that people shouldn&#8217;t be able to spend their money in whatever race-legal way they want. It just seems that common sense goes out the window whenever our reptilian brains see anything associated with two wheels.</p>
<p>So how did I end up standing around in my bike shorts and jersey this afternoon at Landry&#8217;s bike store waiting to get fitted for a super-aero time trial/triathlon bike? Believe me, I&#8217;ve been asking myself that question and second-guessing myself and looking deep inside my athlete&#8217;s soul for the answer.</p>
<p><a href="http://jeffmatherphotography.com/dispatches/2011/11/just-give-me-the-dorky-helmet-and-i-will-be-that-guy/preparing-for-the-fit/" rel="attachment wp-att-3949"><img src="http://jeffmatherphotography.com/dispatches_wp/wp-content/uploads/2011/11/IMG_2823-666x500.jpg" alt="" title="Preparing for the fit" width="640" height="480" class="alignnone size-large wp-image-3949" /></a><br clear="all" /></p>
<p>I guess it&#8217;s the simple fact that tri-bikes are faster than the bike I have now. Not just a little bit faster. No. A lot faster. A couple years ago I was <a href="http://jeffmatherphotography.com/dispatches/2009/11/just-a-guy-with-diabetes-on-a-bike/">just a guy with diabetes on a bike</a>, but after this past year I consider myself an athlete. I feel I can be competitive in several of the races on next year&#8217;s docket, and I&#8217;d like to see how far I can go by giving my potential just a bit of help. [<a href="#3946fn1" name="3946fn1back">1</a>]</p>
<p>This is the big, up-front investment. While it&#8217;s true that you never really &#8220;just buy a bike,&#8221; I don&#8217;t plan on putting much additional loot into this one right away. I hope I&#8217;m more down-to-earth than that. After all, it&#8217;s &#8220;not about the bike.&#8221; There&#8217;s a whole lot of hard work for me to do to feel like I&#8217;m living up to the potential of this bike, and that&#8217;s a big motivator.</p>
<p><br clear="all" />Of course, maybe I&#8217;m just making a big deal out of nothing at all. Why don&#8217;t I just get on with telling you about the bike fit process.</p>
<p>When I bought my road bike (at a different store) the fitting was really simple: &#8220;Sit on the bike on a trainer while I get out my calipers and protractor and we will adjust the saddle.&#8221;</p>
<p>Today was nothing like that. Over the better part of an hour I was videotaped riding a highly reconfigurable &#8220;bike.&#8221; A specially trained bike shop guy reviewed the video frame by frame, measuring various angles, reconfiguring, retaping, remeasuring, repeating,&nbsp;.&nbsp;.&nbsp;. He used a laser level to aid in making extremely accurate measurements throughout the process.</p>
<p><a href="http://jeffmatherphotography.com/dispatches/2011/11/just-give-me-the-dorky-helmet-and-i-will-be-that-guy/my-fit-analysis/" rel="attachment wp-att-3947"><img src="http://jeffmatherphotography.com/dispatches_wp/wp-content/uploads/2011/11/IMG_2821-666x500.jpg" alt="" title="My fit analysis" width="640" height="480" class="alignnone size-large wp-image-3947" /></a><br clear="all" /></p>
<p>Having <a href="http://jeffmatherphotography.com/dispatches/2011/11/tape-delay/">recorded myself swimming</a>, I found it interesting to have video analysis applied to another event. (Now I just need someone to film me running and to analyze my stride, in order to complete my mini-series triathlon.) Turns out I have very long arms and legs and a rather short torso&mdash;which I already knew, since most of my clothes must be tailored in order to fit right. I also have very flexible hip, knee, and lower back joints, so I can get into a pretty aggressive aero position. That was unexpected.</p>
<p>Forty-five minutes after stripping down to my bike clothes (which I was wearing under my regular clothes [<a href="#3946fn2" name="3946fn2back">2</a>]) I was ready to get bike recommendations. Or rather <i>one</i> recommendation. &#8220;Can we at least pretend like I comparison shopped?&#8221; I asked; so we played that charade quite convincingly. If I got the less expensive bike, I would need to spend more money on accessories to get the same fit as the &#8220;right&#8221; bike. Or I could spend twice as much for the same fit with better components (like aero-er wheels and lighter pedals and shit like that).</p>
<p>And then, the soft sell: &#8220;Do you want to take it for a ride?&#8221; Without a helmet? &#8220;We can lend you a helmet.&#8221; Outdoors in the cold? &#8220;Do you want a jacket or something? [Seriously, man, HTFU.]&#8221; I can manage without one for a few minutes.</p>
<p>It wasn&#8217;t the same pure joy that I had when I first pedaled my road bike. That feeling of effortlessness and the fluidity was replaced by wobbliness as I got used to the equivalent of driving a Porsche with a go-kart&#8217;s steering wheel. Because I was in a residential neighborhood, I didn&#8217;t really get the chance to open it up; nevertheless I could feel the responsiveness of this almost weightless bike.</p>
<p>I can hardly wait to give it a try.</p>
<p><a href="http://jeffmatherphotography.com/dispatches/2011/11/just-give-me-the-dorky-helmet-and-i-will-be-that-guy/sleeping-dogs/" rel="attachment wp-att-3948"><img src="http://jeffmatherphotography.com/dispatches_wp/wp-content/uploads/2011/11/IMG_2822-666x500.jpg" alt="" title="Sleeping dogs" width="640" height="480" class="alignnone size-large wp-image-3948" /></a><br clear="all" /></p>
<p><a name="3946fn1"></a>1&nbsp;&mdash;&nbsp;2011 really was the year when I became an athlete. I competed in nine different races, and I used a couple of training plans to prepare for them. I was actually a bit haphazard in the events I chose, and I&#8217;m trying to be a little more focused next year. Of course, I&#8217;m also trying to keep a lot of the spontaneity and love for what I&#8217;m doing, which is precisely why I rebel against most triathlon training programs. &#8220;Life is choices.&#8221; [<a href="#3946fn1back">Back&nbsp;.&nbsp;.&nbsp;.</a>]</p>
<p><a name="3946fn2"></a>2&nbsp;&mdash;&nbsp;I put my new office gym membership to use for the first time today over lunch so that I could change clothes. OMG, those were some chatty guys in the locker-room with me. I thought we were genetically programmed to become mute and blind in the presence of other naked men. (It&#8217;s a mutation on the Y chromosome. That&#8217;s a fact. Look it up.) This is going to take some getting used to. [<a href="#3946fn2back">Back&nbsp;.&nbsp;.&nbsp;.</a>]</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2011/11/just-give-me-the-dorky-helmet-and-i-will-be-that-guy/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>What I Do</title>
		<link>http://jeffmatherphotography.com/dispatches/2011/08/what-i-do/</link>
		<comments>http://jeffmatherphotography.com/dispatches/2011/08/what-i-do/#comments</comments>
		<pubDate>Fri, 19 Aug 2011 14:07:11 +0000</pubDate>
		<dc:creator>Jeff Mather</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Fodder for Techno-weenies]]></category>

		<guid isPermaLink="false">http://jeffmatherphotography.com/dispatches/?p=2580</guid>
		<description><![CDATA[If you&#8217;ve ever wondered what I do for my 9-5, it&#8217;s figure out stuff like this and this.]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;ve ever wondered what I do for my 9-5, it&#8217;s figure out stuff like <a href="http://link.aip.org/link/JEIME5/v20/i3/p033004/pdf?type=ALERT" title="Journal of Electronic Imaging: Compute-unified device architecture implementation of a block-matching algorithm for multiple graphical processing unit cards. Massanes, et al." target="_blank">this</a> and <a href="http://www.akkadia.org/drepper/dsohowto.pdf" title="How to Write Shared Libraries. Drepper." target="_blank">this</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeffmatherphotography.com/dispatches/2011/08/what-i-do/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

