While I was at Grinnell, I wrote a lot of papers. More than 70 by my count. Over time I learned to embrace the maxim that “all good writing is rewriting.”
Now, as a software engineer, I write proposals, technical specifications, documentation and (of course) e-mails. All of these require writing, editing, proofreading and rewriting. This shouldn’t surprise anyone.
What you may find surprising is this: Writing software follows the same pattern. Write, edit, proofread and rewrite. No matter what the creative endeavor, no one should expect to get the final product on the first pass. And writing software is a fundamentally creative activity. Usually, it takes several revisions to get “good code.” (The exact number depends on the target audience and how much you want to impress them.)
When writing software — as opposed to prose — I tend to look at the following things when editing my own code:
- Are the statements/commands in the most logical order for reading the code?
- Can I hide distracting details in another procedure, module or datatype?
- Can I be more terse without obfuscating the code’s meaning?
- Are there more advanced language features I can use to hide complexity?
- How can I change this routine to make it the “right size”?
- What assumptions do I need to make explicit?
- What exceptional situations have I not handled?
- How can I improve the names of variables and functions within the module?
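To make a few of these questions concrete, here is a minimal sketch in Python. It is a made-up illustration, not code from any real project: the function names (`proc`, `present_values`, `mean_of_present`) and the "readings" scenario are all assumptions invented for the example. The first function is a plausible first draft; the two after it show what one editing pass might produce by improving names, hiding a distracting detail in a helper, stating an assumption and handling an exceptional case.

```python
# First draft (hypothetical): it works, but the names are vague,
# the assumptions are implicit and an empty input fails obscurely.
def proc(d):
    t = 0
    n = 0
    for x in d:
        if x is not None:
            t = t + x
            n = n + 1
    return t / n


# After one editing pass (still hypothetical): same result for valid input.
def present_values(readings):
    """Return the readings that are not missing (None)."""
    # The filtering detail now lives in its own small helper.
    return [value for value in readings if value is not None]


def mean_of_present(readings):
    """Average the non-missing readings.

    Assumes every non-missing reading is a number.
    Raises ValueError when there is nothing to average.
    """
    values = present_values(readings)
    if not values:
        # The exceptional situation the first draft did not handle.
        raise ValueError("no readings to average")
    return sum(values) / len(values)
```

For any input with at least one non-missing reading, both versions return the same value, so the edit preserves behavior. The only intended difference is the failure mode: the draft dies with a `ZeroDivisionError` on empty input, while the revision raises an error that names the actual problem.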
I find that editing code works best when I treat it as if I were editing prose. That is, I print it out, get out a red pen and mark the copy. I need to be able to make notes, strike things out, bunch things together and jot the outline of code I want to add. I don’t want to get bogged down actually making the changes while I’m being critical, yet when I’m done I have something like a checklist that I can use while rewriting.
Here’s an example of some recent rewriting of a feature that’s probably going into MATLAB sometime in the next six months. Because we haven’t shipped it to customers, I feel compelled to fuzz out parts that might let you figure out exactly what it does. But hopefully there’s enough detail there for you to see the evolution of the code.
It’s also worth noting that the underlying functional design hardly changed between the first and last versions. The changes were meant to clarify the design and facilitate maintenance and enhancements. But it’s not true refactoring, since I did change some of the functionality between project iterations. (That almost always happened independently of rewriting, though. Rewrites should be behavior-preserving. Then add new features, submit to source control and edit/rewrite again.)
The first two scans below show the first and fourth revisions of the file; they’re pretty heavily marked. The last scan is the current/final version.