In the spirit of kaizen (continuous improvement) and eliminating muda (waste), here is a mini failure analysis of the attack on my web site.

The effective cause of the failure was a SQL injection attack on my site, most likely caused by a worm/virus that I encountered on a public computer while using the Movable Type publishing interface. This attack led to (1) compromised HTML pages containing malicious JavaScript that caused Internet Explorer to download the worm onto unprotected computers, (2) compromised Movable Type template pages, and (3) malicious PHP and .htaccess files, which transparently redirected malformed URLs to external sites of dubious legality.

A number of root causes contributed to the failure: using insecure public computers, leaving directories world-writable (mode 777) for Movable Type, not running Movable Type in a safer wrapped CGI mode, and not running PHP in “safe” mode, which prevents certain kinds of file writing. I could have discovered the problem earlier by examining the server logs (I’m fortunate enough to have access to them) and by digging deeper into unexpected “file not found” behavior when I misspelled URLs (I assumed it was Mozilla automatically searching for sites on a 404 error).
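
A periodic check for world-writable directories would have flagged one of these root causes early. The following is a minimal sketch in Python, not something that was actually run on the site; the function name `find_world_writable_dirs` is made up for illustration, and it assumes a POSIX-style filesystem.

```python
import os
import stat


def find_world_writable_dirs(root):
    """Return directories at or below root that any user can write to."""
    hits = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        try:
            mode = os.lstat(dirpath).st_mode
        except OSError:
            # Skip directories that vanish or cannot be inspected.
            continue
        # S_IWOTH is the "other users may write" bit (the final 2 in 777).
        if stat.S_ISDIR(mode) and mode & stat.S_IWOTH:
            hits.append(dirpath)
    return sorted(hits)
```

Calling it on the web root (for example `find_world_writable_dirs(".")`) returns a sorted list of paths to review; an empty list means nothing under that root is world-writable.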

The cost of the failure is hard to quantify in terms of lost opportunities, but since Google starts to treat sites with large numbers of these redirects as fraudulent, there must certainly be some damage to my site’s findability. Of the roughly 580 pages Google has indexed for my domain, roughly 300-350 are not actually part of my site. This damages my credibility, both for readers who stumble across redirected pages while attempting to reach my site and for those who see spam when they search Google for me.

In terms of bandwidth and site utilization, for the first thirteen days of January there were just over 14,000 requests to my web server, after stripping out certain 404 responses. Of these, roughly 6,900 were legitimate requests for pages, images, stylesheets, RSS feeds, etc. The remaining 7,100 were all attempts to access the bogus pages, most of which succeeded and accounted for 5.9 megabytes of bandwidth. Google and other web crawlers, busy cataloging these bogus pages, accounted for more than 10% of this traffic.
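
A tally like the one above can be produced with a short script over the access log. This is only a sketch under assumptions: it assumes the server writes Common Log Format lines, and it leaves the definition of a "bogus" path to the caller through the hypothetical `is_bogus` parameter, since the rule for recognizing the injected pages is specific to each site.

```python
import re

# Common Log Format: host ident user [time] "METHOD path PROTO" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def tally(lines, is_bogus):
    """Count requests and bytes, split into legitimate and bogus paths."""
    totals = {
        "legit": {"requests": 0, "bytes": 0},
        "bogus": {"requests": 0, "bytes": 0},
    }
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # ignore lines that are not in the expected format
        bucket = "bogus" if is_bogus(match["path"]) else "legit"
        # A size of "-" means no response body was sent.
        size = 0 if match["size"] == "-" else int(match["size"])
        totals[bucket]["requests"] += 1
        totals[bucket]["bytes"] += size
    return totals
```

Passing an open log file and a predicate such as `lambda path: path.startswith("/some-injected-dir/")` (a placeholder prefix, not the real one) yields request and byte counts for each group.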

It’s unclear when this nefarious traffic will abate. In the 12 hours since I put the countermeasures in place, 250 invalid requests came through; fortunately these were all blocked, but that is roughly the same pace as before.
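
The "same pace" comparison can be checked with the figures quoted above. This quick calculation assumes the bogus requests were spread evenly over the thirteen days, which the logs may not bear out.

```python
# Figures quoted in the post.
bogus_requests_before = 7_100  # bogus requests in the first 13 days of January
days_observed = 13
blocked_after = 250            # invalid requests in the 12 hours after the fix

# Average bogus requests per 12-hour window before the countermeasures.
per_12h_before = bogus_requests_before / days_observed / 2
print(round(per_12h_before))  # → 273
```

So 250 blocked requests in 12 hours is only slightly below the earlier average of about 273.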