blogging How To Programming spam The Internet Web Design WordPress

Fighting Spam on a Diet – How to fix Akismet Performance Problems

Running into strange WordPress performance problems and database errors?  Akismet could be the culprit, but we're in luck, it's an easy fix. Earlier I wrote a bit about our encounter with vicious, robotic Chinese comment spammers.  Since then we've had a few further issues, and I think I've found the culprit - Akismet, the plugin we've been using to fight the spam. First off, let me say that I think Akismet is a great plugin.  While we had hundreds of spams come in for a few days in a row, not one made it out to the public.  Very nice.  But it is a bit too aggressive in one spot, and that can slow down your blog or lock up the comment table, filling your max_connections. The problem is in akismet.php, specifically the akismet_delete_old() function.  Look for the following lines:
$n = mt_rand(1, 5); if ( $n % 5 ) $wpdb->query("OPTIMIZE TABLE $wpdb->comments");
Those of you with PHP / MySQL experience will recognize the problem immediately.  For the less code-literate, this is creating a random number between 1 and 5, and if the number has a remainder after being divided by 5, it runs and OPTIMIZE TABLE on the comments table.  That means that at random, it will lock the entire table and compute statistics after 80% off all deletes. Now, it's a good idea to optimize your tables after a large number of deletes.  But it is a pretty expensive operation, because it could be rearranging things on disk to free up space. Now, imagine you get hit by a spam bot and end up with a couple hundred spam comments.  Akismet catches them all, and 15 days later tries to delete them all in one big loop.  One big loop filled with a couple hundred table-locking, disk-intensive database operations. But it's easy to fix.  Replace the lines above with this:
$n = mt_rand(1, 100); if ( $n == 42 ) $wpdb->query("OPTIMIZE TABLE $wpdb->comments");
That will only optimize the table on average once out of 100 comments deleted.  Why 100?  It's an educated guess.  According to the MySQL documentation, at most you will need to optimize a table once a month or so, maybe once a week if you have a large number of deletes or edits on varchar fields. Why did I pick 42 for the one value out of a hundred that triggers an optimization?  You're asking the wrong question.