comment spam resistance
Having installed the latest version of WordPress, I lost my anti-spam customisations and the comment spam started pouring back in, at around 10 an hour.
Two ways of looking at the whole comment spam problem are classical AI and evolutionary dynamics.
From an AI point of view it boils down to a test which humans can pass but computers fail, a kind of Turing test. This is the approach taken by the CAPTCHA project, which tends to frame it as a classic hard image recognition problem. Apparently these are starting to be cracked by smarter spambots though, and humans have to really squint for some of the trickier ones.
The problem really arises because certain pieces of software are very popular, so it becomes worthwhile to target them for spam. Their relative uniformity means that if you can spam one, you can spam thousands. Rather than making the test more taxing while it stays the same on every site, I favour breaking up the monoculture and allowing each user to specify their own test. This could even be a trivial textual question (‘How many toenails do 10 people have all together?’). If you don’t get the answer right then your comment is rejected. If everyone makes up their own question then a successful spammer would need an AI that would make search engines drool.
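As a rough illustration of how little code such a check needs, here is a minimal sketch in Python. It is not WordPress code (WordPress itself is PHP) and not the setup I actually run. The names `QUESTION`, `ACCEPTED_ANSWERS`, `normalise` and `accept_comment` are all made up for the example, and the accepted answers assume 10 toenails per person.

```python
import re

# Hypothetical per-site settings: each blog owner writes their own
# question and their own list of acceptable answers.
QUESTION = "How many toenails do 10 people have all together?"
ACCEPTED_ANSWERS = {"100", "one hundred", "a hundred"}  # assumes 10 toenails each


def normalise(text: str) -> str:
    """Lower-case, trim, and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.strip().lower())


def accept_comment(answer: str, accepted=ACCEPTED_ANSWERS) -> bool:
    """Return True if the submitted answer matches an accepted answer."""
    return normalise(answer) in {normalise(a) for a in accepted}
```

A comment form would show `QUESTION` next to the comment box and drop the comment whenever `accept_comment` returns False. The matching is deliberately forgiving about case and spacing, since the point is to stop generic bots, not to trip up humans.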
I’ll be starting with the simplest possible test and increasing the difficulty as required. I don’t think I’ll have to raise the bar too high.
