
comment spam resistance

Having installed the latest version of WordPress, I lost my anti-spam customisations and the comment spam started pouring back in, at around 10 an hour.

Two paradigms for thinking about the whole comment spam problem are classical AI and evolutionary dynamics.

From an AI point of view it boils down to a test which humans can pass but computers fail, a kind of Turing test. This is the approach taken by the CAPTCHA project, which tends to frame it as a classic hard image recognition problem. Apparently these are starting to be cracked by smarter spambots, though, and humans have to really squint at some of the trickier ones.

The problem really arises because certain pieces of software are very popular, so it becomes worthwhile to target them for spam. Their relative uniformity means that if you can spam one, you can spam thousands. Rather than making the tasks more taxing (yet still common to every site), I favour breaking up the monoculture and allowing the user to specify their own test. This could even be a trivial textual question (‘How many toenails do 10 people have all together?’). If you don’t get the answer right then your comment is rejected. If everyone makes up their own question then a successful spammer would need an AI that would make search engines drool.

I’ll be starting with the simplest possible test and increasing difficulty as required. Don’t think I’ll have to raise the bar too high.
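
For the curious, the simplest possible test might look something like the sketch below. It is written in Python rather than as a WordPress plugin, and every name in it (QUESTION, ACCEPTED, handle_comment and so on) is made up for illustration. The accepted answers assume the usual ten toenails per person.

```python
import re
from typing import Iterable

# Hypothetical per-site settings: each blog owner makes up their own.
QUESTION = "How many toenails do 10 people have all together?"
ACCEPTED = {"100", "one hundred", "a hundred"}  # assumes ten toenails each


def normalise(text: str) -> str:
    """Lower-case the text and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def passes_test(submitted: str, accepted: Iterable[str]) -> bool:
    """True if the submitted answer matches any accepted answer."""
    return normalise(submitted) in {normalise(a) for a in accepted}


def handle_comment(comment_body: str, answer: str) -> str:
    """Reject the comment outright unless the question was answered correctly."""
    if not passes_test(answer, ACCEPTED):
        return "rejected"
    return "accepted"
```

Normalising the answer means a human who types " One Hundred " still gets through, while a bot that leaves the field blank or fills it with junk does not. The protection comes from every site having a different question, not from the question being hard.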

unexpected linkage

After refreshing myself on a nice little evolutionary algorithm called ECGA I googled for the author, Georges Harik. Turns out he’s now director of Googlettes, the experimental lab at Google.

As a by-product, I stumbled across the blog for IlliGAL and a somewhat cheesy web radio interview.

Being able to go from reading a technical paper on probabilistic models of genetic algorithms to hearing the author explain the thinking behind Gmail is one example of why I’m still an excitable web-monkey.

Brum Films

Here’s a simple little thing that you might want to bookmark on your XHTML phone, or even your PC, if you’re a Brummie who likes to go to the flicks on a whim.

All it does is tell you what’s on at the big cinemas in Birmingham for the remainder of today. Because that is all you usually want to know, and the alternative sources are generally a pain in the arse.

It’s generated from an ugly but parseable e-mail I get sent each week. Comments and suggestions welcome on this beta.
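
The "remainder of today" part is the only real logic. A rough Python sketch of the idea is below, assuming the e-mail has already been parsed into (cinema, film, start time) entries for today. The entries and names here are placeholders, not the real listings or the real code.

```python
from datetime import datetime, time

# Placeholder data standing in for today's parsed listings.
LISTINGS = [
    ("Cinema A", "Film One", time(13, 30)),
    ("Cinema A", "Film Two", time(20, 15)),
    ("Cinema B", "Film One", time(18, 45)),
]


def remaining_today(listings, now):
    """Return showings that start at or after `now`, earliest first."""
    upcoming = [entry for entry in listings if entry[2] >= now.time()]
    return sorted(upcoming, key=lambda entry: entry[2])


if __name__ == "__main__":
    for cinema, film, start in remaining_today(LISTINGS, datetime.now()):
        print(f"{start:%H:%M}  {film} ({cinema})")
```

Sorting by start time rather than by cinema suits the whim-driven use case: the first line is the next thing you could actually get to.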

blog shuffle

Updated my blogging software today. Haven’t got time to learn all the ins and outs of the system, but I have started shovelling stuff over from the front page with a view to making the blog the base for the whole site.