Re: Here come the spammers!!!
PART 3 – STRONG -vs- WEAK METHODS
When it comes to spam on BP sites, you’ll see all sorts of stuff posted on blogs saying “change [whatever] on your site and your spam problem will disappear”.
Truthfully, a lot of these tricks will actually work …for a while… but eventually, the spammer makes a minor change to their bot, and they’re back in business. In fact, many of the leading blog spamming packages include sophisticated logging features to catch the errors that “uniquely configured” blogs generate and help the spammer quickly fix the “problem”.
If we’re going to have a reliable anti-spam solution for BuddyPress, we should probably focus on “Mathematically Strong” methods, not on “Obfuscation” and “Moving Things Around”. That way, we won’t have to constantly change our spam protection methods.
Changing Page Slugs
Many people recommend changing the page slugs on BP installations to reduce spam. While this is certainly easy to do, you of course need to give your users *links* to those page slugs somewhere on your site so they can actually visit the pages. And if users can follow the links, so can a spam bot.
Changing page slugs is kind of like boarding up the front door of your house, installing a new door in the side of your house, and then attaching a piece of string from the front door to the side door so everyone can find the new door.
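To make the “piece of string” concrete, here is a minimal Python sketch of how a crawler finds a renamed slug by reading the links on a page. The sample markup and the slug “join-us-here” are made up for illustration, and this is not taken from any actual spam package.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, the way a simple crawler would."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def discover_links(html):
    """Return every link target found in the given HTML, in page order."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Hypothetical navigation menu of a site whose registration slug was
# renamed to "join-us-here". The slug and markup are assumptions.
SAMPLE_PAGE = """
<ul id="nav">
  <li><a href="/members/">Members</a></li>
  <li><a href="/join-us-here/">Create an account</a></li>
</ul>
"""

found = discover_links(SAMPLE_PAGE)
```

The renamed slug shows up in `found` like any other link. A real bot would then fetch each link and look for a form it recognizes, but the discovery step needs nothing more than this.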
The “change your page slugs” approach seems to come from the “change your admin menu URL” technique. Changing your admin menu URL is actually a *strong* protection technique. Since there is no link to it anywhere on the site and you’re the only one who knows the URL, it’s like having two passwords on your admin login. An attacker would have to try billions of URLs to find it.
Not so with all the other URLs on your site. They have to be linked from other pages so your users can find them.
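The “billions of URLs” figure for an unlinked admin URL is easy to sanity-check. The sketch below assumes the secret slug is drawn at random from lowercase letters and digits; the alphabet and the lengths are my assumptions, not anything WordPress or BuddyPress enforces.

```python
import math
import string

# Assumed alphabet for a random slug: a-z plus 0-9, 36 symbols in total.
ALPHABET = string.ascii_lowercase + string.digits


def slug_search_space(length, alphabet_size=len(ALPHABET)):
    """Number of distinct slugs of exactly `length` characters."""
    return alphabet_size ** length


def equivalent_bits(length, alphabet_size=len(ALPHABET)):
    """Size of the same search space expressed in bits."""
    return length * math.log2(alphabet_size)


six_chars = slug_search_space(6)
eight_chars = slug_search_space(8)
```

Under these assumptions, a random 6-character slug already gives about 2.2 billion possibilities and an 8-character one about 2.8 trillion. None of that helps once the URL is linked from a public page, because then the attacker reads it instead of guessing it.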
Adding Fake Form Fields
Many people recommend adding a few extra fields to forms throughout your site (sign-up, login, post to group, etc.) and “hiding” these fields using CSS. If any of the “trap” fields are filled out, in theory, you’ve just detected a bot, because a normal user would never see the fields and fill them out.
This approach *might* defeat a very simple bot that searches every web page it can find for forms and fills every field in every form with random spam, but it will not defeat a bot that understands CSS or is specifically targeted at BuddyPress, especially considering that BuddyPress is *open source*.
Don’t think bots can analyze CSS? Read this: http://www.google.com/support/webmasters/bin/answer.py?answer=66353
A bot designer can simply read through the BP source code and discover the names of the fields that should be filled in and the names of the fields that should be left empty.
To use our “house” analogy, adding extra form fields is like installing 3 front doors on your house and rigging two of them with grenades …then hanging a big red “out of order” sign on the two rigged doors so your friends don’t use them.
Obviously if your friends can read the signs, so can your enemies.
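Here is a rough Python sketch of the trap-field check and the two kinds of bot described above. All of the field names are hypothetical (they are not the real BuddyPress field names), and the “bots” are toy functions, not code from any real spam tool.

```python
# Hypothetical field names, chosen only for illustration.
REAL_FIELDS = {"signup_username", "signup_email"}
TRAP_FIELDS = {"website_url", "confirm_email_2"}
ALL_FIELDS = REAL_FIELDS | TRAP_FIELDS


def looks_like_bot(submitted):
    """Server-side check: any non-empty trap field flags the submission."""
    return any(submitted.get(name, "").strip() for name in TRAP_FIELDS)


def naive_bot(form_fields):
    """Fills every field it finds in the form with spam."""
    return {name: "buy cheap pills" for name in form_fields}


def targeted_bot(form_fields, known_real_fields):
    """Fills only the fields its author identified by reading the source."""
    return {
        name: ("buy cheap pills" if name in known_real_fields else "")
        for name in form_fields
    }


naive_caught = looks_like_bot(naive_bot(ALL_FIELDS))
targeted_caught = looks_like_bot(targeted_bot(ALL_FIELDS, REAL_FIELDS))
```

The naive bot trips the trap and the targeted one doesn’t. The only thing separating them is a list of field names, which anyone can get from an open source codebase.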
JavaScript Proof of Work
JavaScript proof of work (WP Hashcash) defeats spammers by making visitors’ web browsers solve a math problem in JavaScript before they are allowed to post.
Because everyone knows spam bots can’t run JavaScript.
http://forums.digitalpoint.com/showthread.php?t=1124949
http://www.scrapebox.com/
http://blogcommentdemon.com/
http://www.senuke.com
http://www.botmasternet.com/more1/
Except when they can.
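For anyone who hasn’t seen one, a hashcash-style challenge looks roughly like the sketch below. It is written in Python rather than JavaScript to keep it short, it is *not* WP Hashcash’s actual code, and the challenge string and difficulty are made-up values.

```python
import hashlib
import itertools


def _digest_value(challenge, nonce):
    """SHA-256 of "challenge:nonce", interpreted as a big integer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def solve(challenge, difficulty_bits):
    """Client side: brute-force a nonce whose hash has the required
    number of leading zero bits. Expected work doubles with each bit."""
    target = 1 << (256 - difficulty_bits)
    for nonce in itertools.count():
        if _digest_value(challenge, nonce) < target:
            return nonce


def verify(challenge, nonce, difficulty_bits):
    """Server side: a single hash is enough to check the client's answer."""
    return _digest_value(challenge, nonce) < (1 << (256 - difficulty_bits))


# Hypothetical challenge string and a deliberately low difficulty.
nonce = solve("post-form-123", 12)
accepted = verify("post-form-123", nonce, 12)
```

The point to notice is that `solve` is ordinary code. Anything that can run it, whether a browser, a headless browser, or a bot with its own reimplementation, gets past the check.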
There’s also the issue of what to do with visitors that don’t have JavaScript enabled.
The WordPress and BuddyPress development teams have put an epic amount of work into ensuring both platforms will work reliably when JavaScript isn’t available. Requiring users to have JavaScript to post any kind of content to the site nullifies much of this work.
Proof-of-work was a great idea back in 1997 when spammers ran hundreds of attack threads from a single server and solving the JavaScript math problems slowed it to a crawl.
In 1997, we’d be dealing with a single spammer running 1000 attack threads against the site. Because the spammer was running 1000 threads, each of which would have to solve the JavaScript problem, they would effectively be penalized 1000-fold over a normal user. The end result is that they would only be able to run a few threads before their computer slowed to a crawl, and their spamming abilities would be sharply limited.
Epic win for site.
Unfortunately, things are different in 2010.
Spam bots have become the tool of choice for basement SEO marketers. Instead of a few members of the “spam elite”, we’re dealing with tens of thousands of “do it yourself” spammers each running 1 attack thread using the new “automatic backlink software” they just picked up for $29.00 off some random SEO website. Instead of fighting one spammer splitting their resources across a thousand threads, we’re fighting a thousand spammers running a single thread dedicated *just to our site*.
Skipping a ton of math, what this means is that in order to cause a spammer a 1-second delay while their computer solves our JavaScript challenge, we have to cause each of our *legitimate users* a 1-second delay while *their* computer solves our JavaScript challenge. And, considering the 3 to 5 second database lag I see on 90% of the BP sites I visit, the challenge would need to take much longer than a second to have any merit at all …otherwise page refresh time would be the limiting factor, not the JS challenge.
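Here’s a very rough version of the skipped math, as a toy Python model. It assumes each attacking machine has one CPU core that solves challenges one at a time, and that each thread otherwise just waits on page loads. The 1-second challenge and 4-second page time are picked from the ranges mentioned above; the model itself is my simplification.

```python
def spam_posts_per_hour(machines, threads_per_machine,
                        challenge_seconds, page_seconds):
    """Upper bound on spam posts per hour under the toy model.

    Each machine is limited by whichever is slower: its single core
    working through challenges, or its threads waiting on page loads.
    """
    if challenge_seconds > 0:
        cpu_limit = 3600 / challenge_seconds
    else:
        cpu_limit = float("inf")  # no challenge, so no CPU bottleneck
    io_limit = threads_per_machine * 3600 / page_seconds
    return machines * min(cpu_limit, io_limit)


# 1997-style: one spammer, 1000 threads on one machine.
old_without = spam_posts_per_hour(1, 1000, 0, 4)
old_with = spam_posts_per_hour(1, 1000, 1, 4)

# 2010-style: 1000 spammers, one thread each.
new_without = spam_posts_per_hour(1000, 1, 0, 4)
new_with = spam_posts_per_hour(1000, 1, 1, 4)
```

In this model the challenge cuts the single 1000-thread spammer from 900,000 posts per hour to 3,600, but it makes no difference at all against 1000 single-thread spammers, because each of them is already limited by page load time.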
So what happens when a user visits the site using a computer that is much slower than a typical desktop …say a mobile phone or an old laptop? The challenge would take proportionally longer to complete. A challenge that requires 5 seconds to solve on a desktop PC could take 30 seconds on an iPhone …and 30-second response times would not make for an enjoyable user experience.
Overall, proof-of-work challenges are probably not a good choice in the 2010 Internet landscape.
Mathematically Strong Methods
In the next post, I’ll cover the specific details of the methods I’ve proposed for the BP spam solution, and why they will defeat most spam attacks.
^F^