
Blog-tech

Added CAPTCHA to prevent spam comments

March 14, 2008 01:16:59 +0200 (EET)

I finally caved in to the spammers, and added a CAPTCHA test to the "Add Comment" page. I hate having to inconvenience you to prevent the idiots messing up the commons, but the truth is I don't have time to be cleaning out the spam by hand, so it's either CAPTCHA for the commenters or a mess for all readers. Sorry.

The CAPTCHA system I chose is reCAPTCHA, from Carnegie Mellon:

reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. ...But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.
Currently, we are helping to digitize books from the Internet Archive. In order to achieve our goal of digitizing books, we need your help. If you run a website that suffers from problems with spam, you can put reCAPTCHA on your site.
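To make the two-word trick concrete, here is a toy sketch in Python. It is only my reading of the description above, not reCAPTCHA's actual code: the class name, the method names and the `votes_needed` threshold are all made up for illustration (the description only says "a number of other people").

```python
from collections import Counter, defaultdict


class PairedWordChecker:
    """Toy model of the known-word / unknown-word scheme (hypothetical names)."""

    def __init__(self, votes_needed=3):
        # votes_needed is an invented threshold, not a documented value.
        self.votes_needed = votes_needed
        # unknown image id -> tally of readings from users who passed the known word
        self.votes = defaultdict(Counter)

    @staticmethod
    def _normalise(word):
        return word.strip().lower()

    def submit(self, known_answer, known_guess, unknown_id, unknown_guess):
        """Return True if the user passes, i.e. read the known word correctly."""
        if self._normalise(known_guess) != self._normalise(known_answer):
            # Failed the control word, so their reading of the new word is ignored.
            return False
        self.votes[unknown_id][self._normalise(unknown_guess)] += 1
        return True

    def accepted_reading(self, unknown_id):
        """Return the agreed reading once enough passing users match, else None."""
        tally = self.votes.get(unknown_id)
        if not tally:
            return None
        word, count = tally.most_common(1)[0]
        return word if count >= self.votes_needed else None
```

The point of the design is that a user is only ever judged on the word whose answer is already known; the unknown word just collects votes until enough people agree.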

If you can't read the CAPTCHA image, just click the link to get another one -- neither clicking the link nor entering the wrong words will lose your comment (assuming the web works; I always write my comments elsewhere and paste them in, just in case!).

Since there wasn't a Smalltalk plug-in for the reCAPTCHA API, I made my own. It only took about 30 minutes for the client and server sides combined, and most of that was rejigging some bits to avoid adding an extra dependency on an HTTP helper client. Predictably, the result worked -- almost. This is the web, after all. For some reason, the field for entering the words disappeared if the cursor strayed into the toolbar of the TinyMCE JavaScript rich text editor. Add 6 hours of testing and hacking with newer versions of the JavaScript editor, IE, different <div> and CSS layouts, etc. In the end I dumped the pretty reCAPTCHA frame and went with the longer-winded custom layout. Simple, boring, works perfectly.
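For the curious, the server side of a reCAPTCHA check boils down to one form POST and a two-line reply. Here is a rough sketch in Python (not the Smalltalk code from the package). The verify URL, the four field names and the "true"/"false plus error code" reply format are as I remember them from the current reCAPTCHA docs, so treat them as assumptions and check the docs before relying on them; the function names are my own. The sketch deliberately stops short of making the HTTP call.

```python
from urllib.parse import urlencode

# Assumption: verify endpoint as given in the reCAPTCHA docs at the time of writing.
VERIFY_URL = "http://api-verify.recaptcha.net/verify"


def build_verify_body(private_key, remote_ip, challenge, response):
    """Build the form-encoded POST body for the verify request.

    Field names are assumed from the reCAPTCHA docs; challenge and response
    are the two values the browser sends back with the comment form.
    """
    fields = {
        "privatekey": private_key,
        "remoteip": remote_ip,
        "challenge": challenge,
        "response": response,
    }
    return urlencode(fields).encode("utf-8")


def parse_verify_reply(text):
    """Return (ok, error_code) from the plain-text reply.

    Assumed format: first line is "true" or "false"; on failure the
    second line carries an error code.
    """
    lines = text.splitlines()
    ok = bool(lines) and lines[0].strip() == "true"
    error = lines[1].strip() if (not ok and len(lines) > 1) else None
    return ok, error
```

POST the body to `VERIFY_URL` with whatever HTTP client you already have, and only accept the comment if `parse_verify_reply` says `ok`. On failure, redisplay the form with the comment text intact.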

If anybody wants the Smalltalk code for the client and server sides, take a look at Blog-ServletsExtensions from the Cincom public repository. This adds reCAPTCHA to the Silt blog server (Silt-Core 1.139), but it should be easy enough to extract the code for use elsewhere: see the package comment for instructions.

Comments

And if it doesn't work...

[Steven Kelly] March 14, 2008 01:21:25 +0200 (EET)

...or if you have problems with it, just leave me a comm... errr, send me an email!