This is awesome, and soon to be a part of our blog sites!

They are using scanned words from old books that OCR (optical character recognition) software was unable to decipher correctly. By feeding these words to humans, Carnegie Mellon is helping OCR software get better. :-)

Hmmm. I just read through the reCAPTCHA terms of service, and I wonder whether a lawyer actually wrote this:

Ownership of Data and Information; Carnegie Mellon's Use of Personally-Identifiable Information. All data and information generated from the access and use of the Website and Service, including any image solved (whether or not correct) shall be the property of Carnegie Mellon, and no third party, including you, shall have the right to own such data and information, or use such data or information except as expressly authorized by these Terms and Conditions. By using the Website and Service, you automatically assign to Carnegie Mellon any rights in the data or information generated from the access and use of the Website and Service, including any image solved (whether or not correct) of yours and third party users of your website providing interpretations of images (and you agree to make sure that the third party users of your website assign these rights in the data and information to Carnegie Mellon).

That’s one of the more ridiculous pieces of fine print I’ve read in a while. How in the world could any site owner sanely agree to “make sure that third party users of [their] website assign [those] rights in the data and information to Carnegie Mellon”? That is practically impossible.

Sorry, Carnegie Mellon, not everybody wins, especially with wacky terms like the ones you are including in your service.

I believe that the mod_defensible approach described here is a better solution than either Spam Karma 2 or reCAPTCHA. Of course, I’ll have to try it out!
