February 08, 2010

What is CAPTCHA

Spam has become a common enemy across the Internet, from email, IRC, and instant messaging to forums, wikis, and blogs. To support this illegal activity, spammers build programs called "spam bots" that run 24 hours a day, sending junk messages to every kind of Internet service. One effort to limit the spread of spam on the web produced the CAPTCHA, a method for telling humans apart from spam bots.

Near the end of the registration form for a new email account, just above the submit button, there is now often a simple test: an image showing a series of letters, numbers, or a combination of both. For humans the test is usually very easy, but not for a computer (a spam bot).

A test that requires users to type what they see in an image is called a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart). It is a type of HIP (Human Interactive Proof) and the most widely used one on the web today.

Most CAPTCHAs use an image containing a series of letters and numbers that are printed irregularly (distorted) so that spam bots cannot recognize them, even when equipped with OCR (Optical Character Recognition) technology. If the text you type matches the characters shown in the image, it means you're human!

CAPTCHA and Turing Test

Although it runs in the opposite direction, the basic idea of the CAPTCHA is in line with the Turing Test. In 1950 Alan Turing, a British computer scientist, proposed a test of whether a computer can think like a human. In the Turing Test, an isolated examiner puts questions to two participants: a human and a computer.

From the two participants' answers, the examiner must determine which were given by the human and which were generated by the computer. If the examiner cannot tell with certainty which is the human and which is the computer, the computer is considered to have passed the Turing Test.

From this description we can see that in the Turing Test the examiner is human, and the subjects being tested are a human and a computer. The computer passes when the examiner cannot tell the two apart.

In a CAPTCHA, the opposite happens. The examiner is a computer, while the subjects being tested are still a human and a computer. If the examiner can tell with certainty which is the human and which is the computer, the human is considered to have passed.

Pareidolia and CAPTCHA

Most CAPTCHAs, though not all of them, use visual tests. Computers have difficulty processing visual data, while humans do not. Humans can look at a picture and find a pattern in it more quickly and easily than a computer can.

It is not unusual for the human mind to pick out a pattern or shape that is not really there. For example, we often see faces or animal shapes in clouds or on the surface of the moon.

The human brain tends to turn random information that catches the eye into a pattern or shape. In psychology, this phenomenon is known as "pareidolia". However, not all CAPTCHAs rely on visual tests. Some types of CAPTCHA use sound or audio instead.

Designing a CAPTCHA

The first step in designing a CAPTCHA is to look at how humans and computers differ in processing information. A computer follows a set of instructions. It cannot do anything that falls outside those instructions.

A CAPTCHA designer should keep this in mind when creating a test. For example, it is easy to write a program that reads metadata (information on a web page that is readable by computers but not normally shown to humans).

Thus, there is no point in making a visual CAPTCHA if the solution is included in the image's metadata. It is just as unwise to make a CAPTCHA without any distortion of the letters and numbers, because some spam bots are equipped with OCR that can read and recognize letters and numbers in an image.
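To see why a solution leaked into metadata defeats the test, here is a minimal Python sketch of what a bot could do. It assumes a hypothetical, badly designed form that puts the answer in the image's alt attribute. The form markup, the file name captcha.png, and the names AltTextReader and read_leaked_solution are made up for illustration.

```python
from html.parser import HTMLParser


class AltTextReader(HTMLParser):
    """Collects the alt attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.alt_texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "alt" and value:
                    self.alt_texts.append(value)


def read_leaked_solution(html):
    """Return the first image alt text found in the page, or None."""
    reader = AltTextReader()
    reader.feed(html)
    return reader.alt_texts[0] if reader.alt_texts else None


# Hypothetical leaky form: the answer sits in plain sight for any program.
LEAKY_FORM = '<form><img src="captcha.png" alt="x7Kq2"><input name="answer"></form>'
leaked = read_leaked_solution(LEAKY_FORM)
```

No image processing is needed here: the bot never looks at the picture at all, which is exactly the problem.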

One way to build a CAPTCHA is to store each image and its solution in a database. According to researchers from Microsoft Research, humans can solve CAPTCHAs about 80% of the time, while a computer is only able to solve about 1 test in 100 (1%).
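Keeping the solution on the server side could look roughly like the Python sketch below. It is only an illustration under stated assumptions: a plain dictionary stands in for the database, and the names new_challenge and check_answer, the random token, the six-character length, and the single-use rule are choices made for this example, not part of any particular CAPTCHA product.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits

# In-memory stand-in for the database table (token -> solution).
_challenges = {}


def new_challenge(length=6):
    """Create a challenge and return (token, solution).

    Only the token and the rendered image should reach the browser.
    The solution is returned so the server can draw the image.
    """
    solution = "".join(secrets.choice(ALPHABET) for _ in range(length))
    token = secrets.token_hex(16)
    _challenges[token] = solution
    return token, solution


def check_answer(token, answer):
    """Return True if the answer matches. Each token can be used once."""
    solution = _challenges.pop(token, None)
    if solution is None:
        return False
    typed = answer.strip().upper()
    # Compare as bytes in constant time.
    return secrets.compare_digest(solution.encode(), typed.encode())
```

Removing the entry on the first check means a bot cannot keep guessing against the same image.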

Distortion of the letters and numbers is used to confuse spam bots that use OCR. Some CAPTCHAs give the letters and numbers an unusual shape, as if the writing were seen through melted glass. Some break up the letters and numbers, as if they were written on a lattice.

Others use different colors, or add other elements such as dots or irregular patches. It is all done with one goal: to make spam bots unable to read the series of letters and numbers shown in the image.
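As a toy illustration of these ideas, the Python sketch below distorts a character drawn as a grid of text cells ('#' for ink, '.' for background). It shifts each row sideways along a wave and flips random cells to add specks. The grid format, the function name distort, and the parameters max_shift and noise are assumptions made for this example. A real CAPTCHA would apply similar transformations to an actual image.

```python
import math
import random


def distort(bitmap, rng, max_shift=2, noise=0.05):
    """Return a wavy, speckled copy of a text bitmap.

    bitmap is a list of equal-length strings using '#' and '.'.
    Each output row is 2 * max_shift cells wider than the input.
    """
    phase = rng.uniform(0, 2 * math.pi)
    result = []
    for y, row in enumerate(bitmap):
        # Wave: each row slides left or right by up to max_shift cells.
        shift = round(max_shift * math.sin(phase + 0.9 * y))
        cells = list("." * (max_shift + shift) + row + "." * (max_shift - shift))
        # Specks: flip a small random fraction of the cells.
        for x in range(len(cells)):
            if rng.random() < noise:
                cells[x] = "#" if cells[x] == "." else "."
        result.append("".join(cells))
    return result


# A hand-drawn letter "T" as the hypothetical input glyph.
GLYPH_T = [
    "#####",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
]

distorted = distort(GLYPH_T, random.Random(2010))
```

Printing the result with print("\n".join(distorted)) shows the letter leaning and speckled. Raising max_shift or noise makes it harder to read for programs and people alike, so a designer has to balance the two.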






