pwshub.com

AI bots now beat 100% of those traffic-image CAPTCHAs

Are you a robot? —

I, for one, welcome our traffic light-identifying overlords.

Examples of the kind of CAPTCHAs that image-recognition bots can now get past 100 percent of the time.

Enlarge / Examples of the kind of CAPTCHAs that image-recognition bots can now get past 100 percent of the time.

Anyone who has been surfing the web for a while is probably used to clicking through a CAPTCHA grid of street images, identifying everyday objects to prove that they're a human and not an automated bot. Now, though, new research claims that locally run bots using specially trained image-recognition models can match human-level performance in this style of CAPTCHA, achieving a 100 percent success rate despite being decidedly not human.

ETH Zurich PhD student Andreas Plesner and his colleagues' new research, available as a pre-print paper, focuses on Google's ReCAPTCHA v2, which challenges users to identify which street images in a grid contain items like bicycles, crosswalks, mountains, stairs, or traffic lights. Google began phasing that system out years ago in favor of an "invisible" reCAPTCHA v3 that analyzes user interactions rather than offering an explicit challenge.

Despite this, the older reCAPTCHA v2 is still used by millions of websites. And even sites that use the updated reCAPTCHA v3 will sometimes use reCAPTCHA v2 as a fallback when the updated system gives a user a low "human" confidence rating.

Saying YOLO to CAPTCHAs

To craft a bot that could beat reCAPTCHA v2, the researchers used a fine-tuned version of the open source YOLO ("You Only Look Once") object-recognition model, which long-time readers may remember has also been used in video game cheat bots. The researchers say the YOLO model is "well known for its ability to detect objects in real-time" and "can be used on devices with limited computational power, allowing for large-scale attacks by malicious users."

After training the model on 14,000 labeled traffic images, the researchers had a system that could identify the probability that any provided CAPTCHA grid image belonged to one of reCAPTCHA v2's 13 candidate categories. The researchers also used a separate, pre-trained YOLO model for what they dubbed "type 2" challenges, where a CAPTCHA asks users to identify which portions of a single segmented image contain a certain type of object (this segmentation model only worked on nine of 13 object categories and simply asked for a new image when presented with the other four categories).

Enlarge / The YOLO model showed varying levels of confidence depending on the type of object being identified.

Beyond the image-recognition model, the researchers also had to take other steps to fool reCAPTCHA's system. A VPN was used to avoid detection of repeated attempts from the same IP address, for instance, while a special mouse movement model was created to approximate human activity. Fake browser and cookie information from real web browsing sessions was also used to make the automated agent appear more human.

Depending on the type of object being identified, the YOLO model was able to accurately identify individual CAPTCHA images anywhere from 69 percent of the time (for motorcycles) to 100 percent of the time (for fire hydrants). That performance—combined with the other precautions—was strong enough to slip through the CAPTCHA net every time, sometimes after multiple individual challenges presented by the system. In fact, the bot was able to solve the average CAPTCHA in slightly fewer challenges than a human in similar trials (though the improvement over humans was not statistically significant).

The battle continues

While there have been previous academic studies attempting to use image-recognition models to solve reCAPTCHAs, they were only able to succeed between 68 to 71 percent of the time. The rise to a 100 percent success rate "shows that we are now officially in the age beyond captchas," according to the new paper's authors.

But this is not an entirely new problem in the world of CAPTCHAs. As far back as 2008, researchers were showing how bots could be trained to break through audio CAPTCHAs intended for visually impaired users. And by 2017, neural networks were being used to beat text-based CAPTCHAs that asked users to type in letters seen in garbled fonts.

Older text-identification CAPTCHAs have long been solvable by AI models.

Older text-identification CAPTCHAs have long been solvable by AI models.

Stack Exchange

Now that locally run AIs can easily best image-based CAPTCHAs, too, the battle of human identification will continue to shift toward more subtle methods of device fingerprinting. "We have a very large focus on helping our customers protect their users without showing visual challenges, which is why we launched reCAPTCHA v3 in 2018," a Google Cloud spokesperson told New Scientist. "Today, the majority of reCAPTCHA's protections across 7 [million] sites globally are now completely invisible. We are continuously enhancing reCAPTCHA."

Still, as artificial intelligence systems become better and better at mimicking more and more tasks that were previously considered exclusively human, it may continue to get harder and harder to ensure that the user on the other end of that web browser is actually a person.

"In some sense, a good captcha marks the exact boundary between the most intelligent machine and the least intelligent human," the paper's authors write. "As machine learning models close in on human capabilities, finding good captchas has become more difficult."

Source: arstechnica.com

Related stories
4 hours ago - Researchers from ETH Zurich have devised a machine learning program that can solve Google reCAPTCHA v2 image recognition challenges with perfect accuracy. Although these often-maligned tests are becoming obsolete, they still play an...
4 days ago - When robots.txt just ain't cutting the mustard Cloudflare on Monday expanded its defense against the dark arts of AI web scrapers by providing customers with a bit more visibility into, and control over, unwelcome content raids.…
5 days ago - Cloudflare may charge an app store-like fee for its AI-scraping data marketplace.
1 month ago - Victor Miller has skirted attempts by OpenAI and the state to shut down his campaign, believed to be the first U.S. effort of its kind.
1 month ago - AI chatbots are taking the world by storm. We analyzed thousands of conversations to see what people are really asking them and what topics are most discussed.
Other stories
21 minutes ago - With Google, Microsoft, Apple, Anthropic, Perplexity and OpenAI creating new AI models and systems, it's good to stay up to date on all the latest terminology.
22 minutes ago - This indie horror gem and its bewildering sequel are my favorite horror one-two punch.
22 minutes ago - A breakdown of the full process we use to test web hosting services for our reviews and best-of lists.
22 minutes ago - This Dyson bladeless fan is now just $300 at QVC, just in time for cooling temperatures.
1 hour ago - You might be feeling anxious about the upcoming presidential election. These expert tips can help reduce your anxiety.