New Relief for AI Bot Sufferers: Cloudflare’s New Tool Lets Sites Charge For Data Scraping

San Francisco-based cloud services company Cloudflare launched a new set of AI tools Monday that aims to give websites the ability to stop unauthorized scraping by AI crawlers—or to charge them for access to their data.

“What we've previewed today is the ability for site owners and internet publications to say, ‘this is the value I expect to receive from my site,’” Sam Rhea, a Cloudflare vice president, told Decrypt. “If you're an AI LLM and you want to scan this content or train against it, or make it part of your search result, this is the value I expect to receive for that.”

Today, Cloudflare is releasing a set of tools to make it easy for site owners, creators, and publishers to take back control over how their content is made available to AI-related bots and crawlers. https://t.co/R239wtO3iB #BirthdayWeek

— Cloudflare (@Cloudflare) September 23, 2024

The free Cloudflare Bot Management platform allows websites not only to block AI bots but also to charge a fee to as many bots as they approve, thereby earning revenue from the platforms feasting for free on their content.

The AI audit tool also gives users the ability to see how their content is being accessed.

As Rhea explained, unlike malicious bots that try to crash websites or cut in line ahead of human customers attempting to access a website, AI crawlers don’t aim to harm or steal but scan public content to train large language models.

Sometimes those bots attribute the information back to the source, plausibly sending valuable traffic, Rhea said. “But other times, they take material, put it in a blender, and share it as if it were just part of a generic source, without any citation. That seems dangerous to me.”

As far as Cloudflare, which provides security and performance optimization for websites, can tell, no single platform dominates website scraping activity, Rhea said, adding that it varies by the type of content being scraped at any given time.

Generative AI models require large amounts of data to function and attempt to provide fast and accurate answers, as well as create images, videos, and music. AI scrapers are a growing industry and include companies like LAION, Defined.AI, Aleph Alpha, and Replicate that provide AI developers with pre-collected text, voice, and image datasets. According to market research firm Research Nester, the web scraping software industry is estimated to reach $2.45 billion by 2036.

Last year, Ed Newton-Rex, the former head of audio at Stability AI, resigned over how AI platforms claimed that ingesting website data was “fair use.”

“‘Fair use’ wasn’t designed with generative AI in mind — training generative AI models in this way is, to me, wrong,” he said. “Companies worth billions of dollars are, without permission, training generative AI models on creators’ works, which are then being used to create new content that in many cases can compete with the original works.”

Newton-Rex added: “I don’t see how this can be acceptable in a society that has set up the economics of the creative arts such that creators rely on copyright.”

I’ve resigned from my role leading the Audio team at Stability AI, because I don’t agree with the company’s opinion that training generative AI models on copyrighted works is ‘fair use’.

First off, I want to say that there are lots of people at Stability who are deeply…

— Ed Newton-Rex (@ednewtonrex) November 15, 2023

Rhea said smaller AI developers seemed willing to pay to receive selected website content.

“From the conversations we've had with foundational model providers and new entrants in the space, is that the kind of ocean of high-quality data is becoming difficult to find,” he said, noting that scientific and mathematical content was especially in demand.

Edited by Josh Quittner and Sebastian Sinclair

Source: decrypt.co
