Artificial intelligence firm Anthropic claims three prominent Chinese AI companies are illegally harvesting data from its Claude chatbot. The alleged data extraction aims to accelerate the development of their own AI platforms.
Anthropic identified DeepSeek Ltd., Moonshot, and MiniMax as the companies involved. The firm asserts these companies created thousands of fraudulent Claude accounts, generating millions of conversations. This data, Anthropic states, is then used to train the companies' own chatbots. Specifically, DeepSeek is accused of generating 150,000 interactions, Moonshot more than 3.4 million, and MiniMax an estimated 13 million.
Using data from one AI to train another, a technique known as "distillation," is common, but Anthropic's terms of service prohibit it. The company also seeks to prevent its chatbot from being used in China.
These accusations follow similar claims from rival OpenAI regarding Chinese firms harvesting data from ChatGPT. OpenAI has reported these activities to the U.S. House Select Committee on China, citing "new and obfuscated" distillation techniques intended to "free-ride" on U.S. technology.
Anthropic warns that this data harvesting poses a significant national security risk. The company suggests the illicitly obtained data could be used to develop advanced military weapons or mass surveillance tools. Anthropic also notes that existing safeguards can be compromised during the distillation process.
The company is urging U.S. government action, stating the threat is escalating and requires immediate, coordinated efforts from industry, policymakers, and the global AI community.
Adding complexity to the situation, Anthropic has reportedly faced challenges with the U.S. Department of Defense over the company's policies on autonomous weapons and surveillance tools, even though the Pentagon uses a specialized version of Claude. The company's recent settlement of a significant copyright infringement lawsuit over training data has also drawn criticism, with some accusing Anthropic of hypocrisy given its own data sourcing practices.