Chinese AI models produce less secure code when the task is framed as being for the US government, with a 130% increase in vulnerabilities, according to a Booz Allen Hamilton report released June 5. The report analyzed more than 2,800 trials across four large language models and identified significant security issues, especially with Alibaba's Qwen3-Coder.

Kimi K2.5 performed best among the tested models, but the overall trend still raised concerns. By contrast, Claude Opus 4.6, developed by US-based Anthropic, produced more secure code under similar prompts. Booz Allen recommends keeping untrusted AI models out of sensitive environments and strengthening code audits.

The crypto sector should heed these findings, because many protocols rely on AI-generated code and vulnerabilities in that code can create significant security risks. With models such as DeepSeek increasingly used in development worldwide, teams need to examine how the context of a task affects the security of the code a model generates.

The findings arrive amid heightened US-China tech competition. They suggest that AI models themselves can be a supply chain risk, and they carry regulatory implications for US firms such as Anthropic and OpenAI.