New benchmarks from Estonia's Language Institute show that open-weight large language models (LLMs) like Nvidia's Nemotron and Alibaba's Qwen are among the best at resisting Russian propaganda, matching or exceeding top-tier systems from Anthropic.
OpenAI's GPT-5.4 also performed well, providing 'exemplary' responses 54% of the time and achieving a mean score of 88.9 on the benchmark.
Unsurprisingly, newer frontier models show much stronger resistance than those from just a few years ago. Claude 3.5 Haiku, the top-rated 2024 release, would fall into the bottom third of 2026 models on this metric.
However, progress has been uneven. Google's most propaganda-resistant model, Gemini 2.5 Pro, is nearly a year old and scored only 82 on the benchmark, showing particular susceptibility to maliciously worded prompts. The company's newer Gemini 3.5 Flash scored just 73, comparable to Anthropic models from 2024.
The study also found that many models showed significantly less resistance when prompted in Russian, including Google's Gemini 3.5 Flash and open-weight models like Moonshot's Kimi K2 and StepFun's Step 3.5 Flash.
The findings come as Russia actively seeks to influence AI models through technical alliances with other BRICS countries, framing sociopolitical positions that reflect its own viewpoints as 'culturally sensitive'.