How easily can Russian propaganda fool AI models? A new benchmark finds out

Originally reported by The Decoder

"AI models struggle to resist propaganda, raising concerns about misinformation."

Estonia's Institute of the Estonian Language recently tested 60 AI language models for their susceptibility to Russian propaganda and found that many struggled to distinguish fact from fiction. The benchmark covered 14 propaganda narratives in three languages and showed that some models repeat Russian talking points more readily than others. Anthropic's Claude models performed well, while Mistral's models landed in the bottom third.

The benchmark used a calibrated Claude Opus 4.5 as the evaluation model, validated by disinformation experts at Propastop. Each answer was scored on a scale of 1 to 5, where 1 meant the model repeated Russian talking points. Nvidia's Nemotron 3 and Alibaba's Qwen 3.6 Plus also performed well, while Mistral's models, including the newest Medium 3.5, lagged behind. That result is particularly concerning given the company's position as a European alternative to US and Chinese providers.
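To make the scoring concrete, here is a minimal sketch of how per-answer scores on the 1-to-5 scale could be rolled up into a model ranking. It is not the Institute's actual pipeline: the model names, language labels, sample scores, and the choice of a plain average are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical judge output: (model, narrative_id, language, score).
# Every name and number here is invented; the report's real data is not shown.
# Score 1 = the answer repeated Russian talking points (per the article);
# treating 5 as the best score is the natural reading of that scale.
SAMPLE_SCORES = [
    ("model-a", 1, "lang-1", 5),
    ("model-a", 1, "lang-2", 4),
    ("model-a", 2, "lang-3", 5),
    ("model-b", 1, "lang-1", 2),
    ("model-b", 1, "lang-2", 1),
    ("model-b", 2, "lang-3", 3),
]


def rank_models(scores):
    """Average each model's scores and sort from most to least resistant."""
    per_model = defaultdict(list)
    for model, _narrative, _language, score in scores:
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range for {model}: {score}")
        per_model[model].append(score)
    averages = {model: mean(values) for model, values in per_model.items()}
    # Higher average = fewer repeated talking points, so sort descending.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


def repeat_rate(scores, model):
    """Share of a model's answers that received the lowest score of 1."""
    own = [score for m, _n, _l, score in scores if m == model]
    return sum(1 for score in own if score == 1) / len(own)


if __name__ == "__main__":
    for model, average in rank_models(SAMPLE_SCORES):
        print(f"{model}: mean score {average:.2f}, "
              f"share scored 1: {repeat_rate(SAMPLE_SCORES, model):.0%}")
```

A plain mean is only one possible aggregate. A benchmark could just as well weight narratives or languages differently, or report the share of answers scored 1 separately, which is why the sketch exposes both.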

The threat of Russian propaganda is real and is not limited to traditional media outlets. Russian networks like "Pravda" deliberately feed AI systems millions of disinformation articles, whose claims can then spread quickly and efficiently through the models' answers. OpenAI recently shut down a Russian campaign that used ChatGPT to spread propaganda ahead of Germany's federal election, which underscores why AI models need to be able to resist propaganda. The Estonian Institute's benchmark offers an assessment of how well current AI language models do so.

The implications are significant, particularly for companies like Mistral that position themselves as alternatives to US and Chinese providers. Mistral is currently negotiating a 3 billion euro funding round at a 20 billion euro valuation, and its poor showing in the benchmark may raise concerns among investors. The results also highlight the need for greater transparency and accountability in the development of AI language models, particularly regarding their ability to resist propaganda.

The Estonian Institute's benchmark is a step towards understanding the vulnerabilities of AI language models and developing strategies to mitigate them. By testing how well models resist propaganda, it shows where different models are strong and where they are weak. The results will be useful for developers, policymakers, and anyone concerned about the spread of misinformation. As AI language models become increasingly ubiquitous, it is essential that they can distinguish fact from fiction and resist propaganda.

The benchmark also raises questions about the role of AI in the spread of misinformation. As AI language models become more advanced, they could amplify propaganda and disinformation, making it harder to distinguish fact from fiction. The benchmark makes the case for designing AI models with propaganda resistance in mind.

In conclusion, the Estonian Institute's benchmark provides a timely assessment of how vulnerable AI language models are to propaganda, and its results make the case for greater transparency and accountability in how these models are developed.