Microsoft's new AI image model ranks second in editing, third in text-to-image

Microsoft announced its MAI-Image-2.5 image generation model on June 2, and it immediately claimed second place in image editing and third in text-to-image on the Artificial Analysis Image Arena, a blind human preference benchmark. The model improved text rendering by 107 points and cartoon, anime, and fantasy imagery by 90 points over the prior version.

Two configurations are available: the standard high-fidelity MAI-Image-2.5 at $47 per million output tokens, and the faster MAI-Image-2.5-Flash at $19.50 per million tokens. On the Arena, MAI-Image-2.5 outperforms all Google Gemini image models but still trails OpenAI’s GPT Image 2 variants.
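To put the per-token prices in perspective, here is a rough sketch of the cost arithmetic in Python. The two prices come from the figures quoted above. The workload size is purely hypothetical, and the sketch assumes both configurations bill tokens the same way, which the announcement does not spell out. The function and variable names are illustrative, not part of any Microsoft API.

```python
# Prices quoted in the article, in US dollars per million tokens.
PRICE_PER_MILLION_USD = {
    "MAI-Image-2.5": 47.00,
    "MAI-Image-2.5-Flash": 19.50,
}


def estimate_cost_usd(config: str, tokens: int) -> float:
    """Estimate the cost in USD of generating `tokens` tokens with `config`."""
    return PRICE_PER_MILLION_USD[config] * tokens / 1_000_000


# Hypothetical workload size (an assumption, not a figure from the article).
hypothetical_tokens = 2_000_000

for name in PRICE_PER_MILLION_USD:
    cost = estimate_cost_usd(name, hypothetical_tokens)
    print(f"{name}: ${cost:.2f}")
```

At these list prices, the Flash configuration costs a little over 40 percent of the standard one for the same token count. How many tokens a single image consumes is not stated, so a per-image cost cannot be derived from these figures alone.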

Developers can access the model through Microsoft Foundry and OpenRouter. MAI-Image-2.5 also powers image generation inside PowerPoint and enables precise editing in OneDrive, with pricing designed for enterprise-scale workloads.