MiniMax Teases M3 Model With Dramatic Speed Leap

MiniMax, the Shanghai-based AI lab backed by Tencent, Alibaba, and miHoYo, has released a technical report on its M2 model series. The report also teases the next-generation M3 model, which the company claims decodes 15.6x faster and prefills 9.7x faster than M2 on 1-million-token contexts.

The performance boost comes from a technique called MiniMax Sparse Attention, or MSA. Built on GQA-driven dynamic block selection, MSA attends only to the blocks of the context window it judges relevant rather than to every token. MiniMax says this sharply reduces compute requirements while keeping output quality comparable to M2.
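The article gives no implementation details for MSA, so the following is only a generic sketch of how block selection shared across a grouped-query attention (GQA) group can work in a single decode step. It is not MiniMax's method. The function name `block_sparse_decode`, the mean-key block summary, the summed group score, and the `block_size` and `top_k` parameters are all illustrative assumptions.

```python
import numpy as np

def block_sparse_decode(q, K, V, block_size=4, top_k=2):
    """Illustrative block-sparse attention for one decode step.

    q: (G, d) query heads that share a single KV head (one GQA group).
    K, V: (T, d) cached keys and values, with T divisible by block_size.
    Returns the (G, d) attention output and the selected block indices.
    Assumption: this is a generic sketch, not the MSA algorithm.
    """
    T, d = K.shape
    assert T % block_size == 0, "sketch assumes T is a multiple of block_size"
    n_blocks = T // block_size

    # Summarize each block of keys by its mean (an assumed, simple summary).
    block_keys = K.reshape(n_blocks, block_size, d).mean(axis=1)  # (n_blocks, d)

    # Score blocks per query head, then pool across the group so that
    # every head in the group reads the same blocks of the KV cache.
    block_scores = (q @ block_keys.T).sum(axis=0)  # (n_blocks,)

    # Keep the top_k highest-scoring blocks, in positional order.
    k = min(top_k, n_blocks)
    selected = np.sort(np.argsort(block_scores)[-k:])

    # Expand block indices to token indices and gather only those tokens.
    token_idx = (selected[:, None] * block_size + np.arange(block_size)).ravel()
    K_sel, V_sel = K[token_idx], V[token_idx]

    # Ordinary softmax attention, restricted to the selected tokens.
    logits = q @ K_sel.T / np.sqrt(d)  # (G, k * block_size)
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ V_sel, selected

# Usage: 2 query heads, 16 cached tokens, attend to 2 of 4 blocks.
rng = np.random.default_rng(0)
q = rng.standard_normal((2, 8))
K = rng.standard_normal((16, 8))
V = rng.standard_normal((16, 8))
out, selected = block_sparse_decode(q, K, V, block_size=4, top_k=2)
```

In this sketch the per-step attention cost scales with `top_k * block_size` rather than with the full context length, which is the general mechanism by which sparse attention cuts decoding cost on long contexts. Setting `top_k` to the total number of blocks recovers ordinary dense attention.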

Founded in early 2022, MiniMax listed on the Hong Kong Stock Exchange in January 2026. Its investors include Tencent, Alibaba, and miHoYo, the studio behind Genshin Impact. Beyond text and code, MiniMax operates the Hailuo video-generation platform, and Hailuo 2.3 has processed billions of results.

No parameter count, licensing details, or release timeline for M3 has been confirmed. The key question for decentralized AI investors is whether the MSA architecture will be open-sourced alongside the model weights.