MiniMax teases upcoming M3 model with new sparse attention mechanism and 15.6X long-context response speed boost

venturebeat.com·May 27, 2026

MiniMax, a leading Chinese AI company, has released a detailed technical report on its M2 series of language models. The report documents the engineering behind M2 and introduces a new sparse attention approach for the upcoming M3 models, which promises significantly faster decoding at ultra-long contexts. The goal is to speed up inference while preserving strong reasoning capability, a combination that would strengthen MiniMax's position in a competitive AI market.

The upcoming M3 series introduces "MiniMax Sparse Attention" (MSA), an approach that promises to make LLM responses 15.6 times faster during the decoding phase at long contexts, such as a million tokens. If that speedup holds, it could make deploying AI agents at ultra-long contexts economically viable, which matters to enterprises that want efficient deployment without compromising reasoning capability. AI developers and enterprises may find the M2 report and the upcoming M3 series a useful source of insight into optimizing model performance and deployment.

More from AI & Machine Learning News

Recent stories curated alongside this one.

Nvidia chases $200B CPU market with AI agent PCs from Microsoft, Dell, and HP

MiniMax-M3 debuts, eclipsing GPT-5.5 and Gemini 3.1 Pro on key benchmark performance for just 5-10% of the cost

Anthropic’s browser agent got hijacked 31.5% of the time before safeguards engaged

AI is blowing up music. How should the Grammys handle it?

Claude Mythos exposed a hard truth: Your enterprise patching process is way too slow