
Alibaba's Metis agent cuts redundant AI tool calls from 98% to 2% — and gets more accurate doing it

venturebeat.com·Apr 30, 2026

Researchers at Alibaba have developed a new reinforcement learning framework called Hierarchical Decoupled Policy Optimization (HDPO), which trains AI agents to balance internal knowledge against external tools, sharply reducing unnecessary tool invocations while improving reasoning accuracy. Their multimodal model, Metis, demonstrates this capability, achieving state-of-the-art performance while minimizing redundant tool usage. The work marks a shift toward cultivating metacognitive ability in AI tool use: knowing when a tool call is actually needed.

The key takeaway is how HDPO achieves this: it decouples task accuracy and execution efficiency into separate optimization channels, yielding cleaner learning signals for each objective. In Alibaba's experiments, this cut redundant tool invocations from 98% to just 2% in the Metis model. For real-world deployments, that translates into fewer latency bottlenecks and lower API costs without sacrificing reasoning accuracy.
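The article does not publish HDPO's equations, but the idea of decoupled optimization channels can be illustrated with a minimal sketch: instead of folding accuracy and tool-efficiency rewards into one scalar, each channel is normalized separately (in the style of group-relative baselines) so a noisy efficiency signal cannot drown out the accuracy signal. The function name and reward values below are illustrative assumptions, not Alibaba's actual implementation.

```python
import statistics

def decoupled_advantages(accuracy_rewards, efficiency_rewards):
    """Hypothetical sketch of decoupled reward channels.

    Each channel is standardized independently across a group of
    sampled rollouts, producing two advantage streams that can be
    weighted and combined at the policy-gradient step rather than
    mixed into a single scalar reward up front.
    """
    def normalize(rewards):
        mean = statistics.mean(rewards)
        std = statistics.pstdev(rewards) or 1.0  # avoid divide-by-zero
        return [(r - mean) / std for r in rewards]

    return normalize(accuracy_rewards), normalize(efficiency_rewards)

# Four sampled rollouts for one query: accuracy is 1.0 if the final
# answer is correct; efficiency is higher when fewer redundant tool
# calls were made (values here are made up for illustration).
acc = [1.0, 1.0, 0.0, 1.0]
eff = [0.2, 0.9, 0.9, 0.3]
adv_acc, adv_eff = decoupled_advantages(acc, eff)
```

The design point this illustrates: rollout 2 (correct answer, many redundant calls) gets a positive accuracy advantage but a negative efficiency advantage, so the policy can be rewarded for correctness and penalized for over-calling at the same time, rather than receiving one muddled blended signal.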
