Shared from twixb · venturebeat.com

New AI optimization framework beats Claude Code and Codex by 2.5x on the same compute budget

venturebeat.com·Jun 18, 2026

Researchers from Renmin University of China and Microsoft Research developed Arbor, a framework that organizes AI-driven optimization experiments into a structured learning process. Arbor maintains a persistent hypothesis tree, which lets it explore systematically and learn cumulatively from past failures. On real-world engineering tasks, it delivered more than 2.5 times the performance gains of existing AI coding agents such as Claude Code and Codex on the same compute budget.

Arbor turns the trial-and-error process of autonomous optimization into cumulative learning. It organizes experiments and insights into a "Hypothesis Tree Refinement" structure, so AI agents can learn from past failures and make verified improvements rather than repeating dead ends. For AI professionals, the framework could automate continuous improvement in complex engineering systems, especially for tasks with clear metrics and multiple plausible solutions.
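To make the idea of a persistent hypothesis tree more concrete, here is a minimal Python sketch. It is not Arbor's implementation, which this summary does not describe in detail. The names Hypothesis, try_idea, walk, and best_verified, the "beats its parent's score" verification rule, and the example scores are all assumptions made for illustration. The sketch only shows the general pattern: every experiment, including each failure, is stored as a node, and only improvements confirmed by the metric count as verified.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class Hypothesis:
    """One node in the tree (hypothetical structure, not Arbor's own)."""
    idea: str
    score: float                 # metric measured for this idea
    verified: bool = False       # True only if it beat its parent's score
    lesson: str = ""             # short note recorded from the outcome
    children: List["Hypothesis"] = field(default_factory=list)


def try_idea(parent: Hypothesis, idea: str,
             evaluate: Callable[[str], float]) -> Hypothesis:
    """Run one experiment and attach the result under its parent."""
    child = Hypothesis(idea=idea, score=evaluate(idea))
    # Assumed verification rule: a child counts only if it beats its parent.
    child.verified = child.score > parent.score
    child.lesson = ("improved on parent" if child.verified
                    else "no gain; avoid similar variants")
    parent.children.append(child)  # failures stay in the tree as well
    return child


def walk(node: Hypothesis) -> Iterator[Hypothesis]:
    """Depth-first traversal over every recorded hypothesis."""
    yield node
    for child in node.children:
        yield from walk(child)


def best_verified(root: Hypothesis) -> Hypothesis:
    """Return the highest-scoring node that passed verification."""
    return max((n for n in walk(root) if n.verified), key=lambda n: n.score)


# Made-up scores standing in for a real benchmark (higher is better).
fake_scores = {"cache results": 1.4, "bigger batch": 0.9, "cache + prefetch": 1.7}
evaluate = lambda idea: fake_scores[idea]

root = Hypothesis(idea="baseline", score=1.0, verified=True)
a = try_idea(root, "cache results", evaluate)
b = try_idea(root, "bigger batch", evaluate)       # fails, but is remembered
c = try_idea(a, "cache + prefetch", evaluate)      # refines a verified branch

failed = [n.idea for n in walk(root) if not n.verified]
print(best_verified(root).idea, failed)
```

Because failed nodes are kept rather than discarded, a later step can consult them before proposing a new idea, which is the "cumulative" part of the approach as the summary describes it.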

