Shared from twixb · latent.space

[AINews] GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

latent.space·May 8, 2026

OpenAI has launched three new real-time voice models: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. Together they bring improved reasoning, longer context retention, and the ability to handle interruptions and tool use. The aim is to make voice agents more responsive and capable in applications such as customer support and live translation.

The key takeaway is that GPT-Realtime-2 supports longer context (128K tokens), tool use, and adjustable reasoning levels. This pushes voice apps toward being designed as stateful real-time systems rather than prompt-response endpoints, which matters when building agents that must handle complex, continuous interactions.
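To make the "stateful real-time system" idea concrete, here is a minimal sketch in Python. It does not use or imitate the OpenAI API: `VoiceSession`, `State`, and every method name are hypothetical, and the whitespace-based token count is a crude stand-in for a real tokenizer. It only illustrates what the session has to track that a prompt-response endpoint does not: who is currently speaking, whether the user interrupted, pending tool calls, and a bounded conversation history.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    LISTENING = auto()
    SPEAKING = auto()
    CALLING_TOOL = auto()


@dataclass
class VoiceSession:
    """Hypothetical session object; not part of any OpenAI SDK."""

    max_context_tokens: int = 128_000  # context size cited in the summary
    state: State = State.LISTENING
    history: list = field(default_factory=list)  # (role, text, token_count)

    def _remember(self, role: str, text: str) -> None:
        # Crude token estimate (whitespace split), for illustration only.
        self.history.append((role, text, len(text.split())))
        # Drop the oldest turns once the token budget is exceeded.
        while (
            sum(tokens for _, _, tokens in self.history) > self.max_context_tokens
            and len(self.history) > 1
        ):
            self.history.pop(0)

    def on_assistant_speech(self, text: str) -> None:
        self._remember("assistant", text)
        self.state = State.SPEAKING

    def on_user_speech(self, text: str) -> bool:
        # If the assistant was mid-utterance, treat this as an interruption.
        interrupted = self.state is State.SPEAKING
        if interrupted:
            self._remember("system", "[assistant interrupted]")
        self._remember("user", text)
        self.state = State.LISTENING
        return interrupted

    def on_tool_start(self, name: str) -> None:
        self._remember("tool", f"call {name}")
        self.state = State.CALLING_TOOL

    def on_tool_result(self, name: str, result: str) -> None:
        self._remember("tool", f"{name} -> {result}")
        self.state = State.LISTENING


if __name__ == "__main__":
    session = VoiceSession()
    session.on_assistant_speech("Your order ships on Friday and")
    print(session.on_user_speech("Actually, cancel it"))  # True: the user barged in
```

The point of the design is that each event handler both records the turn and updates the session state, so an interruption or tool call is interpreted against what came before rather than as an isolated request.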
