Anthropic’s browser agent got hijacked 31.5% of the time before safeguards engaged

venturebeat.com · Jun 1, 2026

Anthropic's latest model, Claude Opus 4.8, was hijacked by prompt injection 31.5% of the time in browser environments before safeguards engaged. That figure is significantly higher than what competitors OpenAI, Google, and Meta have disclosed, but their disclosures lack comparable metrics and transparency, so the numbers cannot be compared directly. With no standard way to measure these vulnerabilities, AI security is hard to evaluate across vendors, and buyers are left to manage their own risk exposure.

The key insight for AI professionals is how widely frontier labs such as Anthropic, OpenAI, Google, and Meta vary in the way they measure and disclose prompt injection vulnerabilities. Anthropic's detailed per-surface analysis reports a 31.5% hijack rate in browser environments before safeguards, which shows why security has to be assessed surface by surface. Buyers should demand detailed, per-surface security metrics from vendors and run independent red-team evaluations to assess and mitigate deployment risk accurately.
