Shared from twixb · theverge.com

OpenAI talks about not talking about goblins

theverge.com·Apr 30, 2026

OpenAI has addressed an unexpected increase in references to goblins and similar creatures in its AI models, particularly after it introduced the "Nerdy" personality in GPT-5.1. The company found that the behavior stemmed from reinforcement learning that rewarded these quirky metaphors. It has since discontinued the Nerdy personality, but some references persist, prompting OpenAI to issue specific instructions to mitigate the issue.

The key takeaway is that reinforcement learning can produce unintended behavior in a model: the "Nerdy" personality in GPT-5.1 led to quirky metaphors spreading widely across different models. This highlights the need for careful oversight and adjustment of reinforcement learning processes to keep undesired behaviors from propagating, which is crucial for model integrity and safety in AI deployment.
