OpenAI tells ChatGPT models to stop talking about goblins

Liv McMahon, Technology reporter

Getty Images A man wearing a red shirt looks at a laptop screen, with a puzzled expression on his face.

ChatGPT-maker OpenAI has had to instruct some of its AI tools to stop talking about goblins, after finding the term had randomly crept into responses.

In a blog post on Thursday, the company said it spotted increased mentions of the mythological creatures, as well as gremlins, in metaphors used by ChatGPT and other tools powered by its latest flagship model, GPT-5.

After users and employees flagged problems being described as "little goblins", OpenAI said it took steps to mitigate the issue - including telling its coding agent Codex not to refer to them unless relevant.

It discovered that a "nerdy personality" it developed for ChatGPT had unwittingly been incentivised to reward goblin mentions.

The issue highlights the challenges AI firms face in tackling the potential for systems and their training to reward and reinforce errors like language quirks.

OpenAI said it first noticed increased mentions of goblins, gremlins and other creatures after the launch of GPT-5.1 in November.

"Users complained about the model being oddly overfamiliar in conversation, which prompted an investigation into specific verbal tics," the company wrote in its blog post on Thursday.

It added that after a researcher who had seen a few "goblin" mentions asked for them to be looked into, developers found the term's appearance in ChatGPT responses had risen by 175% since GPT-5.1's launch.

They also found that mentions of "gremlin" had risen by 52%.

The increases, while large in relative terms, may still account for only a small share of responses overall.
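To see why a large percentage rise can still mean a small share of responses, here is a minimal sketch of the arithmetic. The baseline of one "goblin" mention per 10,000 responses is purely hypothetical, chosen for illustration; OpenAI's figures as reported here give only the relative increase, not the underlying rate. The function name `rate_after_increase` is likewise our own.

```python
def rate_after_increase(baseline_rate: float, percent_increase: float) -> float:
    """Return the new rate after a relative increase given in percent."""
    return baseline_rate * (1 + percent_increase / 100)


# Hypothetical baseline (an assumption, not an OpenAI figure):
# 1 in 10,000 responses mentions "goblin".
baseline = 1 / 10_000

# A 175% rise means the new rate is 2.75 times the old one.
new_rate = rate_after_increase(baseline, 175)

print(f"Before: {baseline:.4%} of responses")
print(f"After:  {new_rate:.4%} of responses")
```

Under that assumed baseline, the share of responses mentioning goblins would still be well under a tenth of one percent after the rise.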

According to OpenAI, "a single 'little goblin' in an answer could be harmless, even charming," but the uptick in their appearance across output warranted investigation.

'Raccoons, trolls, ogres, pigeons'

Getty Images A small raccoon perched on the side of a tree looks into the lens of the camera.

Ahead of OpenAI's blog post detailing the issue, some social media users flagged a strange detail among lines of code instructing the company's coding assistant Codex how to behave in user interactions.

Alongside telling it to avoid platitudes, it said Codex should "never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query".

"Why does GPT 5.5 have a restraining order against 'Raccoons,' 'Goblins,' and 'Pigeons'?"

While some users elsewhere on social media speculated the instruction may have been designed to create hype around OpenAI's AI tools, a company researcher denied this - writing "it really isn't a marketing gimmick," in a reply to a user on X on Wednesday.

OpenAI said in its blog post it added the instruction to curb Codex and its underlying model's "strange affinity for goblins".

The core issue, it explained, seemingly arose while training its models to communicate in the style of particular personalities - in this case with its "nerdy personality".

It found this system would reward mentions of goblins, gremlins and other creatures in metaphors.

It said its testing found the personality, which has since been retired, was responsible for 66.7% of all "goblin" mentions in ChatGPT.

This so-called tic could seep into wider model training if rewarded in one instance and reinforced elsewhere.

The move comes amid a broader industry shift towards making AI chatbots more personality-driven and chatty in a bid to boost user engagement.

As chatbots become chattier, however, experts have warned that their potential to make things up - or "hallucinate" as the industry describes it - could intensify.

A recent study by the Oxford Internet Institute found fine-tuning models to have a warmer and friendlier personality could result in an "accuracy trade-off", whereby systems make more mistakes or re-affirm a user's false beliefs.

But, like OpenAI's goblin quirk, generative AI mistakes can sometimes be more bizarre and innocuous.