ChatGPT Can’t Follow Simple Punctuation Rules. AGI Looks Farther Away Than Ever

OpenAI’s CEO just celebrated a “small win” that reveals a massive problem.

Sam Altman posted Thursday that ChatGPT finally obeys custom instructions to avoid em dashes. After three years of trying. That’s not the flex he thinks it is.

This tiny victory exposes something bigger. If the world’s most valuable AI company struggles to control basic punctuation, claims about artificial general intelligence ring hollow. Let’s examine why this matters more than it seems.

Em Dashes Became ChatGPT’s Tell

AI-generated text developed a reputation for overusing em dashes, the long horizontal lines (—) that set off clauses or interrupt thoughts.

Readers now spot them as red flags. Detection tools flag their frequency. Human editors recognize the pattern instantly.

The phenomenon got so bad that journalists complained AI was “killing” the em dash. Writers who naturally favor this punctuation mark found their work questioned. Was it human or machine?

Nobody knows exactly why language models love em dashes. Some theories point to 19th-century books in training data. Others blame Medium’s automatic character conversion. But the most plausible explanation is simpler.

Training Data Created the Habit

Large language models output patterns from their training data. Feed them millions of articles with em dashes, and they’ll produce more em dashes.

Plus, human feedback during training matters. If evaluators rated responses with em dashes as more sophisticated, the model learned to include them.

So ChatGPT isn’t choosing to use em dashes. It’s statistically predicting that em dashes belong in professional writing, because they appeared so often in its training examples.

That’s the core issue Altman glossed over. The model doesn’t understand what an em dash is or why someone might not want them. It just calculates probabilities.
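To make the frequency effect concrete, here is a deliberately tiny sketch. It is not how ChatGPT works internally (real models are neural networks, not count tables), and the three-sentence corpus and the `next_token_probs` helper are made up for illustration. It only shows that a predictor built from text full of em dashes will rank the em dash as a likely next token.

```python
from collections import Counter
from typing import Dict, List

EM_DASH = "\u2014"  # the em dash character


def next_token_probs(corpus: List[str], context: str) -> Dict[str, float]:
    """Estimate P(next token | context) by counting what follows `context`."""
    counts: Counter = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        # Walk over adjacent token pairs and count what follows the context word.
        for prev, nxt in zip(tokens, tokens[1:]):
            if prev == context:
                counts[nxt] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {token: n / total for token, n in counts.items()}


# Hypothetical training text: two of three sentences follow "result" with an em dash.
corpus = [
    f"the result {EM_DASH} surprisingly {EM_DASH} held",
    f"the result {EM_DASH} in short {EM_DASH} failed",
    "the result , however , held",
]

probs = next_token_probs(corpus, "result")
for token, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(repr(token), round(p, 2))
```

In this toy corpus the em dash outranks the comma after "result" simply because it shows up more often. Nothing in the table knows what an em dash is.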

Custom Instructions Don’t Actually Work Like Instructions

Here’s where things get interesting. ChatGPT’s custom instructions aren’t real instructions in any meaningful sense.

Traditional software follows rules deterministically. Tell a program “don’t include character X” and it won’t. The code executes exactly as written.
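For contrast, this is roughly what a hard rule looks like in ordinary code. The function name and the choice of a comma as the replacement are assumptions made for this sketch. The behavior, though, is guaranteed: no em dash can survive the call.

```python
import re

EM_DASH = "\u2014"  # the em dash character

# Match an em dash together with any whitespace around it.
_EM_DASH_PATTERN = re.compile(r"\s*" + EM_DASH + r"\s*")


def strip_em_dashes(text: str, replacement: str = ", ") -> str:
    """Replace every em dash in `text`. The rule holds on every call."""
    return _EM_DASH_PATTERN.sub(replacement, text)


cleaned = strip_em_dashes(f"Got it{EM_DASH}I'll stick to short hyphens.")
print(cleaned)
assert EM_DASH not in cleaned
```

Run it a million times and the result never changes. That is the kind of guarantee a prompt cannot give.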

Large language models work differently. Your instruction just adds text to the prompt. That influences token probabilities but creates no hard rules.

When you write “don’t use em dashes” in custom instructions, you’re making em dash tokens less likely. Not impossible. Just less probable during text generation.

Every word the model outputs comes from a probability distribution. Your preference competes against training data patterns, previous chat context, and everything else in the prompt.
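A rough numerical sketch shows why "less likely" is not "never." The token scores and the size of the penalty below are invented for illustration. A real model has no explicit penalty knob tied to your instruction; the shift emerges from the network processing the whole prompt. The arithmetic of the final step, though, is a softmax like this one.

```python
import math
from typing import Dict

EM_DASH = "\u2014"  # the em dash character


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Turn raw token scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Hypothetical scores for the next token, before any instruction is added.
logits = {EM_DASH: 2.0, ",": 1.5, ".": 1.0, " and": 0.5}

# Assumption: the instruction lowers the em dash score by a fixed amount.
INSTRUCTION_PENALTY = 3.0
nudged = dict(logits)
nudged[EM_DASH] -= INSTRUCTION_PENALTY

before = softmax(logits)
after = softmax(nudged)

print("P(em dash) before:", round(before[EM_DASH], 3))
print("P(em dash) after: ", round(after[EM_DASH], 3))
```

The probability drops, but a softmax never outputs exactly zero for a finite score. Sample enough tokens and the em dash can still turn up.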

There’s no separate verification system checking outputs against your requirements. The instruction is simply more text that shifts statistical predictions.
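For comparison, here is what a separate verification step could look like if one were bolted on outside the model. This is a sketch under stated assumptions, not a description of anything OpenAI does. The `generate` callable stands in for a model call, and `make_fake_model` is a stub that replays canned responses so the example runs without any API.

```python
from typing import Callable, Iterable

EM_DASH = "\u2014"  # the em dash character


def violates_rule(text: str) -> bool:
    """Deterministic check: does the draft contain an em dash?"""
    return EM_DASH in text


def generate_with_check(
    generate: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> str:
    """Call the model, reject drafts that break the rule, and retry."""
    for _ in range(max_attempts):
        draft = generate(prompt)
        if not violates_rule(draft):
            return draft
    raise ValueError(f"no compliant draft after {max_attempts} attempts")


def make_fake_model(responses: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a model: replays canned responses in order."""
    queue = iter(responses)

    def generate(prompt: str) -> str:
        return next(queue)

    return generate


model = make_fake_model([
    f"Got it{EM_DASH}no more dashes.",  # first draft breaks the rule
    "Got it, no more dashes.",          # second draft passes
])
result = generate_with_check(model, "Avoid em dashes.")
print(result)
```

The checker is ordinary deterministic code, so it can enforce the rule. The model inside the loop is still rolling dice on every draft.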

The Win Might Not Last

Altman’s celebration assumes this problem stays fixed. But OpenAI continuously updates models behind the scenes.

Each update adjusts outputs based on new feedback and training runs. Changes aimed at improving coding might inadvertently bring em dashes back.

Researchers sometimes call this the “alignment tax.” Precisely tuning a neural network’s behavior isn’t an exact science. Every concept the network encodes is spread across shared, interconnected weights, so adjusting one behavior can shift others.

Fix one behavior today, and tomorrow’s update might undo it. Not because OpenAI wants that outcome, but because steering a statistical system with millions of competing influences remains unpredictable.

Informal testing paints a mixed picture. GPT-5.1 followed our custom instructions. But X users reported varied experiences, especially when they asked for no em dashes within a regular chat.

AGI Requires Something LLMs Don’t Have

If controlling punctuation still challenges OpenAI after three years, artificial general intelligence looks distant.

AGI would replicate the general learning ability of humans. That requires true understanding and intentional action. Not statistical pattern matching that sometimes aligns with instructions if you get lucky.

One X user captured the problem perfectly. They told ChatGPT in-chat to avoid em dashes. The model replied: “Got it—I’ll stick strictly to short hyphens from now on.”

Then immediately used an em dash in its response.

That’s not understanding. That’s probability calculation producing ironic outputs. The model generated text promising to avoid em dashes while simultaneously predicting an em dash belonged in that sentence.

The Real Irony Nobody Mentions

Altman frequently discusses AGI, superintelligence, and “magic intelligence in the sky” while raising billions for OpenAI. He paints visions of transformative AI that rivals or exceeds human capability.

Meanwhile, his flagship product struggles with formatting preferences. After years of development and countless training iterations.

The disconnect reveals something important. We don’t have reliable artificial intelligence today. We have powerful statistical prediction engines that sometimes do what we want.

Sometimes. If we phrase requests correctly. And get lucky with how the probabilities fall. And hope the next update doesn’t break things.

That’s useful technology. Just not AGI.

Control Matters More Than Capability

The em dash saga highlights a crucial distinction. Capability doesn’t equal controllability.

ChatGPT can write code, analyze data, and generate creative content. Impressive capabilities. But if OpenAI can’t reliably control something as simple as punctuation use, what does that say about controlling more consequential behaviors?

This matters for AI safety discussions. If instruction following remains probabilistic rather than deterministic, how do we ensure AI systems behave safely in critical applications?

The industry keeps promising breakthrough capabilities while basic controllability remains elusive. That seems backwards.

Perhaps we should solve the fundamentals before chasing AGI. Master reliable instruction following before claiming human-level intelligence. Ensure consistent behavior before deploying AI in high-stakes scenarios.

Altman’s “small win” actually reveals a big problem. One the industry would rather not discuss while raising billions on AGI promises.

The em dash issue isn’t about punctuation. It’s about whether we truly control these systems at all.
