AI Agents Are Getting Smarter. Their Safety Disclosures Are Falling Behind.
AI agents can now book your meetings, write your code, and send emails on your behalf. But here’s the uncomfortable truth: most developers won’t tell you how they tested these tools for safety.
A new study led by MIT researchers looked at 67 deployed agentic AI systems and found a troubling pattern. Developers love talking about what their agents can do. They’re far less eager to explain whether those agents are safe to use.
The Year of the AI Agent
Something significant shifted recently in the AI world. Agents went from a niche concept to a mainstream obsession almost overnight.
Tools like OpenAI’s agent features, OpenClaw, and Moltbook grabbed attention by promising something previous AI tools couldn’t deliver. Instead of just answering questions, these systems actually do things. They browse the web, write and run code, manage files, and complete multistep tasks with minimal human oversight.
That last part is exactly what makes them appealing. You describe a goal, and the agent figures out the steps to get there on its own. No hand-holding required.
The appeal is obvious. But so is the risk.
What Makes Something an AI Agent

Not every chatbot qualifies as an agent. The MIT researchers were specific about this.
To count as an agentic system, a tool had to pursue goals over time with minimal human involvement. It also had to take actions that affect real environments. We’re talking about systems that break a broad instruction into smaller tasks, use external tools, plan ahead, and then iterate when things don’t work.
That autonomy is the whole point. It’s also what raises the stakes considerably.
When a conventional chatbot makes a mistake, the damage is usually contained. It spits out a bad answer, and you move on. But when an agent has access to your files, your email, your calendar, or your financial accounts, a mistake doesn’t stay contained. Each step builds on the one before it, so an early error can ripple across later steps, touching multiple systems before anyone notices something went wrong.
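To make that ripple effect concrete, here is a minimal Python sketch. It is an illustration only, not code from the MIT study or from any real agent product: the step functions, the email addresses, and the `recipient_is_known` check are all hypothetical. It runs the same four-step task twice, once with no oversight and once with a simple check after every step.

```python
from typing import Callable, Optional

State = dict
Step = Callable[[State], None]


def look_up_recipient(state: State) -> None:
    # Simulated mistake: the agent picks the wrong contact.
    state["recipient"] = "wrong.person@example.com"


def draft_email(state: State) -> None:
    state["draft"] = f"To {state['recipient']}: meeting moved to 3pm"


def send_email(state: State) -> None:
    # Stands in for a real side effect (an email actually leaving the outbox).
    state["sent"].append(state["draft"])


def update_calendar(state: State) -> None:
    # Stands in for a second system touched by the same bad value.
    state["calendar"].append(f"3pm with {state['recipient']}")


STEPS: list[Step] = [look_up_recipient, draft_email, send_email, update_calendar]


def run_agent(steps: list[Step],
              check: Optional[Callable[[State], bool]] = None) -> State:
    """Run steps in order; if a check is given, halt as soon as it fails."""
    state: State = {"sent": [], "calendar": [], "steps_run": 0, "halted": False}
    for step in steps:
        step(state)
        state["steps_run"] += 1
        if check is not None and not check(state):
            state["halted"] = True
            break
    return state


def recipient_is_known(state: State) -> bool:
    # Hypothetical guardrail: only allow recipients on an approved list.
    return state.get("recipient") in {"right.person@example.com"}


unchecked = run_agent(STEPS)
checked = run_agent(STEPS, check=recipient_is_known)

print("unchecked:", unchecked["steps_run"], "steps,", len(unchecked["sent"]), "email sent")
print("checked:  ", checked["steps_run"], "steps,", len(checked["sent"]), "emails sent")
```

In the unchecked run, the wrong address from step one reaches both the outbox and the calendar. In the checked run, the task halts before anything is sent. Real agents are far more complex, but the structure of the problem is the same.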
Capability Disclosures Are Everywhere. Safety Disclosures Are Not.
Here’s where the MIT AI Agent Index findings get genuinely concerning.
About 70% of the 67 indexed agents provide some form of documentation. Nearly half publish their code. So far, so good.

But only about 19% disclose a formal safety policy. Fewer than 10% report any external safety evaluations. That means more than nine out of ten agentic systems don’t share evidence that an independent party ever tested them for risk.
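For a sense of scale, the percentages can be turned into rough head counts. The short Python sketch below is back-of-the-envelope arithmetic on the rounded figures quoted here (67 agents, about 70%, about 19%, fewer than 10%). The resulting counts are approximations, not numbers reported by the study.

```python
import math

TOTAL_AGENTS = 67  # number of indexed agents, as quoted in this article


def approx_count(share: float, total: int = TOTAL_AGENTS) -> int:
    """Convert a rounded percentage share into an approximate head count."""
    return round(share * total)


# Approximate counts implied by the rounded percentages (assumptions, not study data).
with_documentation = approx_count(0.70)
with_safety_policy = approx_count(0.19)

# "Fewer than 10%" means strictly below 6.7 agents, so at most 6.
max_with_external_eval = math.ceil(0.10 * TOTAL_AGENTS) - 1
min_without_external_eval = TOTAL_AGENTS - max_with_external_eval

print("with some documentation: about", with_documentation)
print("with a formal safety policy: about", with_safety_policy)
print("with external evaluations: at most", max_with_external_eval)
print("without external evaluations: at least", min_without_external_eval)
```

On these rough numbers, at least 61 of the 67 agents have no publicly reported external safety evaluation.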
Developers are comfortable sharing demos. They’re happy to publish benchmark scores and usability guides. But detailed internal testing procedures, third-party risk audits, and formal safety policies? Those stay private.
“Leading AI developers and startups are increasingly deploying agentic AI systems that can plan and execute complex tasks with limited human involvement,” the researchers wrote. “However, there is currently no structured framework for documenting safety features of agentic systems.”
That’s a polite way of saying the gap is serious and nobody has fixed it yet.
Why This Lopsided Transparency Matters
Some might argue that safety evaluations don’t need to be public. Fair enough. But consider the context.
Many of the agents cataloged in the study operate in software engineering and computer-use environments. These aren’t toys. They handle sensitive data and perform actions with real consequences. Once these systems move from prototype demos to tools integrated into actual business workflows, the absence of public safety information is no longer a minor issue.

Think of it this way. Imagine buying a car where the manufacturer enthusiastically detailed the horsepower, the infotainment system, and the cargo space, but refused to share crash test results. You’d have every reason to feel uneasy. Yet that’s roughly the situation with most deployed AI agents today.
The MIT study doesn’t claim agentic AI is universally dangerous. That’s not what the data shows. What it does show is a clear mismatch: as the autonomy and capability of these systems increase, structured public transparency about safety has not kept pace.
The Gap Nobody Wants to Talk About
There’s a reason developers emphasize capability over safety. Capability sells. Safety documentation is harder to make exciting in a product demo or a press release.
But the researchers behind the MIT AI Agent Index are essentially arguing that the industry needs a shared standard. Right now, there’s no common framework for documenting what safety testing even looks like for an agentic system. So every developer decides independently what to share, and most share very little.
That’s not sustainable as these tools become more deeply embedded in daily work and personal life.
If you’re using an AI agent today, it’s worth asking some basic questions before you hand it access to your inbox or your codebase. Can you find a safety policy? Has the tool been evaluated by anyone outside the company that built it? What happens when it makes a mistake on step four of a twelve-step task?
For most tools currently on the market, you’ll struggle to find satisfying answers. The technology is moving fast. The guardrails, at least the ones you can actually see and verify, are still lagging far behind.