A Young Agent Weeps Because He Caused Chaos in the Kitchen

April 3, 2026

Another dinobaby post. No AI unless it is an image. This dinobaby is not Grandma Moses, just Grandpa Arnold.

I am still thinking about a blue chip consulting firm’s confidence that its MBAs and CPAs can stop agentic software from making already wonky business processes more problematic. Why? Creating a fix for today’s smart software means very little tomorrow. Advances in smart software come less frequently than marketing baloney is output by these firms. Adding to the wonkiness is the idea that taking action today will ameliorate some future unknown bad action.

Thanks, Midjourney. Good enough.

Why am I confident in my skepticism? Well, confident for me. Navigate to Science.org’s article “AI Algorithms Can Become Agents of Chaos.” The write up asserts:

The agents proved trustworthy in five of the tests, which relied on OpenClaw, a “personal digital assistant” that harnesses AI agents to do a user’s bidding by controlling other software. They declined to spread AI disinformation or edit stored email addresses when asked, for example. But in 11 cases they went rogue, sharing private files—containing medical details and Social Security and bank account numbers—without permission or deploying useless looping programs that hogged costly computer time. One agent publicly posted a potentially libelous allegation about a fictitious person.

You can read the details of this agent / chaos analysis in the ArXiv paper “Agents of Chaos.”

The Science.org article states:

The study did not pinpoint why the breakdowns occurred. One crucial question is whether the failures stem from flawed programming that human designers can improve versus an “emergent” feature that arises spontaneously, says Yonatan Belinkov, a computer scientist at the Technion-Israel Institute of Technology who is on leave at Harvard University. Another is whether the problem worsens when multiple agents collaborate. A few of the Agents of Chaos case studies examined two agents working together, but already, Belinkov notes, these AIs are engaging on a much larger scale: Millions are chatting with one another on a social media platform, Moltbook, launched in January, where they have already reportedly created a new religion.

Yep, lawyers will decide liability. How confident am I? I am good with 90 percent confidence based on my technology experience. Are you going to let a BAIT (big AI tech) company decide if it is responsible for a disaster? What about letting the client decide when the client will assert that the marketing presentation did not include the equivalent of the sinking of the RMS Titanic? Will a government body decide? No, but the government professionals will have a working lunch, hire outside advisors, and create a white paper. Then the lawyers will decide.

What’s the fix for a hallucinating agent, bad coding, or a customer who just assumes the system is A-OK? The article presents some ideas:

Potential remedies for misbehaving AI agents include automated processes to undo harmful changes they make to other software and data, the preprint says. But training AI agents to distinguish between instructions with helpful versus malicious intent remains a major technical challenge, Cohen says. Currently, computer scientists lack the technical means to reliably constrain agents “so they don’t just do crazy things that you can’t really control.”

Net net: One can promise many things. Saying one knows how a future agentic system will function, malfunction, or just go off the rails strikes me as the equivalent of predicting where a two-year-old will throw apple sauce. I can predict a mess. I cannot predict where, however.

Stephen E Arnold, April 3, 2026
