
Demos Lie

May 26, 2026 (2 min read)

I deploy autonomous agents into enterprise workflows for a living, so I watch a lot of AI demos. Most of them are beautiful — a clean prompt, a clean answer, a room full of people nodding. And most of them would not survive ten minutes inside an actual company.

The demo is the easy 80%. The 20% that's left is the entire job.

Where it breaks

A demo runs on a happy path: clean input, one system, nobody watching the cost. Production is the opposite. The data is dirty and inconsistent and three years out of date. The system you need to talk to has an API from 2009, or no API at all. The agent does the right thing 95% of the time, which sounds great until you realize the other 5% is the part that sends a customer the wrong number — and now no one trusts any of it.

None of that shows up in a demo. All of it shows up on day one of a real deployment.
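To put rough numbers on that 5%, here is a back-of-the-envelope sketch. The daily volumes are hypothetical, and the calculation assumes every task fails independently with the same probability, which real workloads rarely do. Treat it as intuition, not a forecast.

```python
def expected_failures(tasks: int, success_rate: float) -> float:
    """Average number of wrong results across `tasks` runs."""
    return tasks * (1.0 - success_rate)


def chance_of_any_failure(tasks: int, success_rate: float) -> float:
    """Probability that at least one of `tasks` runs is wrong,
    assuming independent failures (a simplifying assumption)."""
    return 1.0 - success_rate ** tasks


if __name__ == "__main__":
    # Hypothetical volumes, chosen only for illustration.
    for tasks in (20, 1_000, 100_000):
        wrong = expected_failures(tasks, 0.95)
        any_wrong = chance_of_any_failure(tasks, 0.95)
        print(f"{tasks:>7} tasks: ~{wrong:.0f} wrong, "
              f"P(at least one wrong) = {any_wrong:.2%}")
```

Under those assumptions, even a small batch is more likely than not to contain a wrong answer, and a demo is usually far too short to surface one.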

The boring parts are the product

What actually makes an agent work in production isn't the model. It's the unglamorous scaffolding around it: the guardrails, the place where a human checks the risky calls, the logging you need when something goes sideways at 2am, the fallback for when the model is confidently wrong. The demo is the model. The product is everything you built so you can sleep while the model runs.
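As a rough illustration of that scaffolding, here is a minimal sketch of a wrapper around a model call. Every name in it (`run_guarded`, `is_valid`, `is_risky`, `ask_human`) is hypothetical, and the model is just a callable passed in. A real deployment would be far more involved, but the shape is the same: guardrail, human checkpoint, logging, fallback.

```python
import logging
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("agent")


@dataclass
class Result:
    value: str
    source: str  # "model", "human", or "fallback"


def run_guarded(
    task: str,
    model: Callable[[str], str],           # hypothetical model call
    is_valid: Callable[[str], bool],       # guardrail on the output
    is_risky: Callable[[str], bool],       # which tasks need a human
    ask_human: Callable[[str, str], str],  # reviewer returns the final answer
    fallback: str,                         # safe default when things go wrong
) -> Result:
    try:
        answer = model(task)
    except Exception:
        # Logged with a traceback so the 2am failure can be reconstructed.
        logger.exception("model call failed for task %r", task)
        return Result(fallback, "fallback")

    if not is_valid(answer):
        # The model answered, but the answer fails the guardrail.
        logger.warning("guardrail rejected %r for task %r", answer, task)
        return Result(fallback, "fallback")

    if is_risky(task):
        # Risky calls never go out without a person looking at them.
        reviewed = ask_human(task, answer)
        logger.info("human reviewed task %r: %r -> %r", task, answer, reviewed)
        return Result(reviewed, "human")

    logger.info("model answered task %r: %r", task, answer)
    return Result(answer, "model")
```

The one line that calls the model is what the demo shows. Everything else in the function is the part the audience never sees.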

I've learned to be suspicious of anything that looks effortless. Effortless usually means someone moved the hard parts off-screen.

How I watch a demo now

When someone shows me something impressive, I don't ask what it can do. I ask what it does when the input is garbage. What happens when it's wrong, and who finds out. What it costs at a thousand times the volume. How you'd unwind it after it makes a mistake.

The answers separate the people who've shipped from the people who've only presented. The hype cycle rewards the demo. Reality only rewards the 20%. So if you want to know whether something's real, don't watch it work — watch what happens when it doesn't.

Kirtan Desai — @kirtandesai