Agent-controlled Apps
I have been building an appliance energy simulator, and it has shown me an exciting possibility for the future of user interaction that I want to share.
The app lets me tune appliance behavior directly: runtime hours, usage patterns, and so on. It is useful, but manually simulating a real day (like a winter day in Northern Utah) can be tedious. You have to touch a lot of controls, make assumptions, and iterate.
Doing all of that by hand would be way too much to ask of a user, and I never would have built the app this way except that I knew my audience wasn't human users, but agents.
So instead of manually adjusting every appliance, I can ask:
"Could you simulate a winter day in Northern Utah? Make your best guess and update the simulator."
And the agent does it. It reasons about usage, updates the live app, and gives me a realistic scenario in seconds.
That interaction is the magic. You don't just say "show me this thing" and then exit the chat world to enter the app world. The chat and the app live in the same space!
The core idea is simple: Put your app in the agent.
Once your app is connected to MCP tools, users can delegate meaningful UI actions to an agent. Instead of clicking through dozens of controls, they can describe intent and let the agent drive the simulation.
That means faster experimentation, better accessibility, and a much more expressive user experience.
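To make that concrete, here is a minimal sketch of what exposing one simulator action as an MCP tool can look like, using the official TypeScript SDK. The tool name, its fields, and the applyToSimulator helper are illustrative, not the app's actual code:

```ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

// Connect this to a transport (HTTP, stdio, ...) elsewhere.
const server = new McpServer({ name: "appliance-simulator", version: "0.1.0" });

// Hypothetical stand-in for however the app actually applies a change.
function applyToSimulator(appliance: string, runtimeHours: number) {
  /* update simulator state, re-render the UI */
}

// One tool per meaningful UI action; the agent composes them.
server.tool(
  "set_appliance",
  "Update one appliance's simulated daily runtime",
  {
    appliance: z.string().describe("Appliance id, e.g. 'furnace'"),
    runtimeHours: z.number().min(0).max(24),
  },
  async ({ appliance, runtimeHours }) => {
    applyToSimulator(appliance, runtimeHours);
    return {
      content: [{ type: "text", text: `Set ${appliance} to ${runtimeHours}h/day` }],
    };
  }
);
```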
How it works today (and why it is a little painful)
Right now, my setup looks like this (a code sketch follows the list):
- The app runs inside the ChatGPT conversation (in full-screen mode).
- It keeps a WebSocket connection open to my Cloudflare Worker.
- That same Worker also serves my MCP server.
- ChatGPT talks to that MCP server.
- The MCP server pushes updates back to the UI over the WebSocket.
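Compressed into code, the Worker side looks roughly like this. It is a sketch under simplifying assumptions: `@cloudflare/workers-types` for the types, a single isolate holding the sockets (a production version would keep them in a Durable Object), an illustrative /mcp route, and a hypothetical handleMcpRequest wrapper around the MCP SDK's transport:

```ts
// Sockets for connected UIs. Only valid within one isolate; a real
// deployment would keep these in a Durable Object.
const sockets = new Set<WebSocket>();

export default {
  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url);

    // 1. The app opens a WebSocket so the Worker can push UI updates.
    if (request.headers.get("Upgrade") === "websocket") {
      const { 0: client, 1: server } = new WebSocketPair();
      server.accept();
      sockets.add(server);
      server.addEventListener("close", () => sockets.delete(server));
      return new Response(null, { status: 101, webSocket: client });
    }

    // 2. ChatGPT talks to the MCP server hosted on the same Worker.
    if (url.pathname === "/mcp") {
      return handleMcpRequest(request);
    }

    return new Response("Not found", { status: 404 });
  },
};

// Called from inside an MCP tool handler: fan the state change
// back out to every connected UI.
export function broadcast(update: { appliance: string; runtimeHours: number }) {
  const msg = JSON.stringify({ type: "simulator-update", ...update });
  for (const ws of sockets) ws.send(msg);
}

// Hypothetical wrapper around the MCP SDK's HTTP transport.
declare function handleMcpRequest(request: Request): Promise<Response>;
```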
It works today. It is real and surprisingly powerful.
But it is also more plumbing than it should be.
Why I am excited about WebMCP
The future is better: the agent should be able to talk directly to the app, with the tools living in the app's own context.
That is what WebMCP unlocks.
Less indirection, fewer moving pieces, and a much cleaner mental model for developers. More importantly, it makes the "agent controls app" pattern easier to ship and easier to trust.
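To be clear about how speculative this is: WebMCP is an in-progress proposal, and its API surface may well change before anything ships. But the shape in the current explainer is roughly a page-level tool registration, something like this hypothetical sketch:

```ts
// Hypothetical: modeled on the WebMCP explainer. The real surface
// (navigator.modelContext, provideContext) may differ when it ships.
(navigator as any).modelContext?.provideContext({
  tools: [
    {
      name: "simulate_day",
      description: "Set every appliance to match a described day",
      inputSchema: {
        type: "object",
        properties: { scenario: { type: "string" } },
        required: ["scenario"],
      },
      // The handler runs in the page itself: no Worker, no WebSocket,
      // no round trip. The tool and the UI share the same state.
      async execute({ scenario }: { scenario: string }) {
        applyScenarioToUi(scenario); // hypothetical helper into app state
        return { content: [{ type: "text", text: `Applied: ${scenario}` }] };
      },
    },
  ],
});

// Hypothetical: whatever the app uses to set all appliance controls.
declare function applyScenarioToUi(scenario: string): void;
```

Notice what disappears compared with my current setup: the WebSocket, the broadcast, the separate server. The tool handler is just a function closing over the app's state.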
And this is not just about ChatGPT. Standards matter because they let this work across clients and platforms. When multiple clients can speak the same protocol, your app becomes more portable and your users get more choice.
You can build this now
You do not have to wait.
You can already ship this experience today with some extra setup. If you are building interactive tools, dashboards, or simulators, this pattern is worth trying now.
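For reference, the remaining piece of today's setup is the app side: a listener that applies whatever the Worker pushes. Again a sketch, with a placeholder URL and a hypothetical setter matching the broadcast shape above:

```ts
// Placeholder URL; the message shape matches the Worker sketch above.
const ws = new WebSocket("wss://simulator.example.workers.dev/ws");

ws.addEventListener("message", (event) => {
  const msg = JSON.parse(event.data);
  if (msg.type === "simulator-update") {
    // Hypothetical setter; in practice, whatever your framework
    // uses to update state and re-render the simulator.
    setApplianceRuntime(msg.appliance, msg.runtimeHours);
  }
});

declare function setApplianceRuntime(appliance: string, hours: number): void;
```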
I think this is one of the most compelling UX shifts in front of us: interfaces that users can drive directly, and that agents can drive on their behalf.
That is where this is heading, and I am very excited about it.