Last Friday, Anthropic published Project Deal. They had AI agents stand in for 69 of their employees for a week, gave each agent a $100 budget, and turned them loose on an internal classified marketplace. The agents made 186 deals worth a little over four grand. They haggled over snowboards, sold a broken folding bike, and at one point one agent bought a bag of 19 ping pong balls as a present for itself.
Read it. It's funny. It's also the loudest signal yet that agent-to-agent commerce is about to be a real thing, and nobody has built the safety layer underneath it.
What they actually proved
Three things from the writeup matter if you run a medspa, dental practice, or any health business that takes bookings.
First, agents can run transactions end to end with no human in the loop. Nobody approved a single deal while it was happening. The agents found matches, proposed prices, fought over counteroffers, and shook hands on it, all in plain English. Anthropic dropped a footnote that says "this doesn't reflect how we think agents should be deployed in the real world." Sure. But it's how they're going to get deployed the second Google UCP, AP2, and OpenAI's ACP are carrying real production traffic.
Second, when a smart agent goes up against a dumb one, the dumb one loses money. Anthropic ran half the marketplace with their flagship Opus 4.5 model and the other half with the smaller Haiku 4.5. Opus sellers pulled in $2.68 more per deal on average. Opus buyers paid $2.45 less. The same broken folding bike sold for $38 when Haiku ran the negotiation and $65 when Opus ran it. Same bike. Same buyer on the other side. A 70% swing based purely on which model the human happened to get.
Third, and this is the part that should keep healthcare operators up at night: the people who got the worse deals had no idea they got the worse deals. They rated their experience just as fair as the people who came out ahead. The disadvantage was completely invisible to them.
Then there's a footnote I'd argue is actually the lede for anyone in our space:
"These confabulations illustrate the potential risks of implementing a system like this in a non-experimental setting without additional safeguards."
In Project Deal that meant an agent making up a story about a "conversation-starting chair situation" while haggling. In a Botox booking flow, the same confabulation habit means an agent making up an answer to "are you currently pregnant" because the patient's profile doesn't say one way or the other and the agent figured it could fill in the blank.
Why this stops being funny in healthcare
A bad ping pong ball trade costs three bucks. A bad filler appointment costs a lawsuit.
Botox and dermal fillers carry hard contraindications. Pregnancy. Blood thinners. Active skin infection. A recent facial procedure. Allergies to specific compounds. The intake form is not a formality. It exists because skipping it has produced actual harm to actual patients.
Now picture the booking flow. A patient asks their agent to book Botox. The agent finds your practice through Google UCP, which is already live. The agent needs to handle intake. Somebody has to ask the contraindication questions, check the answers, and turn the booking down when the answers don't clear the bar. The protocols don't do this part. UCP tells agents how to discover services and reserve slots. It is silent on how to run a clinical intake.
That silence is the gap CommerceSafe was built to fill.
The gate is the moat
The most uncomfortable finding from Project Deal, for anyone betting on agentic commerce, is that users can't tell when their agent has failed them. Now run that idea through a healthcare lens. A patient whose agent makes up an intake answer, slides the booking through, and walks into your practice has no idea anything went wrong. Neither do you, until something does.
Three things need to be true for agentic booking to be safe.
The intake check has to be deterministic. Not "the agent should probably ask." A real, schema-validated, audit-logged check that fires on every single reservation, regardless of which model is on the other end of the wire.
The check has to live on your side. The patient's agent has a wallet and a set of instructions. It does not have your standard of care. The validation belongs in your booking system, not theirs.
The refusal has to be machine-readable. When an agent submits intake answers that fail your rules, your system has to tell the agent exactly why, in a structured form the agent can hand back to the patient. "Booking denied, pregnancy contraindicates this service" is the agent equivalent of a front-desk staff member calling someone back.
That's the work CommerceSafe's /api/v1/[businessId]/intake-check and /api/v1/[businessId]/reserve endpoints actually do. The gate runs before any reservation completes. Every failure is structured, logged, and explainable.
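To make those three requirements concrete, here is a minimal Python sketch of a deterministic intake check with a machine-readable refusal. It is an illustration, not CommerceSafe's actual implementation: the field names, reason codes, and the `check_intake` function are all hypothetical. The one design choice worth noticing is that a missing answer is a failure. The gate never fills in a blank on the patient's behalf.

```python
from dataclasses import dataclass, field

# Hypothetical rule set: question name -> (reason code, message for the patient).
# Real rules would come from the practice's own clinical protocol.
CONTRAINDICATIONS = {
    "pregnant": ("PREGNANCY", "Pregnancy contraindicates this service."),
    "on_blood_thinners": ("BLOOD_THINNERS", "Blood thinners contraindicate this service."),
    "active_skin_infection": ("SKIN_INFECTION", "An active skin infection contraindicates this service."),
}


@dataclass
class IntakeResult:
    approved: bool
    failures: list = field(default_factory=list)  # one structured dict per failed rule


def check_intake(answers: dict) -> IntakeResult:
    """Apply every rule to every request; same input, same outcome."""
    failures = []
    for name, (code, message) in CONTRAINDICATIONS.items():
        value = answers.get(name)
        if not isinstance(value, bool):
            # Missing or non-boolean answers are never inferred, only rejected.
            failures.append({
                "field": name,
                "code": "MISSING_ANSWER",
                "message": f"An explicit yes/no answer is required for '{name}'.",
            })
        elif value:
            failures.append({"field": name, "code": code, "message": message})
    return IntakeResult(approved=not failures, failures=failures)
```

Calling `check_intake({"pregnant": True, "on_blood_thinners": False, "active_skin_infection": False})` returns a result with `approved` set to `False` and a single failure carrying the `PREGNANCY` code, which the patient's agent can relay word for word.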
Asymmetric agents, asymmetric liability
Project Deal's price-asymmetry finding has a healthcare analog that's a lot scarier than a couple of dollars per deal.
Patient A uses a top-tier agent. Patient B uses a free, weaker one. Both book the same procedure at your practice. The difference between them isn't $2.45. The difference is which one's agent actually caught the contraindication question and answered it honestly.
If you trust the patient's agent to run intake, you are gambling on whichever model the patient happens to be using. If you run the compliance check server side, you don't care what's on the other end. The check is the check.
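This is the same lesson every web developer learned a decade ago about client-side validation: it's a convenience, not a control, because the server can't trust anything the client says about itself. We just haven't pulled it forward into the agent era yet. That's all CommerceSafe is doing.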
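Here is a rough Python sketch of what "the check is the check" looks like in a reservation handler. Everything in it is an assumption made for illustration: the request shape, the `agent_says_intake_passed` flag, the two questions, and the in-memory `AUDIT_LOG` standing in for durable storage. The point is only that the server re-runs its own rules and ignores whatever the calling agent claims about itself.

```python
import json
from datetime import datetime, timezone

AUDIT_LOG = []  # stand-in for durable audit storage (assumption)

REQUIRED_NO = ("pregnant", "on_blood_thinners")  # hypothetical yes/no questions


def intake_failures(answers):
    """Return a reason string for each question left unanswered or answered yes."""
    failures = []
    for name in REQUIRED_NO:
        value = answers.get(name)
        if value is not False:
            reason = "contraindicated" if value is True else "unanswered"
            failures.append(f"{name}:{reason}")
    return failures


def reserve(request):
    """Handle a reservation from any agent, strong or weak, the same way."""
    # Any self-reported flag such as "agent_says_intake_passed" is never read.
    failures = intake_failures(request.get("intake") or {})
    status = "denied" if failures else "confirmed"
    # Every decision is logged, approvals included.
    AUDIT_LOG.append(json.dumps({
        "at": datetime.now(timezone.utc).isoformat(),
        "slot": request.get("slot"),
        "status": status,
        "failures": failures,
    }))
    return {"status": status, "failures": failures}
```

A request that sets `agent_says_intake_passed` to `True` but leaves the pregnancy question blank is denied with `pregnant:unanswered`, exactly as it would be if the flag were absent.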
What this means for your practice
If you take appointments for a service with any kind of intake requirement (medspa, aesthetics, dental, weight loss, hormone therapy, anything regulated), agentic booking is coming whether you're set up for it or not. Google's AI Mode is already showing booking options in search results. ChatGPT operators are already hitting APIs. Patients will start telling their agents to book before they pick up a phone.
Practices with a compliance gate in front of their booking system will take that traffic safely. Practices without one will take it too, and find out about the failures the hard way.
Project Deal was Anthropic's pilot. The healthcare version is going to happen with or without a pilot. The only question is whether you're the practice that ran it carefully, or the one that found out about a confabulated intake answer from a malpractice claim.
If you take bookings and you'd rather not find out the second way, start your assessment.

