All notes
·4 min read·Suraj Malthumkar

Cost-per-run is the only number your CFO actually wants

Most agentic-AI pilots skip the unit economics. Here's the napkin math that decides whether your project pays back in ten days or leaks for a year.

We've sat through a lot of AI steering committees in the last twelve months. Almost none of them opened with a number. They opened with a vision deck, a vendor demo, and a slide titled "AI Roadmap." Then someone from finance asked what the thing costs to run, and the room got quiet.

The CFO is not being difficult. The CFO is asking the only question that matters once the demo ends: what does each call cost, how many calls per month, what's the marginal value per call, and where's the breakeven. If you can't answer those four on a napkin, you don't have a project. You have a science fair.

The four numbers

Cost-per-run. Volume per month. Value per run. Breakeven volume.

That's it. Every agentic workflow we've shipped in the last year sits on those four numbers, and every pilot that died sat on a deck that didn't have them.

Cost-per-run is the all-in marginal cost of one execution: model tokens, retrieval, vector store reads, any external API the agent calls, plus the amortized infra. Not the engineering cost. Not the sunk cost of the platform. The marginal cost of run number 50,001.

Volume is honest forecasted volume, not the addressable-market number on slide 14. If you process 12,000 tickets a month today, your volume is 12,000. It is not 40,000 because someone says "and then we expand to EMEA."

Value per run is the dollar value of the outcome, conditional on the agent succeeding. A deflected support ticket worth $7.20 in fully loaded human handle time. A correctly enriched lead worth $0.40 in SDR research time. Pick the dollar figure your finance team already uses.

Breakeven is value times success rate, divided by cost. If that's greater than one, you have a project. If it's less than one, you have a hobby.

A worked example

Series-B B2B SaaS, 12,000 support tickets a month. Average handle time on tier-one tickets is twelve minutes at a fully loaded $36/hour, so each ticket costs about $7.20 to resolve.

Agent triage runs at $0.04 per ticket all-in. That's the model call, retrieval over the help center, and a write to the ticketing system. The agent deflects 38% of tier-one tickets cleanly and escalates the rest with a draft response attached.

Run the math. 12,000 tickets times $0.04 is $480 in agent cost per month. 38% deflection is 4,560 tickets fully resolved without a human, worth $32,832 in reclaimed handle time. The remaining 7,440 tickets get a draft response that cuts agent handle time by an estimated 90 seconds each, worth another $6,696. Total monthly value is roughly $39,500 against a $480 run cost.

Payback on the $18k engagement fee is under two weeks. The CFO will sign that. The CFO will not sign "we believe AI will transform support."

Where the leak hides

The same workflow priced wrong is a money pit, and the leak is almost always in two places.

The first is model selection. A team picks the biggest model on the menu because the demo was crisp, and now every triage call costs $0.18 instead of $0.04. At 12,000 tickets a month, that's $2,160 in run cost instead of $480, and the breakeven shifts by a factor of four. Most of the time the smaller model gets the same deflection rate inside a one-percentage-point margin. Test both. Always.

The second is retrieval. Naive retrieval pulls 20 chunks per call when 4 would do. That's 5x the input tokens on every run, and input tokens dominate cost on most workflows. A two-day pass at the retrieval pipeline routinely cuts cost-per-run by 40% with no measurable quality drop.

If you haven't measured tokens-per-run on a representative sample, you do not know your cost-per-run. You know your vendor's marketing claim.

What we ask before signing the SOW

We won't take a project if the customer can't answer three things in writing. What's the workflow's current dollar baseline. What's the smallest unit you can price (per ticket, per lead, per invoice). What's the success criterion that finance will accept as deflection.

If those three exist, we can do the napkin math in an hour and tell you whether the project is a ten-day payback or a slow bleed. If they don't exist, the first week of the engagement is finding them, and we'd rather you find them before you pay us.

The discipline

Cost-per-run is a discipline, not a metric. It forces you to name the workflow, name the baseline, name the unit, and name the failure mode in dollars. Teams that do this ship. Teams that don't ship a pilot that no one can defend in the next QBR.

If you can't write the cost-per-run on a napkin, don't sign the SOW. Send it back. Ask for the four numbers. If the vendor can't produce them, the vendor doesn't know what they're selling you, and you're about to pay to find out.