Billing shock: corporations lose control over AI spending

A consultant's client burned $500M in one month from unlimited Claude access; Microsoft is force-migrating thousands of engineers off Claude Code by June 30, and Uber's CTO admits the company blew its entire 2026 AI budget in four months.

Author: Michael Kokin

One company just burned half a billion dollars in a single month because someone forgot to put limits on the API. I dug into who's spending what on AI at big companies, and what's actually happening with those hundred-million-dollar bills for coding agents.

Axios interviewed a well-known AI consultant whose client gave engineers unlimited access to Claude; thirty days later, a half-billion-dollar invoice showed up. It sounds like a joke about an incompetent manager, but corporate America is genuinely reeling from billing shock right now, and CFOs are making emergency cost cuts.

Who's doing what

Microsoft is forcibly pulling Claude Code from thousands of engineers on the Windows, Office, and Surface teams by June 30, and moving them to its own GitHub Copilot. The official reason is security and control. The unofficial one: June 30 is their fiscal year-end. These guys understand better than anyone what tokens cost at industrial scale — Microsoft is one of Anthropic's largest customers. When a corporation takes away a tool its developers actually prefer over its own product, it's not about brand loyalty. It's about money.

Uber CTO Praveen Naga told The Information that the AI budget allocated for all of 2026 was gone in four months. Claude adoption among 5,000 engineers jumped from 32% to 84%, around 70% of commits now start with AI, and individual bills are hitting two thousand dollars a month.

> "I'm back to spreadsheets because the budget I planned is already gone" — Praveen Naga, Uber CTO

Amazon pulled third-party tools for everyone. Eighty percent of staff were moved to the internal Kiro tool, and usage was baked into KPIs. Engineers argued Claude performed better. They were right. According to the Financial Times, an AI agent took down a production environment; two such incidents cost the company 6.3 million orders, and senior engineers are now reviewing code by hand.

Jensen Huang (CEO of Nvidia) says he'd be "deeply concerned" if an engineer earning $500k burned less than $250k worth of tokens in a year. The result is a new sport called tokenmaxxing: a race to maximize token consumption. One CTO told Axios his people were asking the model for the weather just to hit their KPIs.

Salesforce is the rare example where the numbers actually work out. Marc Benioff (CEO of Salesforce) cut support headcount from 9,000 to 5,000, support costs dropped 17%, and Agentforce handles half of customer conversations with the same satisfaction scores as humans.

Some companies are covering the bills with layoffs. The CEO of CloudBees told Axios that headcount cuts are the only lever many companies have left to offset their AI spend.

Why the bill keeps climbing

In agentic mode the model works for hours at a time: it spawns parallel sessions, re-reads huge chunks of code, and checks its own output. Opus 4.8 in dynamic workflow can spin up hundreds of agents.

The price per token keeps falling — roughly tenfold every eighteen months — but consumption is growing faster, so the bill keeps climbing anyway.
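To make that arithmetic concrete, here is a rough sketch that treats the bill as price times consumption. The tenfold drop every eighteen months comes from the paragraph above; the steady monthly growth rates and the function names are illustrative assumptions, not figures from any of the companies mentioned.

```python
def breakeven_monthly_growth(price_drop_factor: float = 10.0,
                             period_months: float = 18.0) -> float:
    """Monthly consumption growth at which the total bill stays flat.

    If price falls by `price_drop_factor` over `period_months`, usage must
    grow by the same factor over that period just to keep the bill level.
    """
    return price_drop_factor ** (1.0 / period_months) - 1.0


def bill_multiplier(monthly_growth: float,
                    months: float,
                    price_drop_factor: float = 10.0,
                    period_months: float = 18.0) -> float:
    """How many times larger (or smaller) the bill is after `months`."""
    # Price per token shrinks smoothly toward 1/price_drop_factor per period.
    price_factor = price_drop_factor ** (-months / period_months)
    # Token consumption compounds monthly (assumed steady growth).
    usage_factor = (1.0 + monthly_growth) ** months
    return price_factor * usage_factor


if __name__ == "__main__":
    print(f"break-even growth: {breakeven_monthly_growth():.1%} per month")
    # Hypothetical 20% monthly growth in token consumption over 18 months.
    print(f"bill after 18 months: {bill_multiplier(0.20, 18):.1f}x")
```

Under these assumptions, consumption has to grow by about 13.6% a month just to hold the bill steady; at a hypothetical 20% a month, the bill is roughly 2.7 times larger after eighteen months even though each token costs a tenth as much.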

GitHub paused Copilot Pro registrations back in November because customers' agentic workloads were burning more than their subscriptions were worth. Starting June 1, fixed pricing is gone and everyone moves to usage-based billing.

Enterprise AI software in the US has gone up 20–37% in price over the past six months, per Tropic. Gartner is projecting $2.5 trillion in AI spending this year, up 69% — though only 28% of projects fully pay for themselves.

The new job on the market

A new role is emerging: LLM FinOps manager — a finance person whose budget unit isn't a headcount or a seat, but how much the model burned overnight. Success isn't measured by savings, but by demonstrated business value.

The era of free-for-all tokens is over — no more "let's give everyone unlimited access and see what happens." Now comes the era of quotas, limits, and a cost counter hovering over every engineer's shoulder.
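What a quota with a cost counter might look like in practice can be sketched in a few lines. This is a minimal illustration, not how any vendor or company mentioned here enforces limits: the class name, the per-engineer monthly cap, and the flat price per million tokens are all hypothetical.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a request would push an engineer past the monthly cap."""


@dataclass
class TokenBudget:
    monthly_limit_usd: float             # hypothetical cap per engineer
    price_per_million_tokens_usd: float  # hypothetical flat token price
    spent_usd: dict = field(default_factory=dict)  # engineer -> dollars spent

    def cost(self, tokens: int) -> float:
        """Dollar cost of a request at the assumed flat price."""
        return tokens / 1_000_000 * self.price_per_million_tokens_usd

    def charge(self, engineer: str, tokens: int) -> float:
        """Record usage and return the remaining budget in dollars.

        The request is rejected, and nothing is recorded, if it would
        exceed the engineer's monthly limit.
        """
        new_total = self.spent_usd.get(engineer, 0.0) + self.cost(tokens)
        if new_total > self.monthly_limit_usd:
            raise BudgetExceeded(
                f"{engineer}: ${new_total:.2f} would exceed "
                f"${self.monthly_limit_usd:.2f}"
            )
        self.spent_usd[engineer] = new_total
        return self.monthly_limit_usd - new_total


if __name__ == "__main__":
    budget = TokenBudget(monthly_limit_usd=100.0,
                         price_per_million_tokens_usd=10.0)
    print(budget.charge("alice", 5_000_000))  # 50.0 dollars remaining
    try:
        budget.charge("alice", 6_000_000)     # would total $110, over the cap
    except BudgetExceeded as err:
        print("blocked:", err)
```

The check happens before usage is recorded, so a rejected request leaves the counter unchanged. A real deployment would also need to handle monthly resets, concurrent requests, and per-model pricing.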