Why Your AI Bill Is About to Look Very Different

The flat-rate subscription was the customer acquisition phase. What comes next is the bill.

May 25, 2026

The shock of the pivot to metered usage

In May, a chart from GitHub’s own Copilot pricing calculator went viral on LinkedIn. Posted by founder Joel Griffith from a Reddit user’s original screenshot, it showed $39 in current billing under the existing premium request pricing, against $5,851.77 projected under GitHub’s new metered system taking effect June 1.

A 150x increase, give or take.

The post and many like it went viral inside the developer community for the obvious reason: “Absolute insanity. None of this seems sustainable.”

Suffice it to say, the move to metered is significant, but the framing of “insanity” or “greed” shouldn’t lead away from the whole story. The pricing didn’t suddenly go up 150x, but rather the meter changed, from a flat per-request count to actual token consumption, and the previous meter was massively under-counting what users were burning through. GitHub wasn’t gouging. GitHub was, finally, charging something closer to what the service costs to run. (With some mitigation to ease the pain.)

Still, the online chatter largely presented an outlier user, the kind running multi-hour autonomous coding sessions on the most expensive frontier model available. But the same pattern is showing up at the enterprise level, with smaller multipliers but much larger absolute numbers.

Olli-Pekka Heinisuo, co-founder at GitHits, posted his own comparison: his team’s April usage cost $306 under PRUs (the flat “premium request” units of the old system) and would have cost $3,062 under AICs (the new “AI Credits” that meter actual token consumption). A 10x increase, not 150x, but on real enterprise usage from “a handful of users.” His projection for June, if his team continued using Copilot the way they had been, was a $10,000 to $20,000 monthly bill. He concluded the obvious: “after April, that no longer really makes sense.”

That’s the actual shape of the shift. Heavy individual users see catastrophic repricing. Enterprise teams see merely punishing repricing. And in both cases, the response (unless you are looking to token max and burn money) isn’t going to be paying the new bill. It’s going to be either dramatically reducing usage, switching tools, or routing around Copilot to the underlying model APIs directly. Heinisuo’s conclusion, that Copilot’s intermediary role no longer makes sense after April, is exactly the unbundling that the metering shift is going to trigger across the industry.

That distinction matters because what’s happening at GitHub is not a Copilot story. It’s the leading edge of a structural reset across the entire AI application economy. The era of all-you-can-eat AI subscriptions is ending, and the bill that arrives next year is going to look almost nothing like the bill that arrived this year.

I want to walk through what’s actually going on, why it was inevitable, and what it means for anyone who uses these tools, builds with them, or holds stock in the companies making them. I’ll also tie this to a research paper I wrote in March on Microsoft Copilot’s “super-appification,” which identified the precise structural tension now playing out in public.

If you want to skip around: the next section explains the mechanics. The one after that is the political economy. Then there’s a section on what this means for the competitive landscape, including OpenAI, Anthropic, and the pure application-layer companies like Cursor. The final section is on what individual investors and tech workers should actually take from this.

What changed: from requests to tokens

To understand the magnitude of the shift, it helps to understand what the old meter measured versus the new one.

Under the legacy GitHub Copilot system, you paid for “premium requests” (PRUs). A PRU was a unit of work, priced at $0.04, with each subscription tier carrying a monthly allotment. Pro got 300, Pro+ got 1,500. The catch was that a “request” was a flat unit regardless of what it actually cost GitHub to fulfill. A single autocomplete suggestion was a request. A multi-hour autonomous agent session that read fifty files, ran tests, debugged failures, and wrote a pull request was, depending on how it was structured, also one request, or maybe a handful.

This was fine when Copilot was an autocomplete tool. It became absurd when Copilot became an agent.

The new system meters AI Credits (AICs), priced at $0.01 each, against actual token consumption: input tokens, output tokens, reasoning tokens, tool calls, the whole stack. An autocomplete still costs almost nothing. An agent session that previously registered as one PRU might now meter as 5,000 or 50,000 AICs depending on which model it called and how many round trips it made. GitHub’s own calculator confirms the math: the same workload that registered as $39 of PRU-based billing in April reprices as $5,851 of AIC-based billing.

The Copilot CPO, Mario Rodriguez, said the quiet part out loud in the announcement: “GitHub has absorbed much of the escalating inference cost behind that usage, but the current premium request model is no longer sustainable.”

“Absorbed” is certainly doing a lot of work in that sentence. What it means in practice is that Microsoft, GitHub’s parent company, was carrying the gap between subscription revenue and actual GPU consumption on its consolidated balance sheet. As long as Copilot was selling autocomplete, the gap was manageable. Once it started selling agents that could spend hours of Claude Opus 4.7 inference time on a single task, the gap became a structural loss center.

The new metering doesn’t just close that gap. It also makes the cost of agentic AI legible to the buyer for the first time. Under flat-rate pricing, you didn’t know that your morning of vibe coding cost $300 in compute, because the bill was always $39. Under metered pricing, you watch the meter run. (That may dent the magic of vibe coding somewhat.)

That legibility is, I think, the most underrated consequence of this shift. We’re about to find out what AI actually costs. Compounded with research that points to frequent failures with internal AI pilot projects (a non-technical employee is suddenly tasked with building a complex retrieval-augmented generation system, for example) the landscape seems ripe for conflicting, and ultimately misleading signals.

Why the gap existed in the first place

The flat-rate model was never built for inference economics. It was inherited from the SaaS playbook of the last twenty years, which was built on a foundation that simply no longer applies.

In classic SaaS, the marginal cost of serving one more user is essentially zero. The same Salesforce instance, the same Gmail backend, the same Slack workspace serves a hundred thousand users or a hundred million with the same underlying infrastructure cost per user, more or less. Seat pricing made sense because seats were the thing that varied while costs stayed flat.

Generative AI breaks this. Every prompt is a discrete physical event that consumes GPU cycles, electricity, cooling, network capacity. The marginal cost of an additional user is not zero. The marginal cost of an additional user running an agent is, depending on the model and the workload, somewhere between meaningful and ruinous. Anthropic’s Claude Opus 4.7, for instance, can burn through hundreds of dollars of compute on a single multi-hour coding session.

That’s a 25x loss per heavy user. And those are precisely the users the entire industry has been telling you to become. “Agentic” workflows, “autonomous coding,” “AI teammates,” all of the marketing language of the past eighteen months has been pushing users toward exactly the consumption patterns that make flat-rate pricing impossible.

Why did the companies do this anyway? Two primary reasons, I reckon.

The first is competitive. Once one major player offered flat-rate access to frontier models, everyone had to. Cursor priced aggressively because it was trying to take share from GitHub Copilot. GitHub priced aggressively because it was trying to defend against Cursor. Anthropic and OpenAI kept Pro tiers cheap because they were trying to win consumer mindshare against each other. The pricing was set by competitive dynamics, not by unit economics.

The second is strategic. Capturing developer mindshare during the inflection moment of generative AI was worth burning capital. Microsoft could afford to lose money on Copilot for years if it meant locking in the position as the default AI coding tool by the time the market matured. Anthropic could afford to subsidize Claude Code as long as it was the leading edge of agentic adoption. The flat-rate phase was, explicitly, customer acquisition funded by deep pockets and venture money.

The strategic phase ends when the bill comes due. For Microsoft, the bill is the $34.9 billion in quarterly capital expenditure required to build “planet-scale cloud and AI factories,” which Amy Hood justified to investors on the Q1 FY26 earnings call. That spending needs returns, and returns require monetization that scales with consumption rather than seats.

This is where I want to pull in some research I did earlier this year, because the tension I just described was already visible if you knew where to look.

The whitespace problem

In March, I wrote a research paper on Microsoft Copilot’s evolution from a chatbot into a “super app,” contrasting how the product was marketed to consumers against how it was sold to investors. The thesis, in one sentence: Microsoft was telling consumers and investors radically different stories about Copilot, and the gap between those stories was becoming structurally untenable.

On the consumer side, Copilot was marketed as a frictionless, personalized digital companion, perfect for brainstorming, quick questions, or “just venting.” The App Store descriptions emphasized simplicity, intimacy, ease of use. The massive cloud infrastructure powering it was deliberately obscured.

On the investor side, the same product was framed as the demand-generation layer for Azure’s $34.9B quarterly capex. The operative metric in Microsoft’s internal Partner Center documentation was “whitespace,” which is worth pausing on because it’s the term that ties this whole shift together.

Whitespace refers to the gap between users who have access to a product and users who actually pay for it. For Copilot, the math was striking: 450 million commercial users with some form of access, 3.3% paid adoption, a 435 million seat gap that Microsoft executives discussed openly as “Free to Paid Whitespace” on the investor dashboard. Every quarter, Amy Hood faced shareholders who wanted to know when that whitespace would close, when those 435 million users would start generating revenue commensurate with the $34.9 billion being spent each quarter to build the AI infrastructure they were running on.

The paper’s central finding was that this dual rhetoric couldn’t hold. The consumer story sold simplicity. The investor story demanded monetization. Something had to give. I documented the April 15, 2026 paywall shift, when Microsoft severely limited free Copilot access inside Word, Excel, and PowerPoint, as the moment when the financial mandate started bleeding through into the user experience.

What I didn’t fully anticipate, writing in March, was how quickly the same logic would spread to GitHub Copilot’s developer product, and from there to the broader industry. The April paywall was Microsoft saying “free users will become paid seats.” The June AIC system is Microsoft saying “paid seats will become metered consumers.” Both moves are downstream of the same whitespace pressure. The whitespace didn’t close fast enough through seat conversion, so Microsoft is closing it through metering, which has the additional benefit of having no per-user ceiling. A heavy Copilot user can now generate $5,000 of revenue instead of $39, without any change in the underlying product.

On the Q3 FY26 earnings call, Satya Nadella made the strategy explicit: “Any per user business of ours, whether it’s productivity or coding or security, will become a per user and usage business.” That’s the policy version of the structural argument my paper was making. The entire seat-based SaaS model is being publicly retired by the largest enterprise software company in the world.

One last thread from the paper is worth pulling forward, because it explains how the metering layer becomes the new rent extraction point. Platforms increasingly decentralize at the application layer, inviting third-party plugins, agents, and integrations into their ecosystem, while centralizing at the infrastructure layer. The AIC system fits this pattern precisely. GitHub now lets you choose Claude, GPT, Gemini, or open models from inside Copilot. Decentralization at the model layer. But GitHub controls the metering, the multipliers, the AIC pricing, and the tollbooth at which every model call passes through. Centralization at the rent-extraction layer.

The point I want to make here is just that the GitHub repricing isn’t a surprise event. It’s the predictable next step in a pattern that’s been building for at least a year, driven by a whitespace gap that seat-based monetization could no longer close.

The competitive shakeout

The metered shift hits different parts of the AI stack very differently. This is where it matters for anyone watching the industry as either a user or an investor.

Vertically integrated players (Microsoft, Google, and to a growing extent Anthropic) come out ahead. They own multiple layers of the stack: infrastructure, models, applications, distribution. When metering reveals the true cost of inference, they can absorb it internally or pass it through with margin intact. Microsoft especially benefits because metering reframes Azure from a cost center supporting Copilot into the visible revenue source for AI consumption. The Nadella quote isn’t just about Copilot. It’s about restructuring every Microsoft product so that Azure consumption shows up explicitly on the bill.

Pure model providers (OpenAI most acutely, also Mistral, Cohere) are in a more complicated position. They own the model but not the infrastructure. OpenAI’s deepest cloud and IP entanglements remain with Microsoft, even as Microsoft increasingly competes with OpenAI directly through its own MAI models and through Copilot’s model-routing flexibility. The metering shift forces OpenAI to either renegotiate that relationship, diversify infrastructure (which they’re attempting through Oracle, CoreWeave, and the Stargate project), or watch Microsoft capture more of the value chain.

Anthropic, by contrast, has spent the past year building infrastructure optionality. The recent xAI deal, in which Anthropic effectively leased Elon Musk’s entire Colossus 1 data center in Tennessee, gave them 300 megawatts of immediately-available compute and 220,000 Nvidia GPUs. That’s on top of a 5 GW commitment from Amazon, a 5 GW commitment from Google, and a Broadcom partnership for custom silicon. Anthropic is now structurally less dependent on any single hyperscaler than OpenAI is, which gives them more pricing flexibility and more capacity to grow into the metered era without supply constraints. Their $30B annualized run rate, up from $9B at the end of 2025, suggests this is working.

Application-layer wrappers (Cursor, Windsurf, Cline, the broader category of “Claude-powered” and “GPT-powered” tools) are in the worst position. They own nothing structural. They pay Anthropic or OpenAI near-retail API rates and resell access wrapped in a nice IDE. The flat-rate model was their entire moat against the model providers and the hyperscalers. The metered shift removes that moat. The $20/month Cursor plan against $5,000 of Claude consumption is not a pricing strategy, it’s a venture-subsidized customer acquisition play with no clear path to unit economics. Watch for one of these companies to be acquired by a vertically integrated player within the next 18 months, or to dramatically reprice and lose users.

There’s also a reported $60B option for SpaceX to acquire Cursor by late 2026, which I’d read less as a vote of confidence in Cursor than as Elon Musk hedging against xAI’s competitive failure. Either way, the application wrapper category is consolidating, and metering is the proximate cause.

What this means if you use these tools

The simplest version: budget for usage, not seats. The $20/month subscription you’re paying now is going to be replaced, on a timeline measured in months rather than years, by metered billing where heavy users pay hundreds or thousands of dollars and light users pay less than they do now.

If you’re a developer using AI coding tools daily, you should expect:

Your tools will start showing you a meter. GitHub’s AIC system, Anthropic’s usage dashboards, OpenAI’s credit tracking. The legibility shift is happening fast.

Heavy agentic workflows will become noticeably expensive. The same task that “used to be free” will start showing a cost. This isn’t a bug, it’s the entire point.

Model choice will become a cost decision. If Claude Opus 4.7 costs 27 AICs per request and GPT-5.4 costs 6, you’ll start routing simpler tasks to cheaper models. Tool vendors will help you do this automatically. The “always use the best model” reflex is about to become unaffordable.

Enterprise contracts will start looking different. The flat-rate seat license is being replaced by a hybrid: base seat fee plus metered consumption pool plus overage rates. Procurement teams who haven’t dealt with consumption-based AI billing will get an unpleasant education.

If you’re not a developer but you use AI tools through M365, ChatGPT, Claude, or Google’s products: the same shift is coming, just slower. Nadella’s “every per user business” statement is the timeline. Expect your AI features to start having explicit limits, expect those limits to become metered, and expect the bill to start reflecting your actual usage rather than a flat fee that hides it. Is that an apocalyptic scenario? Well, it depends on how one looks at it. For retail users, one might argue it’s more akin to the end of using a large Hadron Collider to vibe code a daily habits app.

How bad is it actually

A few moderating observations are worth making, because the posts on LinkedIn and on dev Reddit driving the public reaction are unrepresentative in specific ways.

The user staring at a $5,851 projected bill was running an outlier workload, almost certainly long autonomous coding sessions on the most expensive frontier model available, probably overnight, probably with redundant tool calls. That’s not the median Copilot user. That’s a user the subsidy was bleeding hardest to support, and arguably should have been priced differently from the start. Heinisuo’s $306-to-$3,062 case is more representative of a sophisticated enterprise team but still selected, his company builds coding agents, so his developers’ usage profile is by definition heavier than a normal enterprise dev team’s.

A median Microsoft enterprise customer using Copilot for autocomplete, occasional chat, and the occasional agent run is likely to see a much more modest increase, maybe 1.5x to 3x, possibly absorbed within existing AIC allotments at higher tiers. GitHub has telemetry on every user’s actual consumption. The new pricing wasn’t set in a vacuum, it was modeled against existing usage distributions to capture revenue from heavy users without driving away the median. The $5,851 user was, from GitHub’s perspective, the intended casualty.

There’s a related dynamic worth flagging. Research on internal enterprise AI pilot projects shows high failure rates, with non-technical employees being tasked to build complex systems (retrieval-augmented generation, agent orchestration, custom fine-tunes) that they aren’t equipped to design or maintain. A lot of the consumption that was happening under flat-rate subsidy was, frankly, badly-designed projects burning compute on unsuccessful pilots. The metering shift exposes that cost at the exact moment enterprises are realizing many of their AI initiatives aren’t generating returns. Expect a wave of “AI is too expensive” conclusions over the next year that are actually conclusions about implementation quality rather than about the tools themselves.

The longer arc also matters. Token costs on the supplier side are still falling. Anthropic’s Sonnet, for example, is dramatically cheaper per token than Claude 3 Opus was 18 months ago, and the trajectory continues. If model efficiency improves 2–3x per year and GPU economics improve another 1.5–2x annually through Blackwell, custom silicon, and inference optimization, the absolute cost of a typical agent run two years from now could be a fraction of what it is today. The metering shift is probably better understood as a one-time legibility shock followed by gradual price normalization, not a permanent step-change in what AI costs.

So if you’re asking whether the median AI tool user is about to face $5,000 monthly bills, the answer is no… probably not. The median will normalize. The heavy edge users will either pay up or change their workflows.

What is permanent, though, is the control shift that the metering establishes. Once GitHub has built the AIC infrastructure, the AIC-to-dollar conversion is a lever they can pull at any time. Today it’s $0.01 per AIC. They can move it to $0.012 quietly, adjust multipliers on specific models, change the included allotments per tier, and most users won’t notice until their bill arrives. Seat pricing was sticky and visible. Metered pricing is variable and opaque. That’s a durable shift in power between the platform and its users that doesn’t depend on where any specific price point lands.

The other durable consequence is the unbundling pressure Heinisuo’s post hinted at. The flat-rate era hid the fact that GitHub was charging a markup to route you to Anthropic and OpenAI. The metered era exposes it. Sophisticated enterprise buyers seeing $3,000 monthly Copilot bills will start asking why they don’t just pay Anthropic $3,000 directly. Anthropic’s enterprise sales team is having exactly that conversation right now with companies receiving repricing notices. So even if GitHub’s price levels turn out to be reasonable, the visibility itself accelerates a competitive dynamic that pulls revenue away from the application layer and toward the model providers and infrastructure owners.

The honest read, then, is that the metered era is not the end of accessible AI tools, but it is the end of unconscious AI consumption. From now on you’ll know what you’re using and what it costs, and the companies that own the meter will know it too, in much finer-grained detail than they did before.

What this means if you hold the stocks

The metered shift has clear implications for AI-exposed equities depending on where the company sits in the stack. This is not an exhaustive market analysis, but rather my contribution in this specific context.

The vertically integrated incumbents (Microsoft, Google, Amazon) are positioned to extract more revenue per user as metering takes hold, because they own the infrastructure layer where the meters live. Their margins on AI services should improve as the subsidies wind down. The risk is regulatory, given that the metering layer is also where the rent-extraction concentration becomes legible to antitrust authorities.

Anthropic isn’t public, but its position is interesting for anyone watching Amazon, Google, and (now) SpaceX’s AI strategies. Anthropic’s compute diversification reduces its dependence on any single hyperscaler, which means the value it generates flows to multiple infrastructure providers rather than enriching one platform. Amazon and Google both get to claim Anthropic as an anchor tenant for their AI infrastructure investments. The xAI deal adds a third beneficiary: SpaceX, which is reportedly absorbing xAI ahead of a public offering and badly needs Colossus 1 to show revenue rather than just GPU count. An Anthropic anchor tenant turns that asset from a money-burning training cluster into a cash-flowing inference business, which materially improves the story SpaceX will tell its S-1 readers.

OpenAI’s current position seems the most precarious of the model providers. The Microsoft relationship is increasingly adversarial in ways that the financial structure can’t fully accommodate, and OpenAI’s effort to diversify infrastructure (Oracle, Stargate, CoreWeave) is happening under time pressure. Watch for a major restructuring of the Microsoft-OpenAI relationship within the next 12–18 months, possibly in the form of OpenAI buying out parts of the original investment or Microsoft monetizing its equity at a peak valuation.

The pure application-layer category (Cursor and others, which are private but proxied through valuations and venture market sentiment) is facing a margin compression event that no amount of growth can fix. The companies that survive will do so by being acquired into a vertically integrated stack or by integrating downward themselves, building models, fine-tuning, or owning infrastructure. The companies that don’t will quietly shut down or merge.

For the broader market, the metered shift is the structural mechanism by which the $2.9 trillion in projected global data center construction by 2028 (Morgan Stanley’s estimate) gets monetized. The capital expenditure cycle that’s been driving Microsoft, Google, Amazon, Meta, and Oracle through 2025 and 2026 only generates returns if metered AI consumption scales to fill the capacity. The flat-rate phase was a bet that demand would eventually justify the build. The metered phase is the mechanism for actually capturing that demand as revenue.

If you believe agentic AI is going to be a major workflow shift across the global economy, the metered companies are well-positioned. If you don’t, the capex bill comes due against insufficient revenue, and the equities that are most exposed are the ones whose stories most depend on continued AI revenue acceleration.

The flat-rate moment was always going to end

Consider that unlimited mobile data was a beautiful era. For a few years in the 2010s, you paid your carrier a flat fee and you used as much data as you wanted. The carriers absorbed the cost because they were fighting for subscribers and the marginal cost of bandwidth was still falling. Then the cost stopped falling, or rose, depending on the metric you used, and the unlimited plans got capped, throttled, tiered, and eventually replaced by usage-based billing dressed up as “tiered unlimited” with caps on HD video and other bandwidth sucking activities.

The flat-rate AI subscription is in roughly the same place mobile data was in 2014. The economics worked while the underlying cost was being subsidized by competitive pressure and venture capital and hyperscaler ambition. When those subsidies end, the meter starts running.

What’s distinct about this moment versus the mobile data shift is the speed of the transition. Mobile data took six or seven years to move from unlimited to metered. The AI flat-rate era launched in 2022 with ChatGPT Plus, peaked in 2024 with Copilot and Cursor, and is being structurally dismantled in 2026. Three and a half years from start to repricing.

That speed is itself a tell. The unit economics never worked. The subsidies were holding back a much larger cost than the consumer experience suggested. And the moment when the largest companies in the industry, Microsoft, Anthropic, OpenAI, all begin shifting in the same direction within weeks of each other, is the moment when you can stop calling it a pricing change and start calling it a structural reset.

The shock isn’t that GitHub repriced. The shock is what the new price reveals, which is partly real and partly designed. The metered calculator that shows you a $5,851 projected bill also shows you an upgrade prompt for a higher tier that would absorb most of it. The subsidy isn’t ending so much as being repackaged as a premium feature, where what you’re paying for is the comfort of not watching the meter run. That’s the durable shift. From now on, the question isn’t what AI costs. The question is what your provider has decided to let you see of what it costs.

Thanks for reading! This post is public so feel free to share it.

My March 2026 paper “Selling the AI Super App: A Historical Analysis of Microsoft Copilot’s Rhetoric Across App Stores and Financial Disclosures” underlies much of the structural argument in this post.

Leandro Oliva

Discussion about this post

Ready for more?