What DeepSeek’s 75% Price Cut Says About the Real Cost of AI

Last week’s question was what AI costs. This week’s answer came from Hangzhou. The Western reading of it deserves a careful look.

May 26, 2026

Last week I wrote about why your AI bill is about to look very different, arguing that GitHub’s repricing was the leading edge of a structural reset across the AI application economy. The piece focused on the demand side: what happens when flat-rate subsidies end and the meter starts running. Everyone, finally, sees what inference actually costs.

On Saturday, DeepSeek answered the same question from the other side of the meter.

The Hangzhou-based lab announced it is making permanent a 75% price cut on its flagship V4-Pro model, bringing API costs to between roughly $0.0035 and $0.87 per million tokens, depending on usage type. When V4 launched a month ago, DeepSeek said Pro pricing was up to 12 times higher than Flash because of “constraints in high-end compute capacity,” tied to limited supply of Huawei’s Ascend 950 — the domestic AI accelerator chip that Chinese labs use as a workaround for the Nvidia GPUs that US export controls have kept out of the country.

That gap just compressed. Using the new V4-Pro prices against DeepSeek’s V4-Flash rates, the multiplier has narrowed from 12x at launch to roughly 4x today. (My math, not theirs.)

For context, and per a Cryptopolitan breakdown of the comparison: Claude Opus 4.7 costs about $5 input and $25 output per million tokens. GPT-5.4 sits at $5 and $15. DeepSeek V4-Pro is now $0.44 input and $0.87 output. The output-cost gap alone is roughly 29x against Anthropic and 17x against OpenAI’s prior-generation model. An autonomous agent that costs a few hundred dollars a day to run on Claude Opus runs for under $40 on V4-Pro.
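For readers who want to check the arithmetic, here is a minimal sketch in Python. The per-million-token prices are the ones quoted above. The agent workload (40 million input tokens and 4 million output tokens a day) is my own hypothetical, picked only so the Claude Opus bill lands at "a few hundred dollars"; it is not a measured figure from any real deployment.

```python
# Per-million-token list prices in USD, as quoted in the paragraph above.
PRICES = {
    "Claude Opus 4.7": {"input": 5.00, "output": 25.00},
    "GPT-5.4": {"input": 5.00, "output": 15.00},
    "DeepSeek V4-Pro": {"input": 0.44, "output": 0.87},
}


def price_ratio(model: str, baseline: str, kind: str) -> float:
    """How many times more expensive `model` is than `baseline` for `kind` tokens."""
    return PRICES[model][kind] / PRICES[baseline][kind]


def daily_cost(model: str, input_millions: float, output_millions: float) -> float:
    """Daily bill in USD for a workload measured in millions of tokens."""
    p = PRICES[model]
    return p["input"] * input_millions + p["output"] * output_millions


# Hypothetical agent workload (an assumption, not a measurement).
INPUT_MILLIONS = 40
OUTPUT_MILLIONS = 4

if __name__ == "__main__":
    for model in ("Claude Opus 4.7", "GPT-5.4"):
        ratio = price_ratio(model, "DeepSeek V4-Pro", "output")
        print(f"{model} output price vs V4-Pro: {ratio:.1f}x")
    for model in PRICES:
        cost = daily_cost(model, INPUT_MILLIONS, OUTPUT_MILLIONS)
        print(f"{model}: ${cost:,.2f} per day")
```

Under that assumed workload the Opus bill is $300 a day and the V4-Pro bill is about $21. The exact figure moves with the input/output mix, because the input-price gap (about 11x) is much smaller than the output-price gap.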

The reading that’s circulating

A specific interpretation of this news has gained traction across business press and industry commentary: that DeepSeek is running a loss-leader strategy. Price below cost now, capture market share, raise prices once the user base is locked in. It’s the playbook attributed to Amazon in retail, to plenty of Silicon Valley unicorns in their growth phases, and increasingly to Chinese tech companies in their export pushes. The reading is sophisticated enough that it deserves to be taken seriously rather than dismissed.

Whether it actually fits DeepSeek is a question worth working through, especially when AI marketing, with its sound bites and narrative construction, is itself heavily shaping news coverage. The structural features of this specific case point in directions the standard template doesn’t quite capture.

What supports the loss-leader reading

There are real reasons the framing keeps surfacing.

DeepSeek is closing its first external funding round. The Financial Times and Bloomberg have reported that the lab is raising $3 billion to $4 billion at a valuation that has climbed from around $20 billion in late April to roughly $45 billion by mid-May. Most consequentially, the round is being led by the China Integrated Circuit Industry Investment Fund — known as the “Big Fund,” a state-backed vehicle established to advance China’s semiconductor self-sufficiency. This would be the Big Fund’s first investment in a Chinese AI model lab, which analysts read as Beijing treating frontier AI software and domestic chip production as a single strategic problem. Aggressive pricing is a growth story, and a growth story at this scale is what state-led funding rounds buy.

Anthropic and OpenAI have previously accused DeepSeek of training on Claude outputs through what they called “distillation attacks.” DeepSeek disputes the accusation, but if any part of it is true, some of the cost advantage would be coming from borrowing capability from competitors rather than from pure architectural efficiency.

DeepSeek did not explain the permanent price cut. The Reuters story notes the company declined to say whether it was tied to increased Ascend 950 supply; in other words, whether Huawei is finally able to ship enough of its domestic AI chips in volume to lower DeepSeek’s per-query inference costs, which was the reason DeepSeek itself gave at launch for the original Pro premium. The silence on this question leaves room for a strategic reading rather than a unit-economics one.

And then there’s the structural subsidy question. State-level support for Chinese AI champions, combined with access to domestic compute that doesn’t price against Nvidia retail markups, plausibly allows margin absorption that Western labs can’t match. The Big Fund leading this round makes that less hypothetical than it would have been a month ago.

These are not nothing.

What pushes against it

The pattern of price cuts predates anything that looks like a market-share play. According to a published history of DeepSeek’s roadmap, each model generation since V2 in May 2024 has shipped a specific efficiency improvement that translated more or less directly into a lower per-token price. V2 introduced an attention mechanism that compressed memory usage during inference. V3.2-Exp in late 2025 introduced sparse attention and cut prices in half. V4 introduces what DeepSeek calls Hybrid Attention Architecture. The technical specifics are less important than the pattern: pricing tracks engineering improvements rather than the funding cycle. Founder Liang Wenfeng has said the team was surprised by the market’s price sensitivity when V2 launched — they were pricing against actual costs, not strategically.

The weights are open. V3 was released under DeepSeek’s own license, and R1, V3.2-Exp, and V4-Pro were all released under the MIT License, one of the most permissive open-source licenses, which lets anyone download, modify, run, and even commercially deploy the model with essentially no restrictions. This is the structural fact the standard loss-leader template doesn’t accommodate.

Loss-leading works when users get locked into the platform through switching costs, ecosystem dependencies, or proprietary data. None of that applies to a model anyone can download from Hugging Face and run on their own infrastructure. If DeepSeek raises prices later, sophisticated users self-host or move to one of the dozens of third-party providers already hosting earlier DeepSeek models at competitive prices. The classical moat isn’t there.

The V2 precedent is worth dwelling on. When DeepSeek aggressively priced V2 two years ago, the structural outcome wasn’t market capture. It was commoditization. The Chinese AI market got dramatically cheaper, the Western labs eventually had to follow, and DeepSeek’s enterprise market share in the West remains at roughly 1% on the most recent Menlo Ventures data, against Anthropic at 32% and OpenAI at 25%. If V2 was the opening move of a lock-in strategy, the strategy has so far produced a price war without producing the capture.

And then there’s the unusual corporate structure. DeepSeek’s parent is High-Flyer, a quantitative hedge fund that has financed the lab entirely off its own balance sheet since 2023. The lab is organized around small project teams rather than the traditional management hierarchies of Chinese tech giants, and it has explicitly described itself as research-first.

The geometry of “burn capital now to monetize later” assumes a capital structure built around eventual monetization at scale. A hedge-fund-financed research lab that publishes open weights and prices against its actual unit costs has been operating in a different incentive regime than OpenAI or Anthropic. The state-led funding round now in progress will change that structure to some degree, which is part of why the loss-leader question is harder to dismiss than it would have been six months ago.

What this leaves us with

My read is that some loss-leader dynamics are probably present: the timing of the permanent cut, the state-led funding round, the company’s silence on whether Huawei’s Ascend chip supply has actually scaled, the convenient alignment with a valuation push. The capital structure is shifting in ways that make the strategic reading more plausible than it was for V2 or R1.

But the structural conditions that make loss-leading work as a long-term strategy, namely proprietary lock-in, switching costs, and captured ecosystems, aren’t there in the way the standard template assumes. So even if DeepSeek wants to run that play, the game board doesn’t support it the way it would for a Western SaaS company. The technical openness and the financial closing-in are pulling in different directions.

The other thing worth noting is what the framing reveals about how we read these stories. Western business commentary defaults to the templates it knows: predatory pricing, growth-stage burn, eventual monetization. Those templates work well for the companies they were built around. They work less well for an open-weight research lab inside a hedge fund operating with structural compute advantages and a multi-year pattern of architectural cost reduction, even one now taking state capital. Reaching for the familiar frame isn’t laziness; it’s what happens when the available reference patterns don’t quite fit the case.

For the metering argument from last week, none of this changes the basic claim that enterprise AI bills are about to look very different. What it changes is the floor those bills are converging toward. A Western enterprise buying Claude or GPT for compliance, latency, and procurement reasons will keep buying Claude or GPT. But the anchor point in every pricing conversation just moved. The repricing notices going out from GitHub, the metered shift Nadella described as inevitable across every per-user business, all of that lands differently when the buyer can point to V4-Pro and ask what the Western markup is paying for.

The meter and the floor are both legibility events. We’re going to find out what AI actually costs, from both directions, faster than the industry consensus assumed. That number is going to be different in different places. The gap between those numbers is going to be one of the more important things in the AI economy.

A note on “multimodal”

One of the things I wanted to settle in my own mind before writing any of this was whether a specific criticism of V4 that has circulated in recent weeks, that DeepSeek isn’t really a peer to GPT-5.5 or Claude Opus 4.7 because it isn’t multimodal, actually holds up.

“Multimodal” is one of those terms that does a lot of work without explaining itself. It just means a model that can take in more than one kind of input: text plus images, mostly, with some video on certain Chinese consumer deployments. A user uploading a screenshot for analysis, a developer pointing a model at a UI mockup, a document workflow pulling from scanned PDFs: those are all multimodal tasks. The capability matters because a lot of real-world enterprise work, such as document processing, quality control, and accessibility audits, requires a model that can see.

The criticism that V4 doesn’t do this turns out to be reading from notes that are roughly six months old. V4 ships with a three-mode architecture (Fast, Expert, and Vision), and Vision is native rather than bolted on. The lineage goes back to DeepSeek-VL2 in December 2024 and the DeepSeek OCR paper in October 2025. One independent analysis found the V4 vision system uses about a tenth of the inference cache of Claude’s vision processing per image, which is itself an efficiency story rather than a capability gap.

The point here isn’t to defend V4’s capability against every benchmark. It trails the absolute frontier on some agentic coding and graduate-level reasoning tasks, and DeepSeek itself says it’s about three to six months behind. The point is that the specific framings circulating in Western coverage tend to lag the actual product, in directions that make the standard skeptical reading easier to sustain than the facts justify. That is something worth keeping in mind.

Leandro Oliva
