TMTPOST -- AI suddenly got more expensive.
In March 2026, prices for some AI models on Tencent Cloud rose by more than 400%. Alibaba Cloud also raised the prices of AI services for certain models on its Bailian platform multiple times within a single month.
A friend of mine who works in content did the math. He uses API calls to a large model to batch-generate a week’s worth of short-video scripts (about 50). Billed by token usage, that cost about RMB 15 three months ago. After many AI relay hubs were shut down in March, and as major cloud vendors raised model prices one after another, the same workload now costs close to RMB 60, roughly four times as much.
At the beginning of 2024, China’s daily token usage averaged 100 billion. By March 2026, that number had reached 140 trillion.
In two years, that’s growth of more than 1,000x.
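The growth figure can be checked with a few lines of arithmetic. This is a minimal sketch using only the two numbers quoted above; the per-year multiplier assumes the growth was spread evenly across the two years, which is a simplification and not something the figures themselves establish.

```python
# Daily token figures quoted in the article.
daily_tokens_early_2024 = 100 * 10**9    # 100 billion
daily_tokens_march_2026 = 140 * 10**12   # 140 trillion

# Overall growth factor across the period.
growth = daily_tokens_march_2026 / daily_tokens_early_2024

# Assumption: growth spread evenly over two years, so take the square root.
annual_multiplier = growth ** 0.5

print(f"{growth:,.0f}x overall, about {annual_multiplier:.1f}x per year")
```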
What is a token?
If you ask Qwen, “What’s the weather like in Shenzhen today?” that’s roughly 200 tokens behind the scenes (including the question and the reply). If you ask DeepSeek to write you a 5,000-character project plan, that’s roughly 8,000 to 10,000 tokens. Tokens are the basic units AI uses to process information—and the unit used to bill compute. You can think of them as AI’s word-and-character counter: the more you use, the more compute you burn.
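To see how token counts turn into a bill, here is a rough sketch. The token counts come from the two examples above; the price per million tokens is a hypothetical placeholder, and real vendors usually price input and output tokens separately, which this sketch ignores.

```python
def estimate_cost_rmb(total_tokens: int, price_per_million_rmb: float) -> float:
    """Cost of a request billed purely by total token count (input + output)."""
    return total_tokens / 1_000_000 * price_per_million_rmb


# Hypothetical price, chosen only to make the arithmetic visible.
HYPOTHETICAL_PRICE_PER_MILLION_RMB = 10.0

# Token counts taken from the two examples in the text (upper estimate for the plan).
examples = {
    "weather question": 200,
    "5,000-character project plan": 10_000,
}

for name, tokens in examples.items():
    cost = estimate_cost_rmb(tokens, HYPOTHETICAL_PRICE_PER_MILLION_RMB)
    print(f"{name}: {tokens} tokens -> RMB {cost:.4f}")
```

Each call costs a fraction of a yuan at this placeholder price, which is why price changes only become visible once usage is batched or automated.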
So if AI has suddenly gotten more expensive, is it just vendors trying to raise prices?
Behind the Price Hikes
JPMorgan projected that by 2030, China’s AI inference token consumption would be about 370 times higher than in 2025. Goldman Sachs projected that over the same period, global token consumption would increase 24-fold.
The numbers differ, but both point the same way: AI’s compute consumption is on an exponential growth path.
The reason, ultimately, is that the way people use AI has changed.
In the past, we typically asked AI a question, it answered with a few dozen tokens, and that was the end of it. But now, more and more people are asking AI to “do things for me.”
For example: “Help me plan a five-day trip, including flights, hotel recommendations, sightseeing routes, and food recommendations.” AI can’t do that in one shot. It needs multiple rounds of reasoning, calls to multiple tools, and step-by-step execution. In the industry, this is called an Agent. A single Agent task can consume 100 to 1,000 times as many tokens as a normal Q&A.
From Q&A to execution, compute consumption differs by two to three orders of magnitude.
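A quick sketch of what that multiplier means in tokens. It assumes the roughly 200-token weather question from earlier in the article as the baseline Q&A; the function name is made up for this illustration.

```python
import math


def agent_token_range(qa_tokens: int, low: int = 100, high: int = 1000) -> tuple[int, int]:
    """Token range for one Agent task, given a baseline Q&A and the quoted multipliers."""
    return qa_tokens * low, qa_tokens * high


QA_TOKENS = 200  # the weather-question example, used here as an assumed baseline

low_tokens, high_tokens = agent_token_range(QA_TOKENS)
print(f"One Agent task: {low_tokens:,} to {high_tokens:,} tokens")

# Express each multiplier as orders of magnitude (powers of ten).
for multiplier in (100, 1000):
    print(f"{multiplier}x = {math.log10(multiplier):.0f} orders of magnitude")
```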
On the supply side, AI infrastructure spending by leading cloud providers such as ByteDance, Alibaba, and Tencent surged. From 2024 to 2025, spending on core AI-chip components more than doubled from $22 billion to $52 billion. High Bandwidth Memory (HBM) rose from about 50% of a GPU’s bill of materials to roughly two-thirds. Chips are expensive, memory is expensive, and electricity is expensive—driving an across-the-board increase in computing costs.
So price hikes aren’t really a choice for vendors; demand has been too strong, and capacity simply can’t keep up.
This is a lot like mobile communications around 2010. Smartphones took off, but there weren’t enough base stations, so data was costly. Only after 4G and 5G were built out at scale did data get cheaper. AI is now at that same turning point.
Only this time, the ones building the “AI base stations” aren’t the three major telecom carriers, but a brand-new piece of infrastructure China has been rolling out—an integrated computing-power network.
The Black Hole of Computing Power
When people hear “companies using AI,” many instinctively picture a boss who doesn’t understand AI, but still demands that employees use it to review reports and write plans.
That’s thinking too small. The biggest chunk of AI’s computing consumption isn’t there at all.
Manufacturing. Companies like Foxconn and CATL use AI for visual quality inspection: cameras capture product images, and AI determines in real time whether each item is a pass or a defect. With just one production line running around the clock, millions of product images have to be processed every day, which means a continuous draw on compute.
Programmers. Tongyi Lingma and GitHub Copilot help developers write code and find bugs. In China, several million developers already use AI-assisted programming every day, and every request consumes computing power.
The content industry. AI is used to generate copy, video scripts, and product descriptions. A significant share of the copy behind the short videos you scroll through every day involves AI.
Healthcare. AI reads CT scans and detects early-stage lung cancer, and in some scenarios its accuracy has come close to—or even surpassed—human interpretation.
Scientific research. AI-assisted drug discovery has dramatically shortened the path from target identification to molecular screening, a stage that takes years in traditional workflows.
Agriculture. Satellite remote sensing combined with AI analysis can determine which plots lack water or have pest infestations, precisely guiding farmers on fertilization and pesticide application.
All of these scenarios share one thing in common: they consume massive amounts of computing power. And they’re not games played by a select few—manufacturing inspection affects the quality of every product, medical AI affects everyone’s health, and agricultural AI affects crop yields.
When computing power is in short supply, the pressure to raise prices won’t stop with cloud providers. It will move step by step along the chain, eventually reaching every ordinary person.
How do you solve it? China’s answer is to build a “high-speed rail network for computing power.”
The Responsibility of a Single Network
Water networks, next-generation power grids, computing power networks, next-generation communications networks, urban underground utility pipeline networks, and logistics networks—these are the “six networks” that the “15th Five-Year Plan” explicitly set out to build.
In essence, they are new types of infrastructure. The logic is the same as the large-scale push to build high-speed rail and 4G networks 20 years ago: the state invests first to get them built, driving down costs and barriers to entry so that the whole society benefits.
Why can a computing power network be placed alongside the power grid and the water network?
Because computing power has become the “electricity” of the AI era.
A hundred years ago, each factory bought its own generator and produced its own electricity. Later, once the national grid was in place, factories could simply plug in and use power—no need to generate it themselves.
A computing power network follows the same logic. Today, each company buys its own GPUs and builds its own server rooms, which means high costs and low utilization. Once the computing power network is in place, companies can purchase compute on demand, just like buying electricity: pay for what you use, without having to build their own “power plant.”
So how are workloads dispatched across this network? The core logic is known as “Eastern Data, Western Computing.”
China has planned eight major national computing hub nodes. Four are in the east—Beijing-Tianjin-Hebei, the Yangtze River Delta, the Guangdong–Hong Kong–Macao Greater Bay Area, and Chengdu–Chongqing—where users are concentrated and millisecond-level response is required. Four are in the west—Inner Mongolia, Guizhou, Gansu, and Ningxia—where green electricity is cheaper and land is plentiful.
AI inference workloads that require real-time response stay in the east. Less time-sensitive AI training workloads are dispatched to the west, running on cheaper green power.
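The routing rule described above can be sketched as a toy function. This is only an illustration of the idea, not how any real scheduler works; the `Job` type, its fields, and the region labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    kind: str                # "inference" or "training" (illustrative labels)
    latency_sensitive: bool  # does the user need a real-time response?


def pick_region(job: Job) -> str:
    """Toy rule: real-time work stays near users in the east, the rest goes west."""
    return "east" if job.latency_sensitive else "west"


jobs = [
    Job("chat reply", "inference", latency_sensitive=True),
    Job("model pre-training run", "training", latency_sensitive=False),
]

for job in jobs:
    print(f"{job.name} ({job.kind}) -> {pick_region(job)}")
```

A real dispatcher would also weigh electricity price, available capacity, and network bandwidth; the single boolean here stands in for all of that.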
This dispatching system is not just a concept—it is already up and running. Zhejiang and Xinjiang have used time-zone differences to shift load for peak shaving and valley filling, cutting the electricity cost per token by 18%. PUE (power usage effectiveness) at green data centers in the west can be pushed below 1.15, far lower than the national average.
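PUE is the ratio of a data center's total energy use to the energy used by its IT equipment alone, so a value of 1.15 means 0.15 kWh of cooling and other overhead for every 1 kWh of computing. The sketch below shows the effect; the 1.5 baseline is a hypothetical placeholder for comparison, not a figure from this article.

```python
def facility_energy_kwh(it_energy_kwh: float, pue: float) -> float:
    """Total facility energy implied by PUE = total energy / IT equipment energy."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_energy_kwh * pue


IT_LOAD_KWH = 1_000.0
WESTERN_PUE = 1.15               # upper bound quoted in the text
HYPOTHETICAL_BASELINE_PUE = 1.5  # placeholder for comparison only

west = facility_energy_kwh(IT_LOAD_KWH, WESTERN_PUE)
baseline = facility_energy_kwh(IT_LOAD_KWH, HYPOTHETICAL_BASELINE_PUE)

# Fraction of total energy saved for the same computing work.
saving = 1 - west / baseline

print(f"West: {west:.0f} kWh, baseline: {baseline:.0f} kWh, saving: {saving:.1%}")
```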
In the era of large-model inference, the ratio of inference compute demand to training compute demand is expected to reach 3:1. This means that the computing power network of the future will be designed more around real-time responsiveness—and that is precisely where the value of “Eastern Data, Western Computing” lies.
Computation that Reaches for the Stars
The computing power network won’t exist only on the ground.
On May 14, 2025, China launched 12 space-based computing satellites into their planned orbits from the Jiuquan Satellite Launch Center aboard a Long March 2D carrier rocket, forming the world’s first large-scale space intelligent-computing satellite constellation.
These satellites are equipped with an “AI brain”—that is, high-performance intelligent computers and a distributed operating system. In the traditional model, images captured by satellites had to be sent back to the ground for processing, a process that could take hours. Compute-enabled satellites can analyze data in orbit, compressing disaster emergency response times to the level of seconds.
For early detection of wildfires and floods, precision agriculture, and ocean monitoring, satellites can “compute while they fly,” sending only key results back to the ground and dramatically easing the data-transmission bottleneck.
There is also a more far-reaching concept: building large-scale computing centers in space, turning space into an “offshore data center.”
At the end of 2025, Musk announced that SpaceX planned to expand the Starlink V3 satellite constellation and build a space-based data center in orbit, and in early 2026 the company submitted a formal application to the FCC for a constellation of one million satellites. Bezos also predicted in October 2025 that gigawatt-scale data centers would be built in space within the next 10 to 20 years. The concept is theoretically feasible, but the technology is still a long way from practical deployment. China’s current priority is the first step: networking compute satellites.
Ground-based computing networks solve the problem of “not enough compute,” while space-based computing networks tackle the bottleneck where massive volumes of satellite data can’t be transmitted back or, even when they are, can’t be processed fast enough. The two paths are being pursued in parallel and are at different stages, but they point in the same direction: making compute available everywhere.
A Trillion-Yuan-Scale Undertaking
During the 2026-2030 period, total investment in computing networks is expected to reach 2 to 2.5 trillion yuan.
In 2026 alone, total investment across the “six networks” is set to exceed 7 trillion yuan, with annual investment in computing networks of no less than 400 billion yuan.
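The annual and five-year figures can be cross-checked against each other. This sketch assumes the 400 billion yuan floor holds in every year from 2026 through 2030, which the figures above do not state outright.

```python
ANNUAL_FLOOR_YUAN = 400 * 10**9   # no less than 400 billion yuan per year
YEARS = range(2026, 2031)         # 2026 through 2030 inclusive

# Assumption: the same floor applies in each of the five years.
five_year_floor = ANNUAL_FLOOR_YUAN * len(YEARS)

# Five-year range quoted in the article: 2 to 2.5 trillion yuan.
PLAN_LOW_YUAN = 2 * 10**12
PLAN_HIGH_YUAN = 25 * 10**11

within_plan = PLAN_LOW_YUAN <= five_year_floor <= PLAN_HIGH_YUAN

print(f"Five-year floor: {five_year_floor / 10**12:.1f} trillion yuan")
print(f"Falls inside the quoted 2 to 2.5 trillion range: {within_plan}")
```

Under that assumption the yearly floor adds up to the low end of the five-year range.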
As of early 2026, the national computing interconnection platform had connected 31 provinces (autonomous regions and municipalities), 155 companies, and 578 resource pools, including 316 EFLOPS of intelligent computing capacity and 720,000 GPU accelerator cards.
The funding sources are also clear. In 2026, the central government allocated more than 2.55 trillion yuan in policy funds on a coordinated basis to support construction of the “six networks,” including 800 billion yuan in ultra-long-term special government bonds that could be used for computing networks. These bonds have maturities of 20 to 50 years, carry low interest rates, and do not add to local governments’ hidden debt.
Market-based capital is pouring in as well. ByteDance, Alibaba, Tencent, Baidu, and the three major telecom operators are all stepping up investment related to computing networks.
Signals of broader accessibility are also beginning to appear. An intelligent technology company in Nantong has successfully obtained a 15,000-yuan subsidy via compute vouchers, and models such as “compute banks” and “compute supermarkets” are being piloted in multiple regions.
All of this points in one direction: to keep driving AI compute costs down so that more people can actually afford to use AI.
Silent, Universal Access
Once the “compute grid” is in place, how will your life change?
First, your cost of using AI will drop.
Today, when you subscribe to Doubao or pay for the premium version of Qwen, what you’re really paying for behind the scenes is compute. After the compute grid is built out, the price of AI services will very likely keep falling—much like mobile data went from 10 yuan per MB to today’s unlimited plans.
Second, more people will be able to take part in AI entrepreneurship.
Right now, to build an AI app, you often have to buy dozens of GPUs upfront, with investments easily running into the millions. Once the compute grid becomes widespread, “compute vouchers” and pay-as-you-go pricing will dramatically lower the barrier. If you have a good idea, you won’t have to raise money first just to buy equipment.
Third, more AI capabilities will show up in places you don’t even notice.
Precision agriculture will raise crop yields and make food prices more stable. AI in healthcare will make early screening more common and checkups cheaper. Smart transportation will reduce congestion and shorten commutes. None of these will be presented to you with an “AI” label, but you’ll feel that life has gotten better.
On the surface, China is building this network to solve the problem of “not having enough compute.” But we also need to be clear: AI shouldn’t be only a weapon for big companies. Like electricity, it should become basic infrastructure that everyone can use.
Today’s 140 trillion daily tokens could become 1.4 quadrillion by the day the compute “high-speed rail” is completed. And by then, the cost of using AI may be only a fraction of what it is today.










