For most of the past ten years, anyone discussing the limits of computing pictured a server rack blinking silently in some Virginia office park, ready to be expanded once demand caught up. That picture is now misleading. The rack is no longer where the congestion is. Today’s computing problem has less to do with racks than with the substation powering them, the memory chip soldered next to the GPU, the atoms inside the transistor, and the cables connecting thousands of accelerators that increasingly need to behave as a single organism. The interesting part of the discussion is not what is growing. It is what refuses to.
Power is the most obvious limit. Data center developers used to pick sites the way commercial real estate brokers pick office buildings, weighing road access, tax incentives, and fiber. Now they choose on megawatts. AI training clusters draw so much electricity, and draw it so intensively, that operators frequently run out of grid capacity before they run out of land. Costs have risen accordingly. Construction that once cost about $10 million per megawatt now costs up to $40 million per megawatt for high-density builds. Demand from AI data centers in the United States is expected to grow from about 5 gigawatts to over 150 gigawatts within ten years. That is not an expansion of the current grid. It is a different grid.
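To put the demand projection in perspective, the short sketch below works out the yearly growth rate those two figures imply. It assumes smooth compounding at a constant rate, which is a simplification introduced here rather than anything the forecast states, and the function name is purely illustrative. Under that assumption, going from 5 to 150 gigawatts in ten years means growing by roughly 40 percent a year, or doubling about every two years.

```python
import math

# Figures quoted in the text; "over 150" is treated as a floor.
start_gw = 5.0
end_gw = 150.0
years = 10


def implied_annual_growth(start, end, n_years):
    """Constant yearly growth rate that takes `start` to `end` in `n_years`.

    Assumes smooth compounding, which real grid build-outs do not follow.
    """
    return (end / start) ** (1.0 / n_years) - 1.0


rate = implied_annual_growth(start_gw, end_gw, years)

# Time for demand to double at that constant rate.
doubling_years = math.log(2) / math.log(1.0 + rate)

print(f"Implied growth: {rate:.1%} per year")
print(f"Doubling time:  {doubling_years:.1f} years")
```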
The consumer electronics sector is beginning to feel the memory shortage, even if it does not always understand the cause. Large language models require enormous amounts of memory: HBM stacks on the accelerator side and DDR5 modules for the host CPUs. Hyperscalers have been buying up production allocations from Samsung, SK Hynix, and Micron well in advance, often under contracts lasting more than a year, leaving little supply for everyone else. The downstream effects are showing up in unexpected places, such as the price of a mid-tier graphics card, enterprise server lead times, and laptop pricing. In essence, the compute industry has split into a primary and a secondary market, with the latter getting whatever is left over.
Watching this unfold, it seems that the public conversation about AI has consistently underestimated how much of the bottleneck sits outside the GPU. Industry attention revolves around Nvidia’s release schedule: the H100s, the H200s, Blackwell, and whatever comes after. But CPUs have become a genuine constraint as well, because the current generation of agentic workflows requires constant step-by-step orchestration, database calls, and tool use, all of which rely heavily on host processors.
The networking layer has limits of its own. In clusters with ten thousand or more nodes, the time data spends traveling between accelerators begins to exceed the time spent processing it. Cores do nothing but wait. Engineers call it “starvation,” which is a more honest term than the industry sometimes prefers.
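A toy model makes the effect concrete. The sketch below splits a fixed amount of work evenly across nodes and adds the cost of one ring-style all-reduce per step, using the textbook cost of 2(N-1) hops that each carry 1/N of the data. Every number in it (total compute time, gradient size, link bandwidth, per-hop latency) is a hypothetical placeholder rather than a measurement, and it assumes communication does not overlap with computation, which real systems partly avoid.

```python
def step_times(n_nodes, total_compute_s=1000.0, grad_bytes=10e9,
               link_bytes_per_s=50e9, hop_latency_s=5e-6):
    """Return (compute_s, comm_s) for one training step on n_nodes.

    All default values are illustrative assumptions, not measured figures.
    """
    # Fixed total work, divided evenly across nodes.
    compute_s = total_compute_s / n_nodes
    if n_nodes == 1:
        return compute_s, 0.0
    # Ring all-reduce: 2 * (N - 1) hops, each moving 1/N of the gradient.
    hops = 2 * (n_nodes - 1)
    per_hop_s = hop_latency_s + (grad_bytes / n_nodes) / link_bytes_per_s
    return compute_s, hops * per_hop_s


def waiting_fraction(n_nodes):
    """Share of each step spent waiting on the network (no overlap assumed)."""
    compute_s, comm_s = step_times(n_nodes)
    return comm_s / (compute_s + comm_s)


for n in (10, 100, 1_000, 10_000):
    print(f"{n:>6} nodes: {waiting_fraction(n):.0%} of each step spent waiting")
```

With these placeholder values, waiting is a small share of each step at a hundred nodes and the majority of it at ten thousand. The exact crossover depends entirely on the assumed inputs, but the shape holds: compute per node shrinks as nodes are added while the communication cost does not.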

Then there is the silicon itself, the limit that is hardest to argue with. The structures on today’s cutting-edge chips are just a few dozen silicon atoms across. At that scale, ordinary thermal stress during extended training runs can cause transient faults. Wafer yields fall. Reliability stops being a guarantee and becomes a statistical exercise. TSMC, Samsung Foundry, and Intel Foundry are all pushing against the same physical boundary at different speeds, and each advance is harder and more expensive than the last. In theory, Moore’s Law still holds. In practice, sustaining it has become enormously expensive, and that expense eventually works its way into every layer of the computing stack.
The larger picture is hard to ignore. The AI sector presents itself as software-defined and nearly weightless: models, prompts, and intelligence delivered through a browser tab. The reality underneath is physical, heavy, and slow to scale. Building substations takes years. Foundries take longer. Memory plants run on capacity cycles that last several years.
Surprisingly, much of what determines whether the next generation of AI products ships on schedule has little to do with researchers and a great deal to do with whether a particular utility in Texas, Virginia, or Arizona can get the transformers it ordered eighteen months ago. Whether the industry can engineer its way around these constraints remains to be seen. What is already clear is that “scaling” is no longer a software problem. The bottleneck has moved into the physical world.