THE COMPANY THAT ACCIDENTALLY BECAME THE INTERNET’S ENGINE ROOM

Faisal Mahmud

June 4, 2026
9:33 am
Tech

NVIDIA set out to make graphics cards for gamers. Three decades later, it quietly ended up owning the hardware that runs modern artificial intelligence and, with it, an uncomfortable amount of leverage over the entire tech world.
There is a strange kind of power that comes not from being the biggest, the loudest, or the most aggressive, but from being the thing everything else quietly depends on. NVIDIA has that kind of power. It is infrastructure power. The kind that makes itself felt not in press releases or keynote speeches but in the cold arithmetic of what breaks if you take it away.
Think about what runs a modern AI model. Not the software. Not the algorithm written by some PhD who drinks too much coffee. The actual computation, the billions of matrix multiplications per second that translate a text prompt into a coherent paragraph or a chest X-ray scan into a diagnosis. That computation, in an overwhelming share of cases, happens on NVIDIA hardware. Specifically on chips branded H100 or A100, cooled by industrial fans in data centers the size of aircraft hangars, owned by Amazon or Microsoft or Google, and quietly rented out by the second to thousands of companies and researchers around the world.
NVIDIA did not plan any of this. Or at least, not all of it. Jensen Huang, who co-founded the company in a Denny’s diner in California in 1993, wanted to build graphics processors. The GPU was a product aimed squarely at gamers who wanted smoother frame rates and more realistic explosions. For most of the 1990s and 2000s, that is exactly what NVIDIA was: a very good, very focused graphics chip company, locked in a fierce rivalry with ATI (later absorbed by AMD) over who could render more polygons per second.

CUDA: THE BET THAT CHANGED EVERYTHING

The real turning point was not a product. It was a software decision that, at the time, looked to most observers like a strange and expensive vanity project.
In 2006, NVIDIA released CUDA, Compute Unified Device Architecture. The idea was simple but radical: let developers program GPUs not just for graphics, but for any highly parallel computation. GPUs, it turns out, are structurally different from CPUs. Where a CPU has a handful of very powerful cores optimized for sequential tasks, a GPU has thousands of smaller cores that can all run simultaneously. For graphics, this made sense. Rendering pixels is embarrassingly parallel work. But it also made GPUs extraordinarily well suited for anything that required doing many similar calculations at once. Things like simulating physics. Doing financial modeling. And, as it turned out, training neural networks.
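To make the contrast concrete, here is a minimal sketch in Python. It is not GPU code. The function names (shade, render_sequential, render_parallel) and the brightness rule are invented for illustration, and a thread pool stands in, very loosely, for a GPU's many cores. The only point is that each pixel's result depends on nothing but that pixel, so the work can be handed to any number of workers without changing the answer.

```python
from concurrent.futures import ThreadPoolExecutor


def shade(pixel):
    """Brighten one pixel value (0-255). Uses only that pixel's own value."""
    return min(255, pixel * 2)


def render_sequential(pixels):
    # CPU-style: a single worker walks through the pixels one after another.
    return [shade(p) for p in pixels]


def render_parallel(pixels, workers=8):
    # GPU-style in spirit only: several workers each take pixels independently.
    # pool.map returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(shade, pixels))


pixels = [0, 10, 100, 200]
print(render_sequential(pixels))  # [0, 20, 200, 255]
print(render_parallel(pixels))    # [0, 20, 200, 255]
```

Python threads will not make this toy any faster. What it shows is why the job divides so cleanly: no pixel has to wait for any other, which is the property that thousands of small GPU cores exploit.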
The machine learning community noticed. A landmark 2012 paper from Geoffrey Hinton’s group at the University of Toronto introduced AlexNet, a deep neural network trained on NVIDIA GPUs that blew past everything else in the ImageNet image recognition competition. The speedup compared to CPU-based training was not marginal. It was transformative. Suddenly every serious AI researcher wanted NVIDIA hardware, and more importantly, they wanted CUDA, because years of libraries and tools had already been built on top of it.
This is the part of the story that is easy to miss. The GPUs themselves are impressive hardware, but hardware can theoretically be replicated. What is harder to replicate is the software ecosystem. CUDA has had a twenty-year head start. Frameworks like TensorFlow and PyTorch are built around it. Research papers assume it. PhD students learn on it. When a new AI startup spins up and needs to train a model, they do not re-evaluate the chip ecosystem from first principles. They reach for what the entire field already knows how to use. That network effect is NVIDIA’s real moat, and it is deeper than most people outside the industry appreciate.

THE DATA CENTER BECOMES THE PRODUCT

For most of NVIDIA’s history, gaming was its bread and butter. GeForce graphics cards were what kept the lights on, and the company’s data center revenue was a smaller, growing, but secondary line of business. That changed with breathtaking speed.
The arrival of large language models, first GPT-3 in 2020 and then the explosion of ChatGPT in late 2022, created demand for compute on a scale the world had never seen. Training a large language model requires not a single GPU but thousands of them, running in parallel for weeks or months at enormous cost. OpenAI’s GPT-4 training reportedly cost over $100 million in compute alone. The models that came after it were larger still. Every major tech company (Google, Meta, Amazon, Microsoft) embarked on its own AI infrastructure buildout at the same time.
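The arithmetic behind bills of that size is simple multiplication, and a rough sketch shows how quickly it compounds. Every input below (the GPU count, the run length, the hourly rental price) is a hypothetical placeholder chosen for easy math, not a reported figure for any real model, and the function name training_cost is invented for this example.

```python
def training_cost(num_gpus, days, dollars_per_gpu_hour):
    """Back-of-envelope rental cost of one training run."""
    gpu_hours = num_gpus * days * 24
    return gpu_hours * dollars_per_gpu_hour


# Hypothetical inputs, for illustration only.
cost = training_cost(num_gpus=4_000, days=60, dollars_per_gpu_hour=2.50)
print(f"${cost:,.0f}")  # $14,400,000
```

Doubling either the cluster size or the run length doubles the bill, which is why costs climb so fast once a model needs thousands of chips for months. The sketch also leaves out failed runs, experiments, staff, and power, all of which push real totals higher.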
What followed was the GPU shortage that defined the AI industry’s awkward adolescence. Companies that had been planning to build AI products suddenly found themselves unable to get the hardware they needed. H100 chips that nominally cost around $30,000 were trading on secondary markets for two or three times that. Cloud access to GPU clusters was oversubscribed for months. Venture capitalists, in a genuinely surreal turn, began treating “GPU allocation” as a competitive advantage in due diligence conversations. Not code quality, not team pedigree, but raw access to NVIDIA chips.
NVIDIA’s revenue figures tell the story numerically, but they do not fully capture the structural shift they represent. This is no longer a company whose primary customers are teenagers buying graphics cards. It is a company whose primary customers are the largest corporations on earth, purchasing infrastructure to build products that will define the next decade of the internet.

WHAT THIS MEANS FOR THE REST OF THE INDUSTRY

There is a paradox sitting at the heart of the relationship between NVIDIA and the major cloud providers. AWS, Google Cloud, and Microsoft Azure are simultaneously NVIDIA’s biggest customers and among its most motivated potential rivals. They buy NVIDIA GPUs in quantities that beggar belief, then rent access to those GPUs as one of their most profitable cloud services. Every time someone uses a cloud AI API, there is a reasonable chance an NVIDIA chip is involved, and a percentage of the fee flows back, indirectly, to Santa Clara.
The cloud companies do not love this arrangement. They are paying a supplier they cannot easily replace for a component that underpins their most strategically important products. So all three have developed their own custom AI chips. Google has its TPUs. Amazon has Trainium and Inferentia. Microsoft has been developing its own AI silicon. The explicit goal is dependency reduction, to claw back some of the margin and strategic control that flows to NVIDIA in the current arrangement.
How well this is working is debatable. TPUs are genuinely excellent for certain workloads, particularly inference at scale, and Google has used them internally for years. But the developer ecosystem around them remains far thinner than CUDA’s. Rewriting training pipelines to use non-NVIDIA hardware is not impossible, but it is expensive, and companies that are already racing against competitors to ship AI products are reluctant to slow down for an infrastructure migration that might or might not pay off.
If you are AMD or Intel right now, the NVIDIA situation is frustrating in a very specific way. It is not that you cannot build a fast chip. AMD’s MI300 series has matched or exceeded NVIDIA hardware on certain benchmarks. Intel has invested enormous sums in its AI accelerator programs. The hardware gap, while real, is not insurmountable. The software gap is the actual problem. When a developer sits down to write AI training code, they write it in PyTorch, which runs on CUDA, which runs on NVIDIA GPUs. Moving to AMD or Intel hardware requires either rewriting code or relying on compatibility layers that introduce their own overhead and quirks. For a researcher who has spent five years building intuitions about how CUDA behaves, switching platforms feels like moving to a country where you do not speak the language. You can do it. But it adds friction at every step.
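What a compatibility layer does can be sketched in a few lines of Python. Everything here is hypothetical: VendorABackend, VendorBBackend, CompatLayer, and their method names are invented stand-ins, not the API of CUDA, PyTorch, or any real chip. The sketch assumes existing training code was written against one vendor's call names, and shows an adapter translating those calls for a rival backend at the price of extra work on every call.

```python
class VendorABackend:
    """Stand-in for the API that existing training code was written against."""

    def launch_matmul(self, a, b):
        # Plain matrix multiply: rows of a times columns of b.
        return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
                for row in a]


class VendorBBackend:
    """Stand-in for a rival chip: same math, but it expects b pre-transposed."""

    def gemm(self, lhs, rhs_transposed):
        return [[sum(x * y for x, y in zip(row, col)) for col in rhs_transposed]
                for row in lhs]


class CompatLayer:
    """Adapter that exposes vendor A's interface on top of vendor B's backend."""

    def __init__(self, backend):
        self.backend = backend
        self.translated_calls = 0

    def launch_matmul(self, a, b):
        self.translated_calls += 1
        b_t = [list(col) for col in zip(*b)]  # extra step: convert the layout
        return self.backend.gemm(a, b_t)


def training_step(device, weights, inputs):
    # Existing code, written once against vendor A's call names.
    return device.launch_matmul(inputs, weights)


inputs, weights = [[1, 2]], [[3], [4]]
print(training_step(VendorABackend(), weights, inputs))               # [[11]]
print(training_step(CompatLayer(VendorBBackend()), weights, inputs))  # [[11]]
```

The answers match, and the training code did not change. But every call now passes through a translation step, and any place where the two backends behave slightly differently becomes a quirk the developer has to discover. That is the friction in miniature.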

THE GEOPOLITICS OF SILICON

NVIDIA is not just a story about chips and software. In the past three years, it has become a story about national security, international competition, and the uncomfortable intersection of commerce and geopolitics.
The United States government has concluded that allowing China to freely access advanced AI chips poses an unacceptable national security risk. The logic is not difficult to follow: whoever leads in AI leads in military intelligence, autonomous weapons, surveillance infrastructure, and economic productivity. Giving a geopolitical rival free access to the best tools for building AI capability is, in this framing, analogous to selling them advanced fighter jet engines.
The result has been an expanding and tightening regime of export controls on high-end NVIDIA chips. The H100 was initially restricted, then NVIDIA released a modified version for the Chinese market with reduced capabilities. Those chips were subsequently restricted too. NVIDIA has had to navigate a regulatory environment that is simultaneously trying to preserve its ability to serve a huge market and trying to prevent that market from building weapons.
The downstream effect on China’s domestic chip industry has been the opposite of what a naive reading of the controls might predict. Rather than simply falling behind, Chinese companies, Huawei most prominently, have accelerated their own development of AI chips. Huawei’s Ascend series is genuinely competitive for some workloads, and the Chinese government has poured investment into the domestic semiconductor ecosystem. The export controls have not kept China from developing AI. They may have simply told China it needed to stop depending on American chips and start building its own.

THE QUESTION NOBODY WANTS TO ASK

Here is the uncomfortable question that does not get asked often enough in the breathless coverage of NVIDIA’s ascent: what happens when the AI buildout slows down?
The capital expenditure cycle that has driven NVIDIA’s extraordinary growth is, by definition, a cycle. Cloud companies are currently spending at rates that most financial analysts consider unsustainable in the long run. At some point, the data centers are built. The infrastructure is in place. The marginal return on adding another rack of H100s declines. Capital allocation decisions change.
There is also the question of whether the current generation of AI technology (large language models, diffusion models, transformer architectures) will continue to scale in the way that has driven demand for more and more compute. The history of technology suggests that the law of diminishing returns eventually applies to any given paradigm. If AI progress requires ten times more compute for each significant improvement in capability, the economics of that improvement become harder to justify at some point.
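The tenfold figure is a hypothetical, but it is worth running the numbers on it. The short Python sketch below takes it at face value; the function name compute_per_step and the starting budget of 1 unit are assumptions made for illustration, not measurements of any real model.

```python
def compute_per_step(base_compute, steps, factor=10):
    """Compute needed at each capability step, growing by `factor` every time."""
    return [base_compute * factor ** n for n in range(steps + 1)]


# Start from 1 unit of compute and take four "significant improvements".
print(compute_per_step(base_compute=1, steps=4))  # [1, 10, 100, 1000, 10000]
```

Under that assumption, four improvements cost ten thousand times the original budget. Each step has to be worth ten times more than the last just to break even, which is the sense in which the economics get harder to justify.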
None of this is to say NVIDIA’s future is bleak. The company has diversified broadly, into automotive systems, industrial simulation, scientific computing, and healthcare imaging. Jensen Huang is a genuinely strategic thinker who has shown the ability to reinvent the company’s positioning multiple times. But valuations that briefly made NVIDIA one of the world’s most valuable companies price in a future of sustained exponential growth that has historically been difficult for any company to maintain.

THE BIGGER PICTURE: INFRASTRUCTURE AS POWER

What NVIDIA’s story illustrates, more than anything about GPUs or machine learning, is a principle about how technological infrastructure creates and concentrates power.
We have seen this pattern before. Microsoft did not just sell software. It controlled the platform that every PC application had to run on, and that control gave it leverage over the entire software industry for a generation. Google did not just offer a better search engine. It became the infrastructure through which most of the world’s information seeking happened. Amazon did not just build a better bookstore. It built cloud infrastructure that the majority of the tech industry came to depend on.
NVIDIA is the latest instantiation of this pattern, but with a twist. The others achieved their infrastructure status primarily through software and network effects. NVIDIA has achieved it through hardware, physical chips that require billion-dollar fabrication facilities to produce and years of software ecosystem development to make usable. That combination is arguably harder to displace than a software platform because the barriers to entry are not just intellectual but physical and financial.
The companies that depend on NVIDIA are aware of this dynamic. So is the US government. So are NVIDIA’s rivals. The frantic investment in alternative AI chips, the export controls, the custom silicon programs at the cloud providers, all of these are, at some level, attempts to manage or reduce a dependency that has become structurally important in ways that make people uncomfortable.
That discomfort is, in a strange way, the most honest measure of how much power NVIDIA has accumulated. Not the market cap. Not the revenue. The fact that the world’s largest technology companies, several national governments, and armies of researchers and engineers are all spending significant resources trying to reduce their reliance on a company founded to make graphics cards for video games.
What NVIDIA has become is a once-in-a-generation type of infrastructure player, the kind that bends industry dynamics around itself not through aggression or monopolistic behavior, but through the quiet, compounding power of being genuinely necessary.
Whether that necessity persists, whether the bets on AI compute continue to pay off at the scale the market is pricing in, and whether the geopolitical pressures ultimately squeeze or redirect the company, these are open questions. What is not open is the question of whether NVIDIA matters. It matters in the way that roads matter, or electrical grids matter. Not glamorously. Not obviously. Just fundamentally.
The most powerful position in any ecosystem is not to be the best product. It is to be the thing that makes all the other products possible.