The Lifeblood of the AI Boom
Applications such as ChatGPT and DALL-E have captured the world’s imagination—but AI companies are focused on something else.
Artificial intelligence can appear to be many different things—a whole host of programs with seemingly little common ground. Sometimes AI is a conversation partner, an illustrator, a math tutor, a facial-recognition tool. But in every incarnation, it is always, always a machine, demanding almost unfathomable amounts of data and energy to function.
AI systems such as ChatGPT operate out of buildings stuffed with silicon computer chips. To build bigger machines—as Microsoft, Google, Meta, Amazon, and other tech companies would like to do—you need more resources. And our planet is running out of them.
The computational power needed to train top AI programs has doubled every six months over the past decade, a pace of growth that may soon become untenable. According to a recent study, AI programs could consume roughly as much electricity as Sweden by 2027. GPT-4, the most powerful model currently offered to consumers by OpenAI, was by one estimate 100 times more demanding to train than GPT-3, which was released just four years ago. Google recently introduced generative AI into its search feature, and may have increased the cost of each search tenfold in the process. Meanwhile, the chips running AI are in short supply (OpenAI CEO Sam Altman told Congress last May that “we don’t have enough” of them), and so is the electricity. At the current pace, running more advanced models may soon put tremendous strain on local power grids; even if the grids could bear the load, purchasing all that electricity would be prohibitively expensive.
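For a rough sense of what that doubling rate implies, the compounding arithmetic is simple (a back-of-the-envelope sketch based only on the doubling rate stated above, not a figure from the cited study):

```latex
% Two doublings per year, sustained for ten years:
2^{2 \times 10} = 2^{20} = 1{,}048{,}576 \approx 10^{6}
```

In other words, training a top model today takes roughly a million times the computation it did a decade ago, which is why many researchers doubt the trend can continue on current hardware and power supplies.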
American tech companies worship at the altar of scale—the creed that throwing more computers, electricity, and data at an AI program is certain to improve it—so this won’t do. A new AI competition is now under way to devise hardware that would allow the technology to grow more powerful and more efficient. This, more than dazzling applications such as OpenAI’s video-generator, Sora, will decide the technology’s future—which companies dominate, what AI products they are able to bring to market, and how expensive those products will be. So far, that competition’s clear winner has not been a traditional tech titan. Instead it is Nvidia, a firm that, until roughly a year ago, was unheard of outside the realm of dedicated computer gamers—and is now the third-most valuable company in the world, dwarfing Google, Amazon, and Meta.
Nvidia’s wealth comes from designing the most vital part of AI machinery: computer chips. Wafer-thin rectangles of silicon etched with an intricate web of circuitry, these chips carry out the code underlying chatbots, image generators, and other AI products. Nvidia’s graphics-processing units, or GPUs, were previously known for increasing the visual fidelity of video games. The same sort of equipment that grants a PC the power to render more realistic lighting in Call of Duty can also train cutting-edge AI systems. These GPUs are among the fastest and most reliable chips available, and they made the emerging AI revolution possible.
To support the continued growth of AI, tech companies have collectively embarked on an infrastructure build-out with costs that might soon rival those of the Apollo missions and the interstate highway system: Tens of billions of dollars, if not more, are being spent on cloud-computing capacity each year. Nvidia, in turn, controls as much as 95 percent of the market for specialized AI chips; recent generative-AI programs from Microsoft, OpenAI, Meta, Amazon, and elsewhere likely could not have been built, or deployed on computers around the world, without Nvidia’s hardware.
When ChatGPT debuted, it felt like magic. Every other tech company raced to debut its own version; the competition was over software, and Nvidia’s hardware largely made it possible. Now the top three language models (OpenAI’s GPT-4, Google’s Gemini, and the latest version of Anthropic’s Claude) are neck and neck in performance, and price is at least as important a differentiator as capability. Purchasing and powering all those AI chips is the most expensive part of the technology, Jai Vipra, an AI-policy researcher and incoming fellow at IT for Change, told me. And “Nvidia is the entity that sets the price.”
None of the Big Tech companies appears thrilled about that dependency, and they’ve all begun investing heavily in designing their own custom chips—which would enable not just bigger models but greater control over their emerging AI businesses. Having better computer chips could soon become a bigger competitive advantage than having better computer code, Siddharth Garg, an electrical engineer who designs machine-learning hardware at NYU, told me. Crucially, AI chips made in-house could be tailored to a company’s particular AI models, making its products more efficient and permitting growth without such intense energy demands.
Tech companies have executed versions of this strategy before. Your daily Google search, translation, and navigation queries run smoothly because, in the 2010s, Google designed custom computer chips that allowed the company to process billions of such requests each day with less energy and lower costs. Apple’s switch from Intel to its own computer processors in 2020 almost instantly allowed the company to produce a faster, lighter, and thinner MacBook. Similarly, if Amazon’s custom chips run AI products faster, people might prefer its cloud services over Google’s. If an iPhone, Google Pixel, or Microsoft Surface tablet can run a more powerful generative-AI model and load its results a bit faster because of a custom microchip, then more customers might buy that device. “That’s a game changer,” Garg said.
Every company wants its own self-contained kingdom, no longer beholden to competitors’ prices or external supply-chain snags. But whether any of these cloud-computing tech companies can rival Nvidia is an open question, and it is highly unlikely that any of them will sever ties with Nvidia altogether. The future may well be one in which they use both bespoke computer chips and Nvidia’s designs.
Google, for instance, has been able to train and run its flagship Gemini models with less energy and at lower cost by using custom computer processors rather than relying on Nvidia, according to Myron Xie, who works at the semiconductor-research firm SemiAnalysis. But many of the company’s cloud servers also run on Nvidia chips, and Google optimized its latest language model, Gemma, to run on Nvidia GPUs. Amazon markets its custom AI chips as “delivering the highest scale-out ML training performance at significantly lower costs,” and David Brown, the vice president of Compute and Networking at Amazon Web Services, told me over email that computer chips are “a critical area of innovation.” But the firm is also growing its partnership with Nvidia. A Microsoft spokesperson told me, “Our custom chips add to our systems rather than replace our existing hardware powered by NVIDIA.”
This vision of an all-encompassing AI ecosystem could also be a way of ensnaring customers. Owning an iPhone and a MacBook makes it more convenient to use iMessage, iCloud, an Apple Watch, and so on. That same logic may soon apply to AI: Google Gemini, a Google Chromebook, a Google Pixel, Google’s custom AI chips, and Google Cloud services will all be optimized for one another. OpenAI is reportedly developing AI “agents” that can automate tasks on various devices. And Apple has pivoted its business toward generative AI. “It’s sort of a method of vertical integration, locking people into your stack,” Sarah Myers West, the managing director of the AI Now Institute, told me.
Beyond chip design, tech companies are investing heavily in developing more efficient software as well as renewable-energy sources. At the World Economic Forum, in January, Altman said, “We still don’t appreciate the energy needs of this technology … There’s no way to get there without a breakthrough.” Efficiency improvements may not just be about making AI environmentally sustainable, then. They may be necessary to make the technology physically and financially viable at all.