Nvidia CEO Jensen Huang delivers a keynote speech during the Nvidia GTC AI conference at SAP Center on March 18, 2024 in San Jose, California.
Justin Sullivan | Getty Images
The new generation of AI graphics processors is named Blackwell. The first Blackwell chip is called the GB200 and will ship later this year. Nvidia is enticing customers with more powerful chips to spur new orders, even as companies and software makers are still scrambling to get their hands on current-generation “Hopper” H100s and similar chips.
“Hopper is great, but we need bigger GPUs,” Nvidia CEO Jensen Huang said Monday at the company's developer conference in California.
Nvidia shares fell more than 1% in extended trading Monday.
The company also introduced revenue-generating software called NIM that makes it easier to deploy AI models, giving customers another reason to stick with Nvidia chips amid a growing field of competitors.
Nvidia executives say the company is becoming less of a mercenary chip provider and more of a platform provider, like Microsoft or Apple, on which other companies can build software.
“Blackwell is not a chip, it is the name of a platform,” Huang said.
“The sellable commercial product was the GPU, and the software was all to help people use the GPU in different ways,” Nvidia enterprise VP Manuvir Das said in an interview. “Of course, we still do that. But what's really changed is that we really have a commercial software business now.”
Das said Nvidia's new software will make it easier to run programs on any of Nvidia's GPUs, even older ones that may be better suited for deploying AI but not for building it.
“If you're a developer and you have an interesting model that you want people to adopt, if you put it in a NIM, we'll make sure that it's runnable on all our GPUs, so you can reach a lot of people,” Das said.
Nvidia's GB200 Grace Blackwell Superchip, with two B200 GPUs and one Arm-based CPU.
Every couple of years, Nvidia updates its GPU architecture, unlocking a big jump in performance. Many of the AI models released over the past year were trained on the company's Hopper architecture, which was announced in 2022 and powers chips like the H100.
Nvidia says Blackwell-based processors, like the GB200, offer a massive performance upgrade for AI companies, with 20 petaflops in AI performance versus 4 petaflops for the H100. Nvidia said the additional processing power will enable AI companies to train larger, more complex models.
The chip includes what Nvidia calls a “transformer engine specifically designed to power transformer-based AI, one of the core technologies underlying ChatGPT.”
The Blackwell GPU is large, combining two separately manufactured dies into a single chip made by TSMC. It will also be available as a complete server called the GB200 NVL72, which combines 72 Blackwell GPUs with other Nvidia parts designed to train AI models.
Nvidia CEO Jensen Huang compares the size of the new “Blackwell” chip with the current “Hopper” H100 chip at the company's developer conference, in San Jose, California.
Nvidia
Amazon, Google, Microsoft and Oracle will sell access to the GB200 through cloud services. The GB200 combines two B200 Blackwell GPUs with a single Arm-based Grace CPU. Nvidia said Amazon Web Services will build a server farm containing 20,000 GB200 chips.
Nvidia said the system can deploy a model with 27 trillion parameters. That is much larger than even the biggest models, such as GPT-4, which is reported to have 1.7 trillion parameters. Many AI researchers believe that larger models, with more parameters and data, could unlock new capabilities.
Nvidia did not provide a price for the new GB200 or for the systems it will be used in. Nvidia's Hopper-based H100 costs between $25,000 and $40,000 per chip, with complete systems costing up to $200,000, according to analyst estimates.
Nvidia will also sell the B200 GPUs as part of a complete system that takes up an entire server rack.
Nvidia also announced it is adding a new product called NIM, which stands for Nvidia Inference Microservice, to its Nvidia AI Enterprise software subscription.
NIM makes it easier to use older Nvidia GPUs for inference, the process of running AI programs, and will let companies continue using the hundreds of millions of Nvidia GPUs they already own. Inference requires less computational power than the initial training of a new AI model. NIM also lets companies run their own AI models, rather than buying access to AI results as a service from companies like OpenAI.
The strategy is to get customers who buy Nvidia-based servers to subscribe to Nvidia Enterprise, which costs $4,500 per GPU per year to license.
Nvidia will work with AI companies like Microsoft or Hugging Face to ensure their AI models are tuned to run on all compatible Nvidia chips. Then, using a NIM, developers can efficiently run the model on their own servers or on cloud-based Nvidia servers without a lengthy configuration process.
“In my code, where I was calling OpenAI, I would replace one line of code to point it to the NIM that I got from Nvidia instead,” Das said.
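To illustrate, here is a minimal sketch of the one-line swap Das describes, assuming a NIM container that exposes an OpenAI-compatible endpoint on a local server. The URL, API key, and model name below are illustrative placeholders, not details from Nvidia's announcement:

```python
from openai import OpenAI

# Before: the client pointed at OpenAI's hosted API.
# client = OpenAI(api_key="sk-...")

# After: only the client construction changes, pointing the same
# OpenAI-style interface at a self-hosted NIM endpoint instead.
# (Placeholder URL and model name, assumed for illustration.)
client = OpenAI(
    base_url="http://localhost:8000/v1",  # local NIM container's endpoint
    api_key="not-used-locally",           # placeholder; a local NIM may not check it
)

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",  # whichever model the NIM serves
    messages=[{"role": "user", "content": "Summarize Nvidia's Blackwell announcement."}],
)
print(response.choices[0].message.content)
```

Because only the client's base URL changes, the rest of the application code can stay the same whether responses come from a hosted service like OpenAI's or from a model running on the company's own Nvidia GPUs.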
Nvidia says the software will also help AI run on GPU-equipped laptops, rather than servers in the cloud.