
Latest MLPerf result bolsters Nvidia Gigawatt AI Factories pitch

Nvidia routinely excels in each new round of MLPerf benchmarks, and the latest results are another demonstration of the company’s continuing prowess in GPU training.

What is new in this MLPerf Training v5.0 round is that Nvidia is promoting its benchmarking achievements as an integral part of its ambitious effort to build out what it calls AI factories, in which Nvidia takes on more than a GPU role and works with the networking, interconnect and other components used in massive rack-scale systems, including the largest data centers.

In a briefing with reporters, Dave Salvator, Nvidia’s Director of Accelerated Computing Products, described improvements in time-to-train metrics as essential to keeping up with the “torrid pace of innovation” in agentic AI.

He said Nvidia’s AI platform delivered the highest performance at scale on every benchmark and powered every result submitted on the benchmark’s toughest large language model test: Llama 3.1 405B pretraining. Nvidia’s was the only platform with results submitted on all the v5.0 benchmarks.

Nvidia submitted results from two AI supercomputers powered by Nvidia Blackwell: Tyche, which uses GB200 NVL72 rack systems, and Nyx, which uses Nvidia DGX B200 systems. Blackwell delivered 2.6 times greater performance per GPU compared with the previous-generation Hopper architecture on Llama 3.

In a blog post, Nvidia said agentic AI-powered applications will one day run in AI factories, which CEO Jensen Huang and others have termed the “engines of the AI economy.” The apps will produce tokens and intelligence for almost every industry and academic institution, Nvidia said.

Salvator noted that Nvidia’s scaling efficiency is 90% when scaling from 512 GPUs to 2,500 GPUs, which translates into a business benefit: better training performance per dollar. Engineers are happy with 70% scaling; “at 90% they are doing happy dances,” Salvator said. “We’re really pleased with the overall set of results.”
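For readers unfamiliar with the metric, scaling efficiency is commonly computed as the measured speedup divided by the ideal, perfectly linear speedup. The short Python sketch below illustrates that common definition only. The function name and the timing figures are hypothetical, not MLPerf results, and Nvidia's exact methodology may differ.

```python
def scaling_efficiency(base_gpus, base_minutes, scaled_gpus, scaled_minutes):
    """Return measured speedup divided by ideal (linear) speedup.

    A value of 1.0 means training time fell in exact proportion to the
    number of GPUs added; lower values reflect communication and other overhead.
    """
    if min(base_gpus, scaled_gpus) <= 0 or min(base_minutes, scaled_minutes) <= 0:
        raise ValueError("GPU counts and training times must be positive")
    ideal_speedup = scaled_gpus / base_gpus          # e.g. 4x the GPUs -> 4x ideal
    measured_speedup = base_minutes / scaled_minutes  # how much faster it really ran
    return measured_speedup / ideal_speedup


# Hypothetical figures for illustration only (not taken from MLPerf submissions).
base_gpus, base_minutes = 512, 100.0
scaled_gpus, scaled_minutes = 2048, 27.8

eff = scaling_efficiency(base_gpus, base_minutes, scaled_gpus, scaled_minutes)
print(f"Scaling efficiency: {eff:.0%}")
```

In this made-up case, quadrupling the GPU count cuts training time by a factor of about 3.6 rather than 4, which works out to roughly 90% efficiency. Because the cost of a run tracks GPU count multiplied by time, higher efficiency means less wasted spend at scale.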

While GPUs have been Nvidia’s heritage, the company has progressed to offering entire racks and data centers and even reference designs for how to build out infrastructure. “Now we have AI factories,” he said. “Reasoning tokens will increase computational demand by multiple factors.”

Noting the v5.0 achievement, he offered a roadmap slide (below) and reminded reporters of the one-year cadence Nvidia has announced for new products “to let folks know what’s coming and help them build gigawatt AI factories.” He said the GB300 is too new to have been included in the v5.0 round, which is why the GB200 was used instead. Previous-generation Hopper continues to be market-leading, he added.

At GTC, Huang described the Nvidia roadmap through 2028, going from Blackwell to Rubin and then Feynman, using rack designs named Oberon and Kyber. Feynman is an Arm-based architecture with cores custom-made by Nvidia.

Chart of Nvidia’s GPU roadmap through 2028