Faced with questions about how emerging, more efficient and lower-cost methods of AI computing will affect the market for massively powerful, power-hungry GPUs, Nvidia chief Jensen Huang did not flinch, suggesting that demand for more powerful AI computing infrastructure is only increasing.
During Nvidia’s fiscal fourth quarter 2025 earnings call, Huang answered an initial question about the changing dynamics in the transition from AI training to AI inference with a lengthy statement making the case for even more AI hype (quoted from the Motley Fool earnings transcript):
“There are now multiple scaling laws. There's the pre-training scaling law, and that's going to continue to scale because we have multimodality. We have data that came from reasoning that are now [using] to do pre-training. And then the second is post-training [scaling], using reinforcement learning human feedback, reinforcement learning AI feedback, reinforcement learning, verifiable rewards. The amount of computation you use for post-training is actually higher than pre-training. And it's kind of sensible in the sense that you could, while you're using reinforcement learning, generate an enormous amount of synthetic data or synthetically generated tokens. AI models are basically generating tokens to train AI models. And that's [post-training]. And the third part, this is the part… is test-time compute or reasoning, long thinking, inference scaling. They're all basically the same ideas. And there, you have a chain of thought, you have search. The amount of tokens generated, the amount of inference compute needed, is already 100x more than the one-shot examples and the one-shot capabilities of large language models in the beginning. And that's just the beginning… The idea that the next generation could have thousands of times and even, hopefully, extremely thoughtful and simulation-based and search-based models that could be hundreds of thousands, millions of times more compute than today is in our future.”
This was Nvidia’s first earnings call since Chinese company DeepSeek shook the AI sector to its core with a claim that it was able to train an AI model for as little as $5.6 million (a figure now widely doubted) using techniques like reinforcement learning and inference-based statistical reasoning on fewer GPUs than companies like OpenAI have used to train their models. DeepSeek also used Nvidia H800 GPUs, the chips that Nvidia modified so that it could sell them into China. H800s have a lower data transfer rate and are generally lower-priced (although that may be changing amid growing demand) than Nvidia's premium H100 GPUs. The revelation of what was essentially a workaround for US sanctions on China caused sharp drops in stock values across the AI sector, including Nvidia’s.
Huang also said Nvidia’s newest Blackwell GPUs were actually designed with the rise of inference reasoning in mind. “Blackwell is going to be incredible across the board. And when you have a data center that allows you to configure and use your data center based on are you doing more pre-training now, post-training now, or scaling out your inference, our architecture is fungible and easy to use in all of those different ways.”
Nvidia’s fiscal Q4 2025 Blackwell sales of about $11 billion back up this excitement, according to Huang. He added, “We defined Blackwell for this moment, a single platform that can easily transition from [pre-training], post-training, and test time scaling.”
These comments came during the Q&A portion of the earnings call, which followed a report detailing another robust quarterly performance for Nvidia.
Fiscal Q4 2025 revenue came in at $39.3 billion, up 78% year over year, setting a new quarterly record and beating analyst estimates that hovered around $38 billion. Fiscal full-year revenue reached $130.5 billion, representing growth of 114% over fiscal year 2024. Net income for Q4 2025 was just over $22 billion, up sharply from the $12.2 billion reported for the same quarter a year ago.
Nvidia CFO Colette Kress provided a positive outlook on the current quarter, with overall revenue expected to reach $43 billion, plus or minus 2%.
“Continuing with its strong demand, we expect a significant ramp of Blackwell in Q1,” she said. “We expect sequential growth in both Data Center and Gaming.”
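As a rough sanity check on the figures above, the reported growth rates and guidance can be turned into implied year-ago values and a dollar range. The short Python sketch below is illustrative only: the function names are hypothetical, the inputs are the rounded numbers quoted in this article (with "just over $22 billion" taken as 22.0), and the results are approximations rather than Nvidia's reported figures.

```python
def implied_prior(current: float, growth_pct: float) -> float:
    """Back out the year-ago value implied by a current value and a growth rate."""
    return current / (1 + growth_pct / 100)


def guidance_range(midpoint: float, tolerance_pct: float) -> tuple[float, float]:
    """Return the (low, high) bounds for guidance given as midpoint plus or minus a percentage."""
    delta = midpoint * tolerance_pct / 100
    return midpoint - delta, midpoint + delta


# All inputs are in billions of US dollars, as rounded in the article.
q4_prior = implied_prior(39.3, 78)       # implied fiscal Q4 2024 revenue
fy_prior = implied_prior(130.5, 114)     # implied fiscal 2024 full-year revenue
low, high = guidance_range(43.0, 2)      # fiscal Q1 2026 revenue guidance bounds

# Net income growth, treating "just over $22 billion" as 22.0 (an assumption).
net_income_growth = (22.0 / 12.2 - 1) * 100

print(f"Implied year-ago Q4 revenue: ${q4_prior:.1f}B")
print(f"Implied prior full-year revenue: ${fy_prior:.1f}B")
print(f"Q1 guidance range: ${low:.2f}B to ${high:.2f}B")
print(f"Approximate Q4 net income growth: {net_income_growth:.0f}%")
```

On these rounded inputs, the implied year-ago quarter is roughly $22 billion, the implied prior fiscal year is roughly $61 billion, and the guidance spans about $42.1 billion to $43.9 billion.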
Yet, despite all the positive commentary and record revenue, implacable stock market investors drove down Nvidia’s share price by about 3.5% as of mid-day Thursday, when shares stood at $127.42.
In addition, even as Nvidia touts its revenue success and bright outlook and continues to hype demand for high-priced GPUs, the company seems to recognize that it is the right time to start talking about how it can save its customers some AI computing costs going forward.
Nvidia CFO Colette Kress, during her opening remarks on the earnings call, said that Blackwell delivers 25x higher token throughput on reasoning AI models at 20x lower cost than the previous-generation Nvidia H100.
On the general topic of efficiency and cost reduction, she added, “Companies across industries are tapping into Nvidia’s [full-stack] inference platform to boost performance and slash costs. ServiceNow tripled inference throughput and cut costs by 66% using Nvidia TensorRT for its screenshot feature. Perplexity sees 435 million monthly queries and reduced its inference costs 3x with Nvidia Triton Inference Server and TensorRT-LLM. Microsoft Bing achieved a 5x speed up at major TCO [total cost of ownership] savings for visual search across billions of images with Nvidia TensorRT and acceleration libraries.”