Jensen Huang LIVE: Nvidia's Future, Physical AI, Rise of the Agent, Inference Explosion, AI PR Crisis

All-In with Chamath, Jason, Sacks & Friedberg

March 19, 2026

(0:00) Jensen Huang joins the show!
Speakers: Chamath Palihapitiya, Jensen Huang, David Sacks, Jason Calacanis
**Chamath Palihapitiya** (0:00)
Special episode this week, we've preempted the weekly show, and there's only three people we preempt the show for, President Trump, Jesus, and Jensen. And I'll let you pick which order we do that. But what an amazing run you've had and a great event.

**Jensen Huang** (0:18)
Every industry is here, every tech company is here, every AI company is here. Incredible.

**David Sacks** (0:24)
Incredible.

**Chamath Palihapitiya** (0:27)
If you were building a global financial system from first principles today, you wouldn't build it on 50-year-old legacy rails. You'd build Airwallex, one AI-native platform for global accounts, cards, and payments. It's designed to make the entire world feel like a local market. Others are bolting AI onto broken infrastructure, but Airwallex was built for the intelligent era from day one. Stop paying the legacy tax and start building the future at airwallex.com/all in. Airwallex, build the future.

One of the great announcements of the past year has been Groq. When you made the purchase of Groq, did you realize how insufferable Chamath would become?

**Jensen Huang** (1:10)
I had an inkling that...

**Chamath Palihapitiya** (1:13)
We're his friends. We have to deal with him every week.

**Jensen Huang** (1:16)
I know it.

**Chamath Palihapitiya** (1:16)
You had to deal with him for the six-week close.

**Jensen Huang** (1:19)
I know it.

**Jason Calacanis** (1:19)
It's like two weeks. Two weeks.

**Jensen Huang** (1:20)
It's all coming back to me now. It's making me rather uncomfortable. The thing is, many of our strategies are presented in broad daylight at GTC years in advance of when we do it. Two and a half years ago, I introduced the operating system of the AI factory, and it's called Dynamo. A dynamo, as you know, is an instrument, a machine created by Siemens to turn, essentially, water power into electricity. And the dynamo powered the factory of the last industrial revolution. So I thought it was the perfect name for the operating system of the next industrial revolution, the AI factory. And inside Dynamo, the fundamental technology is disaggregated inference. Jason, I know you're super technical.

**Chamath Palihapitiya** (2:14)
Absolutely.

**Jensen Huang** (2:15)
I know it.

**Chamath Palihapitiya** (2:15)
I'll let you take this one. Go ahead and define it for the audience. I don't want to step on you.

**Jensen Huang** (2:19)
Yeah. Thank you. I knew you wanted to jump in there for a second. But it's disaggregated inference, which means the processing pipeline of inference is extremely complicated. In fact, it is the most complicated computing problem today. Incredible scale, lots of mathematics of different shapes and sizes. And we came up with the idea that you would disaggregate parts of the processing, such that some of it can run on some GPUs, and the rest of it can run on different GPUs. And that led to us realizing that maybe even disaggregated computing could make sense, that we could have a heterogeneous nature of computing. That same sensibility led us to Mellanox.
You know, today, Nvidia's computing is spread across GPUs, CPUs, switches, scale-up switches, scale-out switches, networking processors. And now we're going to add Groq to that. And we're going to put the right workload on the right chips. You know, we've really evolved from a GPU company to an AI factory company.

**Jason Calacanis** (3:24)
I mean, I think that was probably the biggest takeaway that I had. You're seeing this fundamental disaggregation where we've gone from a GPU, and now you have this collection of all these different options that will eventually exist. The thing that you said on stage, and I would like the high-value inference people to take a listen to this, was that 25% of your data center space, you said, should be allocated to this Groq-LPU-GPU combo.

**Jensen Huang** (3:50)
We should add Groq to about 25% of the Vera Rubins in the data center.

**Jason Calacanis** (3:55)
So can you tell us how the industry looks at this idea of creating this next-generation form of disaggregation, prefill/decode disaggregation, and how do you think people will react to it?

**Jensen Huang** (4:06)
Yeah, let me take a step back. At the time that we added this, we went from large language model processing to agentic processing. Now, when you're running an agent, you're accessing working memory, you're accessing long-term memory, you're using tools, you're really beating up on storage really hard. You have agents working with other agents. Some of the agents are very large models, some of them are smaller models, some of them are diffusion models, some of them are autoregressive models. And so there are all kinds of different types of models inside this data center. We created Vera Rubin to be able to run this extraordinarily diverse workload. My sense is, and so we used to be a one-rack company, and we now add four more racks. So Nvidia's TAM, if you will, increased from whatever it was to something, call it 33% to 50% higher. Now, part of that 33% or 50%, a lot of it's going to be storage processors, called BlueField. Some of it, a lot of it, I'm hoping, will be Groq processors, and some of it will be CPUs. And a lot of it's going to be networking processors.
