Breaking down the 2026 Stanford AI Index Report

**SPEAKER_1** (0:02)
Welcome to the Practical AI Podcast, where we break down the real world applications of artificial intelligence, and how it's shaping the way we live, work and create. Our goal is to help make AI technology practical, productive and accessible to everyone. Whether you're a developer, business leader, or just curious about the tech behind the buzz, you're in the right place.
Be sure to connect with us on LinkedIn, X or Bluesky, to stay up to date with episode drops, behind the scenes content and AI insights. You can learn more at practicalai.fm. Now, on to the show.

**Daniel Whitenack** (0:41)
Welcome to another episode of the Practical AI podcast. Today, it's just Chris and I, my co-host and I, in what we call a fully connected episode where we try to keep you updated with some of the things that are happening in the AI news and maybe share some practical information that will help you level up your AI and machine learning game. I'm Daniel Whitenack. I'm CEO at Prediction Guard and I'm joined as always by my co-host, Chris Benson, who is a principal AI and autonomy research engineer. How are you doing, Chris?

**Chris Benson** (1:17)
I'm doing good. I'm excited. This is, we're doing, the episode we're doing today, we've done a number of times over the years. The Stanford AI Index Report, we get to go through it. It's always fun.
And kind of level set, kind of how things are changing. And gosh, I mean, things are changing so fast right now.

**Daniel Whitenack** (1:40)
Yeah. And for context, so some of you may or may not have listened to our previous episodes where Stanford's Human-Centered Artificial Intelligence, center, institute, I forget exactly what they call themselves. But the Human-Centered Artificial Intelligence effort there at Stanford, they published this AI Index Report, and they've been doing it for a number of years. We've talked about it before. If you're interested, we're not going to go into like how it was created. It's very rigorous. It's very data driven. You can go back and listen to episode 276.
We had some representatives on from Stanford that actually shared what it is, how it's created, and I'm sure that's updated somewhat over time, but that would be a great context for today. But there's a lot of takeaways here, Chris, and I think maybe we'll get through all of them. We can try rapid fire here to talk through some of these and share them with the audience and see maybe our reaction to some of these. Some of them were a surprise to me, to be honest, Chris.

**Chris Benson** (2:53)
Yeah, there always are, because, I mean, it kind of brings you back with the rigorous approach they have. We all have these perceptions. We're all watching the news and all the AI hot things that are out there.
And there's times where it kind of level sets you a little bit. And then other times, it kind of goes, and I mean, you know, just kicking us off on number one on their top takeaways list right off the bat. We kind of, this is one of those places where we were going one way. And then it didn't take the report. We kind of realized that things were changing back. But for a while, we were pretty convinced open source models were going to completely catch up with plateau models, because that's the trend that we were seeing for such a long time. We realized a little while back that that wasn't happening for a variety of reasons, which we've actually talked about on previous episodes. But the very first thing they mention is AI capability is not plateauing. It is accelerating and reaching more people than ever.
And yeah, I think we're seeing that in 2026.

**Daniel Whitenack** (4:00)
Yeah. So one of the ways they express this is that over 90% of notable frontier models were produced in 2025. And several of those now meet or exceed human baselines on a number of things. And they go into those things. Obviously, one of the things, one of the hot takes that I'm always sharing, Chris, are these baseline or these benchmarks, let's say on PhD level science questions.
They're very, they're interesting. And I think they're somehow representative of how we're advancing, but benchmarks in general are quite flawed.
So even with that, you know, caveat in there, it does seem like there is advancement that's, that's happening. And, you know, a lot of that reaching or exceeding human level performance is, is impressive and maybe it's scary for some people. I'm not sure, but.

**Chris Benson** (5:01)
