**Nathaniel Whittemore** (0:00)
Today, we are looking at 51 charts that tell the story of artificial intelligence heading into next year. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
Now, we are in the midst of end-of-year episodes, which are a combination, of course, of both looking back and looking forward. And this episode is all about the charts that sit right at that intersection. There are charts that tell us where AI is today and give us some idea of what we should be planning on heading into 2026. Given that there are 51 of these things, I am going to rip through them. So buckle up and let's talk about the 51 charts that explain AI in 2026.
Quick note on the production of this: the charts were all sourced entirely by me. Part of my process for preparing this show is spending a ton of time on X/Twitter and using those bookmarks heavily. And I have a folder where I actually keep these types of charts. So step one was just going back and looking at the charts that I thought were most reflective of the current moment and had something to say about the year we're heading into. The second part of the process was outlining a somewhat rough organization of which charts I wanted to include. From there, I turned it over to Claude, ChatGPT, and Gemini to see how they would organize it. I liked Opus 4.5 best, so we went with that with a few tweaks. And then I handed that and the charts off to GenSpark and Manus to put it all together. And while GenSpark looked much better, it made some really weird leaps in terms of how it was describing things and had some errors. So ultimately we went with Manus, which was then exported to Google Drive for a final edit by me. Apologies to those of you who don't care about that. I just think a lot of you are also interested in the operator and production side of AI, so I like telling you how these things get put together.
All right. As you can see, we've divided this into seven categories: capabilities, infrastructure, markets, economics, vibe coding, jobs, and politics. We kick off with capabilities. First chart comes from OpenRouter and is the reasoning versus non-reasoning token trends over time.
You've probably seen this one a couple of times now. Basically, at the beginning of 2025, reasoning models were not yet really a thing. OpenAI had announced o1-preview back in September, and it had finally become available at the very end of December. But we were just starting to get our hands on these things. That would change dramatically over the course of the last year, and by November of 2025, reasoning tokens represented meaningfully over 50% of tokens. This has brought with it new capabilities, new use cases, and new ways of thinking about how we scale.
Our next chart is the one that for much of this year held up the entire world, it felt like. This is the chart from METR that measures the time horizon of software engineering tasks that different LLMs can complete at 50% and 80% success rates. So the task duration here is not how long the model works for independently, it's how long in human-equivalent time a task takes to complete. Coming into this year, METR had shown a doubling of capability roughly every 7 months, but it had started to speed up to closer to 4 months, and this year reified that 4-month doubling time. In these charts, you can see the 7-month doubling line in green and the 4-month doubling line in red, and you can see how at 50% it hews really closely to the 4-month line, and at 80% it's mostly on the 4-month line with a few recent ones in between the 4- and the 7-month lines. Now, whether it's 7 months or 4 months, the point is capabilities have not plateaued. They continue to increase dramatically and quickly.
We are also seeing major efficiency gains. This chart shows the performance efficiency of Gemini 3 Flash, which is better performing than Gemini 2.5 Pro, which was state of the art just a few months ago, for around a third of the cost.
Especially as we move into a world where production workloads are getting bigger and bigger and we are consuming more tokens, the fact that it's not just capabilities, but also efficiency and costs that are improving is a big deal. Another measure of the efficiency gains came with GPT-5.2's performance on the ARC-AGI-1 exam. The ARC-AGI benchmark folks noted that between a tweaked o3 model last year and GPT-5.2 this year, there was a 390x efficiency gain in a single year.
Now, what this all adds up to in terms of when we get AGI is kind of anyone's guess. As you can see from this chart, people are all over the place in terms of when they think we're actually going to get AGI. By the way, there's no common definition of AGI, and there are even plenty of folks out there who think that the term is getting more and more meaningless. One interesting note is that I think that, if anything, people's timelines actually got moved back slightly heading into 2026 from where they were heading into 2025, despite all these capability gains. Andrej Karpathy in particular, in a big interview he did, might have single-handedly set back the timeline a couple of years.