**Andy** (0:03)
Good morning, everyone. It's time to get into AI.
701, and I'm starting the show myself because I'm alone. But you're not alone.
Come join me. It doesn't look like I'm in the right show though. I'm scheduled correctly. And well, anyway, somebody may drop in and join me, but I'm here to give you the top line news about everything AI. And today is Friday, May 29th, 2026 And we're at about, I don't know, 733 is the number of the show, perhaps, somewhere in that range. And the big news today is that Anthropic, yesterday afternoon, dropped their Opus 4.8 frontier model. Hi, Beth. Glad you could join me.
**Beth** (1:03)
I was in the wrong studio.
**Andy** (1:05)
Oh, I thought I was too, because I was alone. I'm glad we converged on each other. So anyway, I wanted to just give the highlights of what the Opus 4.8 model is, with particularly advances in coding and its ability to read and write, and I think one of the fairer evaluators out there is an organization that you have a lot of attachment to, which is Every, and Dan Shipper and his team, by the way, have published what their reaction was to this, and I'm going to read it. This is Dan Shipper's, his response to the release of Opus 4.8. He said, after a year of writing Cloud code into the rest of knowledge work, that lab, and the lab hit a rough patch, and Opus 4.7 was hard to love.
OpenAI's Codex desktop app, say that three times fast, pulled even devoted Cloud users from our team, inside every, to the GPT models. Opus 4.8, out today, he wrote this yesterday, has us running back for the model, if not the app around it. So he makes a distinction between Opus 4.8 and Codex application working with you on your desktop, as compared to the Cloud application, which splits chat, co-work, and code, Cloud code. So you have tabs for each of those three. Those are separate sessions. There's separate conversations. But if you're working with the Codex app, you're working with one conversation and it unifies that. I think eventually Cloud code will have, and Cloud as a group will have co-work capabilities and Cloud code capabilities unified in that way. But setting that distinction aside that they don't prefer Cloud for that reason, they do clearly prefer Cloud 4.8 model for its coding abilities. So it tops their senior engineer benchmark, that's an internal benchmark that they've built to evaluate these tools. It also tops their writing tests. It's the first anthropic release in a year, he said, that we reach across for coding, pros, and everyday work.
Really giving it the top of the charts, thumbs up.
Anyway, I don't know if you've got some additional points about Claude 4.8.
**Beth** (4:04)
So it's interesting because Dan wrote at the beginning of the week or end of last week, his power user Codex, this is how I'm using it. I don't know if he said in this that he had early access, they had early access. They often do, they may not have this time. But he, like, his power use of Codex includes a daily run of specific things that he wants to be briefed on in the morning.
And just one of the reasons that I follow them and him is because they're so intentional and thoughtful about the way that they're using AI.
The one knock I saw from Katie, who is one of the writers on staff.
She screenshot her something that she was writing, and she was writing about a workflow, right? That is a word that previously did not immediately make Claude think something else. And when she wrote workflow, it turned blue. And so what she was posting was, I can see that there's some adjustment that's going to need to happen here for me. And the reason that that matters is because workflow is now a trigger for Claude. That means that you want to have it initiate a workflow, which is multiple sub-agents. And Claude tasks them with what you've asked for. But they're not... Like currently, in 4.7, when you task Claude with something that it's going to spin up a sub-agent for, even if it's spinning up multiple, they all have individual goals. This is more of a team that's got the same goal, and they're going to go about it in different ways and check each other's work and come back and even argue about what the specific, right? Argue about whether that was the right call or the right approach. The idea is with that adversarial system, you're going to be able to get more done that is more complicated.
37 more minutes of transcript below
Try it now — copy, paste, done:
curl -H "x-api-key: pt_demo" \
https://spoken.md/transcripts/1000651996090
Works with Claude, ChatGPT, Cursor, and any agent that makes HTTP calls.
From $0.10 per transcript. No subscription. Credits never expire.
Using your own key:
curl -H "x-api-key: YOUR_KEY" \
https://spoken.md/transcripts/1000771241292