**Julian Goldie** (0:00)
MiniMax M3 plus Hermes AI just changed everything. Let me show you what I mean. So for example, I've built out this mission control with MiniMax Studio built in. What this means is I can generate images, videos and voice in one single tab in this beautiful mission control, like you can see. And it's super fast. So if we just give it a random example, okay, "create a dragon flying over Tokyo", that's going to generate an image, and in my own tests it creates it very quickly, directly inside this studio. Recently I built out this agent operating system, something I've been waiting on for a long time, called my Agent OS. And this week, MiniMax and Hermes made it 10 times more powerful. So let me walk you through it, because I think it changes how you think about AI for your business. Here's the big idea first. This is the one thing I want you to see. By the way, look at that image. How good is that quality? And then you've also got the videos that you can generate here as well. So if we play these back, the quality of these videos is absolutely wild, right? Really nice. So basically, the way that I'm looking at this is most people are using Hermes as a chatbot. Let me show you an example. If we pull up Hermes inside the chat here, most people are just going back and forth with Hermes inside the terminal. And it doesn't look anywhere near as good. So I started to think about AI agents as my team that I could build around this whole process. If you stop thinking about Hermes as this sort of terminal and start thinking about how you can control it, how you can build out a team of AI agents that can create images, videos, even podcasts using this whole setup, and then save everything in one place so that you can preview it and use it whenever you want to, well, that's going to make it way easier, more powerful and more autonomous, right?
So the old way, and this is probably the way that you've been using AI without even realizing, is you go into your terminal and you open up Hermes, you type a question, it answers, you might get it to do something, and then you might paste that output somewhere else, or you try and open up Finder. Maybe you do the next bit, you copy that, you paste it again. But you're the glue, right? You're running between 10 different apps. Maybe you've got Hermes, maybe you've got Claude somewhere else, maybe you've got ChatGPT somewhere else as well.
And so you're running between like 10 different apps, holding it all together with your own two hands. And the second you close the tab, the AI really does forget everything. It's got a memory, but it's not that great. You start from scratch every single time. And that's not really having a team. That's just kind of like a vending machine where you put in a question and the answer falls out. And it forgets you the moment you walk away. The new way is completely different. This is where an Agent Operating System comes in. So picture a mission control, one screen, and on that screen, all your AI workers sit side by side. Claude, OpenClaude, Hermes, they all live in one place, they all share the same memory. They actually know what the others are doing. They're a team, not a pile of strangers. And that's the Agent Operating System. So you have one dashboard, everything is ready to go. You don't even need to type Hermes into the terminal. And then this is where it all changes, because of the new model, MiniMax M3, which literally just dropped about 24 hours ago, something like that. We've plugged it into this Agent Operating System and turned it into a proper studio that runs itself. And here's why it's a big deal. So MiniMax M3 just dropped on June the 1st, right? It really just came out, and it has a giant memory. M3 can hold about a million words in its head at once, a million. So imagine having a stack of books taller than you in front of you, and it remembers every single page whilst it works. If you look at that versus smaller models, this is way more powerful because it has a much better memory. The old AI was a bit of a goldfish. This one never really forgets your stuff. Now, why does that matter for you? Well, you can hand it the whole thing: all your notes, every past email, your entire content folder. It can work with the full picture, not a tiny slice.
So for example, a freelancer could drop in every client brief they've ever written and ask it to find what their best projects had in common. Number two, and this is where it gets really powerful, is that Hermes Agent can create images, it can create videos and it can create sound. So it's not just a writer. It can look at things, it can create things, it can build things, and it can turn Hermes from a basic sort of chat interface into a full studio. Not only that, but you've got the power of, for example, Hermes Goal Mode. Now this is really important as well, because MiniMax M3 in particular can work on its own for hours. And the tests actually found that MiniMax could run for 24 hours autonomously on the benchmarks. You can see that case study right here. Now, I want to be careful and honest, because these numbers come from MiniMax's own tests, not from outside checkers, but here's what they showed. They handed M3 one brutally hard job, no answer to copy. They just said, go figure it out. And it ran on its own for about 24 hours straight. It made 1,959 separate moves. It tried 147 different times with no human intervening the whole way. So 24 hours alone, nobody in the room. Now, here's the part that I keep thinking about, which is, if you're using ChatGPT, or even if you're just using Claude most of the time, most other AIs give up. The MiniMax team said it plainly. Most models will quit after about 30 tries and stop on their own. M3 didn't. Its best answer didn't actually show up until try number 145. So it hit wall after wall, got stuck again and again, and just kept going, like a worker who refuses to clock out until the job is properly done. Now, think about how rare that is, not just in AI, but in the world in general. Who's going to work for 24 hours straight and get stuff done like that? Very few, right? So why should that matter to you?
Because the jobs that eat up your week aren't the quick ones. They're the long, grinding ones: sorting years of data, going through a giant spreadsheet line by line, cleaning up a huge messy document. Those are exactly the jobs that need a worker who won't quit at try number 30. So for example, you could hand it a year of messy invoices and just let it sort and check every single one overnight. And that's the difference. The old AI was a sprinter, good for a quick burst, then done. But M3 is actually a marathon runner. Let's test another thing. So we're going to create a video, "a cinematic aerial flyover of a dragon flying over a glowing futuristic city at night". We'll hit generate on that and it'll start generating the video. Now, whilst we're waiting for that, let's talk about this, because this is where it gets fun. A smart brain on its own just sits there, right? You have to talk to it. You have to babysit it. And that's still the old way. What you actually want is a worker. You give it a job, you walk away, you come back to finished work. And that's where Hermes comes in. So Hermes is an AI agent, and M3 is the brain you plug in. Hermes is the body. The brain thinks, the body goes and does the work. It can open files, it can run tools, it can make the thing. And it keeps going on its own. So basically what I did here is I took the brain, plugged it into Hermes, and now both of them sit inside my Agent OS, next to Claude and OpenClaude, all sharing the same memory as well. And this is one of the other cool things about using M3 in particular: because it's a very powerful model with a million-token context window, it holds a lot of memory and it works very well with memory. So it can, for example, update my notes, organize my memory vault, and then organize those notes into a beautiful system like you can see right here.
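The brain-and-body split, and the point about not quitting at try 30, can be sketched as a loop that keeps proposing and checking answers until a try budget runs out, keeping the best result seen so far. This is only an illustrative sketch, not how Hermes or MiniMax M3 actually work: `run_until_budget`, `propose` and `evaluate` are hypothetical names, and the toy functions stand in for a real model call and a real scoring step.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    number: int    # which try produced this result (1-based)
    score: float   # higher is better
    output: str    # the candidate answer


def run_until_budget(
    propose: Callable[[int], str],     # hypothetical "brain": returns a candidate
    evaluate: Callable[[str], float],  # hypothetical "body": checks and scores it
    max_tries: int,
) -> Optional[Attempt]:
    """Try up to max_tries times and return the best-scoring attempt."""
    best: Optional[Attempt] = None
    for n in range(1, max_tries + 1):
        output = propose(n)
        score = evaluate(output)
        # Keep only strictly better results, so ties favor the earliest try.
        if best is None or score > best.score:
            best = Attempt(number=n, score=score, output=output)
    return best


# Toy stand-ins (assumptions for illustration only): the only good
# candidate appears on try 145, mirroring the figure quoted in the video.
def toy_propose(n: int) -> str:
    return f"candidate-{n}"


def toy_evaluate(output: str) -> float:
    n = int(output.split("-")[1])
    return 1.0 if n == 145 else 0.1


quitter = run_until_budget(toy_propose, toy_evaluate, max_tries=30)
marathon = run_until_budget(toy_propose, toy_evaluate, max_tries=147)
print(quitter.number, quitter.score)
print(marathon.number, marathon.score)
```

With a budget of 30 the loop never reaches the good candidate, while a budget of 147 does. That is the sprinter versus marathon runner difference in miniature.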
Not just that, but this is where it gets really interesting too. You can turn this full setup into a studio as well. And the big difference here is, if you look at the old way, how are you going to find the images you created? How are you going to find the video that you created? How are you going to see all the projects you've created inside Hermes, directly inside the terminal? It's going to be very hard for you, right? Whereas if you go inside the workspace that we've set up in the Agent OS, you can see all the projects that we've created previously. We can preview videos that we've created previously. We can see the images of everything that was built out. So it's just a lot more powerful, because you can access everything, and you save time, in one single place. And so inside Hermes, what I actually did was just build a simple box where you can type what you want and make it. That could be an image. It could be a short video. It could be a voiceover. And then it saves everything into neat folders for you. So when we create all this stuff, not only can we preview it between the tabs, but we can also go to the workspace and see everything inside the individual folders, which is great too. And so in one window, you could write a script, generate a voiceover, make images, et cetera, all in one place, all connected, all saved automatically. Before this, you'd need like five different apps and five different logins to do it. Now it's inside one box, inside one dashboard, running on one brain that doesn't forget and a body that doesn't quit. That's the Agent Operating System. And it's really the new way here. Now, if you're sitting here thinking, this sounds amazing, but there's no way I could set this up, I'm not techie, I hear you. That's the number one thing most people tell me.
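For anyone curious what "saves everything into neat folders" might look like under the hood, here is a minimal sketch. It assumes a workspace laid out as one folder per project with a subfolder per asset type. The folder names and the `save_asset` and `list_assets` helpers are hypothetical and not taken from the Agent OS shown in the video.

```python
import tempfile
from pathlib import Path

# Hypothetical mapping from asset type to subfolder name.
FOLDERS = {"image": "images", "video": "videos", "voice": "voiceovers"}


def save_asset(workspace: Path, project: str, kind: str,
               filename: str, data: bytes) -> Path:
    """Write one generated asset into workspace/project/<type folder>/."""
    if kind not in FOLDERS:
        raise ValueError(f"unknown asset kind: {kind}")
    target_dir = workspace / project / FOLDERS[kind]
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / filename
    path.write_bytes(data)
    return path


def list_assets(workspace: Path, project: str) -> dict[str, list[str]]:
    """Return the saved file names per asset type, for a simple preview list."""
    result: dict[str, list[str]] = {}
    for kind, folder in FOLDERS.items():
        directory = workspace / project / folder
        if directory.is_dir():
            result[kind] = sorted(p.name for p in directory.iterdir())
        else:
            result[kind] = []
    return result


# Demo in a throwaway directory, with placeholder bytes instead of real media.
with tempfile.TemporaryDirectory() as tmp:
    ws = Path(tmp)
    save_asset(ws, "dragon-tokyo", "image", "dragon.png", b"fake image bytes")
    save_asset(ws, "dragon-tokyo", "video", "flyover.mp4", b"fake video bytes")
    listing = list_assets(ws, "dragon-tokyo")
    print(listing)
```

Keeping one folder per project and one subfolder per asset type is what makes a single workspace view possible, since the dashboard only has to walk a predictable directory tree.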