Gemma 4: Run Hermes Free Forever! artwork

Gemma 4: Run Hermes Free Forever!

AI News Today | Julian Goldie Podcast

June 4, 2026

Run Gemma 4 Locally in Hermes Agent (Free AI Automation + New Web UI Setup)The script explains how to combine Google’s newly released Gemma 4 12B model with Hermes Agent to run a free, local AI agent designed for agentic reasoning and automation.
Speakers: Julian Goldie
**Julian Goldie** (0:00)
Gemma 4 and Hermes Agent, you can automate anything for free now. So you can plug Gemma 4 12B, that just dropped today, directly into Hermes Agent. Now, why would you do that? Because with Hermes Agent, this is an AI agent, and you can plug in the brain like Gemma 4, and then it's free, it's local, and it's designed for agentic reasoning. So this is a powerful way to combine the best of both worlds, a free model with one of the most powerful AI agents in the world. If you're wondering how do you get this set up etc, I'll show you exactly how to build with it, and what we've created using this, plus some examples of agentic tasks. If you're wondering, okay, what can you build with Hermes Agent? Here's a bunch of examples. We actually built out a game, quite a few games, we created like a Pomodoro timer here, a color palette, even like these animations, which is pretty cool. So let me show you these. These were all built with our agentic operating system and Hermes Agent combined, like you can see. Now, if you want to get this set up, basically what you can do here is you can go into and Ollama have the latest model. So all you do is you download, then you're going to run it and then you can run this command, which is Ollama launch Hermes model, Gemma 4 and get started with this. And you can see this model is ready to go right there. So that's how you can plug it into Hermes agent pretty quickly. Also, if you're wondering, so basically there's a new web UI with Hermes agent that makes this even easier. So if you don't want to use this terminal, you can just go to models over here and then you can change the main model and you can select Ollama with Gemma 4 once you've got it installed and downloaded, which is pretty cool as well. So you can configure this. Now, the way that I would actually configure this, and this is using the new web UI from Hermes agent, which makes it even more easy and powerful to manage, is with this setup, you could have your main model as, for example, like step 3.7 flash, and then you can actually set up Gemma as the sub agent, right, because Gemma 4 is not like a super high advanced reason model, but it is fast, it is free, it is local. And so you would save on tokens if you had it set up as a sub agent. And then Hermes Agent could have a main model, which is like the agentic reason model. And then it could use Gemma 4 as the local model for sub agent tasks, the smaller stuff basically, for auxiliary tasks, which is pretty cool. Now, for example, we can also use something called Gemma 4 26B. So if you don't have an amazing setup and you can't run Gemma 4 locally with Hermes Agent, then what you can do instead, and this is pretty cool, is you can get a free API.
And this is Gemma 4 with 26B that you can get for free. And you can also get 30 as well. So you can grab the API for this, and then you can plug it into Hermes Agent. So for example, if we go into the agentic OS, and then we configure an auxiliary model or a main model, we could go down the model list, go to OpenRouter, and then we can select Gemma 4 from the list right there. And you can see, for example, we've got it ready to go. So we can select that as the main model, and now Gemma 4 is ready to go on now. Right, and we've also got 31B here too. Both of those are free APIs that you can use. So if you don't have a local set up, no problem. You can use a free API with Gemma 4 too. Now, if you're wondering, okay, what can you build with this? How can you use it, et cetera? Let me guide you through that. So basically, the way that I would look at it is that two halves, right, on their own. So Hermes is the body. It's always on, it's living on your machine. It's hands on, it's working 24 seven. Gemma 4 is the brain. So Google's brand new model that runs for free. And you bolt them together and you get an AI agent that quietly could run your morning brief or clear your inbox or write into your own notes or research for you. Even build working apps. Everything below is some examples of what we've created using this.

8 more minutes of transcript below

Feed this to your agent

Try it now — copy, paste, done:

curl -H "x-api-key: pt_demo" \
  https://spoken.md/transcripts/1000651996090

Works with Claude, ChatGPT, Cursor, and any agent that makes HTTP calls.

From $0.10 per transcript. No subscription. Credits never expire.

Using your own key:

curl -H "x-api-key: YOUR_KEY" \
  https://spoken.md/transcripts/1000771219084