**Eric Siu** (0:00)
I was spending over $5,000 a month on AI agents. I did a full cost audit and cut it to $800 without losing a single capability. We're talking 84% reduction and the same output. The biggest waste was something I never would have caught without actually looking at the numbers. I'm gonna show you exactly what I found, what I cut, and the framework I used to do it.
So you can see on my screen now, the token costs are coming down significantly versus what they were before, right? The easiest way to see how much you can save with AI is to just ask AI. So this is my OpenClaw instance, and I'm working with it in Telegram. I just asked, hey, tell me how much money I save for each of these, and like you did with number one, give me some sexy points for each of these, okay?
One of the big things applies if you're paying for ChatGPT. I pay for Claude Max, which is 200 bucks a month, and I happen to pay 200 bucks a month for ChatGPT as well, just so you all know. You wanna have both pro versions, I think, so then you can take advantage. And guess what? OpenClaw is now owned by OpenAI. With OpenAI, you have the ability to plug in your OpenClaw with OAuth. That means that you're not running on the API token. So I am testing this right now on the different agents that we have. I have Phoenix and I have Oracle. Phoenix is my autonomous agent, and Oracle is my SEO agent, okay? Just by doing that, I'm saving $1,000 to $1,700 a month. I am now considering putting some of my other agents onto ChatGPT 5.4. There are trade-offs to all this. I would say ChatGPT 5.4 is better at coding. Now, when it comes to creative thinking or creative writing, Opus is still better at that. Okay, so you have to figure out what works for you. I still have my primary agent, my chief of staff, Alfred, on Opus, okay? But I'm looking at slowly moving everything over to ChatGPT 5.4, because then I can run on OAuth and save a lot more on tokens. That's a big one.
Now, the second thing is the different models that you're running. We have different cron jobs running with my OpenClaw.
So for business purposes, I might have a cron job that is running every day or so to check in on new sales leads that we should be reaching out to or deals that we should be reviving. Or every week, maybe I'm getting ideas from competitive YouTube channels on packaging I should be using. There's a lot of these cron jobs, scheduled jobs that you set, and you want them to run well. But here's the thing: if you're running a cron job on one of the latest, most expensive models, that's obviously going to cost you more money. Now, in this case, switching from Opus to Sonnet is saving me $630 a month. That's about $7,500 a year, right? We went from $2.50 per run down to $0.40 per run. So over time, we're talking about 84% less spend. You're basically firing expenses, and you want to be getting more efficiency while cutting your expenses.
So, let me go a little more into detail here and share an example. With Opus and Sonnet, we actually had a recruiting cron job that was running, I think, every 30 minutes or so, and it was running on the most expensive model, and that was costing us a lot of money. And it estimated that we had saved at least $1,000, or maybe even $2,000 a month from that, okay? So it's already saving a ton of money. You have to think about using the right models for the right situations, and that's how you're going to save a lot of money. By the way, if you're watching this right now, a lot of this is from a business standpoint, but that's how you should be thinking about it. So, let me give you kind of a midway takeaway here.
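The per-run figures above can be checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the figure of 300 runs a month is an assumption back-calculated from the quoted $630 monthly saving, not a number stated in the video.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


def monthly_savings(cost_before: float, cost_after: float, runs_per_month: int) -> float:
    """Money saved per month when the cost of each run drops."""
    return (cost_before - cost_after) * runs_per_month


# Per-run costs quoted in the video: $2.50 on Opus, $0.40 on Sonnet.
OPUS_PER_RUN = 2.50
SONNET_PER_RUN = 0.40

# Assumption: about 300 runs a month (roughly 10 a day), inferred from the $630 figure.
RUNS_PER_MONTH = 300

if __name__ == "__main__":
    reduction = percent_reduction(OPUS_PER_RUN, SONNET_PER_RUN)
    saved = monthly_savings(OPUS_PER_RUN, SONNET_PER_RUN, RUNS_PER_MONTH)
    print(f"Per-run reduction: {reduction:.0f}%")
    print(f"Monthly savings: ${saved:.0f}")
    print(f"Yearly savings: ${saved * 12:.0f}")
```

Under that assumed run count, the per-run drop works out to 84%, and $630 a month is $7,560 a year, in line with the "about $7,500" quoted. The same `percent_reduction` call on the headline numbers ($5,000 down to $800) also gives 84%.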
I think if you want to save on these tokens, if you're running this for business purposes, you're trying to grow your business, you're trying to build agents to automate a lot of the work away, you should be paying, again, for ChatGPT, the $200 version, and you should be paying for Claude, the $200 version. That way, you're able to use their pro versions and get the most bang for your buck. An example here is this. When I use Cursor and Claude Code to build deeper product builds, like a dashboard, for example, or maybe a mobile app, that's a deeper build that requires more engagement from my side. That's where I would be using Claude Max, the $200 a month plan, so I could get that $5,000 in value on the tokens there. That's why I think it's helpful to actually have both. And then obviously with Gemini, they have the best image models out there with Nano Banana Pro, right? So you have to understand what you're trying to accomplish, and then you can figure out what stack works the best for you.
And then I would say every month or so, you should just run this. Me trying to save money, me auditing my crons, this is what you should be doing, okay? Run this like clockwork every single month. In fact, take a screenshot of what I have on my screen right now. Go to whatever you're using right now. Maybe you're using Claude Code, okay? Maybe you're using one of the Claude apps. Say, hey, I want to save money on my stuff right now. I want to save money on my bills. How do I do this? And just let it ask you clarifying questions on how you can do it, and then you just start saving money, okay? It's the concept that matters.
Now, on to number three here: Sonnet by default across the fleet, okay? This will save us $300 to $500 a month or so. Sonnet is part of Claude. It's a lower tier than Opus, okay?
And the latest version of Sonnet is already as good as the last version of Opus. That's pretty damn good. And these token costs are going to continue to decrease. By the way, if you want to learn how to decrease your costs by maybe 95%, stay till the end.
So, number four over here, cron audit. Every week or so, we have my machine automatically running a cron audit, which is killing nine dead jobs and cutting compaction frequency in half. This will save minimal money. And I like tracking my savings from a cost saving standpoint, and then I like measuring it against the screenshot that I had from the beginning of this video. Overall, I mean, this is being conservative. We're talking about saving three, four grand a month or so, okay? 24 to 35 grand a year. I think we're going to be using more tokens, so I don't know if the savings are going to be realized, or we're just going to end up buying more tokens, okay?
So I asked one more question over here, just to give you guys a sense. I was like, hey, I'm going to go record a video right now. Can you give me more sexy numbers we can talk about? So, two more worth mentioning for you all. Number six, self-healing cron doctor. This runs four times a day. It catches broken crons before they burn tokens retrying and failing. My cron reliability went from 50 to 85%. Every failed run is wasted money. This is the janitor that pays for itself. You're damn right it does, okay? Browser use through a cloud API instead of local. We're running LinkedIn recruiting, Apollo pulls, and web scraping through a cloud API instead of spinning up local browser automation. Pennies per task versus running headless Chrome 24/7.
Quick break. If you want to run personalized LinkedIn ads and have personalized landing pages to convert your customers at a much higher rate on LinkedIn, check out Karrot. That's K-A-R-R-O-T dot A-I.
Karrot allows you to do things that LinkedIn ads does not allow you to do right now. Check it out. There are publicly traded companies such as SEMrush and Sitecore that are using this right now. And you can use this to get ahead, because if you don't use this, it will take you weeks, if not a little longer than that, to make these ads. Because again, you cannot do this on LinkedIn. So again, go to www.karrot.ai, and we'll see you on the other side.
All right, so if you guys want to save 95% on cost when it comes to AI, you gotta look at something like this, okay? Am I saying you need to buy a Mac Studio? Not really, but kind of. I think you need to have local infrastructure, okay?
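The self-healing cron doctor mentioned earlier is only described at a high level: it runs four times a day and stops broken crons before they keep retrying and failing. One way such a check could work is sketched below in Python. Everything here is hypothetical, including the `CronJob` record, the `doctor_pass` function, and the threshold of three failures in a row; it is not how OpenClaw or the setup in the video actually does it.

```python
from dataclasses import dataclass, field


@dataclass
class CronJob:
    """Hypothetical record of a scheduled job and its recent run results."""
    name: str
    recent_runs: list[bool] = field(default_factory=list)  # True = success, oldest first
    enabled: bool = True


def consecutive_failures(job: CronJob) -> int:
    """Count the failed runs at the end of the job's history."""
    count = 0
    for ok in reversed(job.recent_runs):
        if ok:
            break
        count += 1
    return count


def doctor_pass(jobs: list[CronJob], max_failures: int = 3) -> list[str]:
    """Pause every enabled job that has failed `max_failures` times in a row.

    Returns the names of the jobs paused on this pass.
    """
    paused = []
    for job in jobs:
        if job.enabled and consecutive_failures(job) >= max_failures:
            job.enabled = False  # stop spending tokens on retries until someone fixes it
            paused.append(job.name)
    return paused


def reliability(jobs: list[CronJob]) -> float:
    """Share of all recorded runs that succeeded, as a percentage."""
    runs = [ok for job in jobs for ok in job.recent_runs]
    return 100 * sum(runs) / len(runs) if runs else 100.0


if __name__ == "__main__":
    jobs = [
        CronJob("sales-leads", [True, True, False]),
        CronJob("recruiting", [True, False, False, False]),
    ]
    print("Paused:", doctor_pass(jobs))
    print(f"Reliability: {reliability(jobs):.0f}%")
```

Pausing a job instead of deleting it is a deliberate choice in this sketch: the failing job stops costing money, but its configuration is still there to repair and re-enable.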