How People Actually Use AI Agents

**Nathaniel Whittemore** (0:00)
Today on The AI Daily Brief, a new study about agent autonomy in practice from Anthropic. And before that, in the headlines, Google Gemini now allows you to create music. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
All right, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, KPMG, Assembly, Robots and Pencils and Blitzi. To get an ad-free version of the show, go to patreon.com/aidailybrief, or you can subscribe on Apple Podcasts. To learn about sponsoring the show, send us a note at sponsors at aidailybrief.ai. Lastly, a reminder once again about our latest ecosystem projects. Claw Camp, the free self-directed program where you can learn how to build agents and agent teams using OpenClaw, is kicking off its first sprint right now. So if you want to learn to be an agent boss with nearly 3,000 friends, come join us. You can find that at campclaw.ai or from the AI Daily Brief website. If you are a company trying to figure out OpenClaw and other agent strategies, check out enterpriseclaw.ai. And more broadly, if you are just interested in keeping track of these types of educational programs we're doing, free and premium, you can find information about that at aidbtraining.com. One thing happening there, we actually have a premium program survey as we are trying to figure out exactly which premium programs to launch first. If you are an enterprise or premium buyer who is interested in that, again, you can check it out at aidbtraining.com. Now with that barrage of URLs out of the way, let's talk about Gemini and music.
Today we kick off with Google's continuing quest to have AI products in every single multimodal category. The latest news is that the company has launched an AI music generator called Lyria 3. It's the latest version of DeepMind's music generation model and allows users to generate music clips based on text, images, or video inputs, which is pretty unique compared to something like Suno, which is of course just text-based input. Lyrics can be generated in 8 different languages including German, French, Spanish, and Hindi. The feature can be accessed directly in the Gemini app by switching to a musical output. It's also being added to YouTube's Dream Track tool to allow creators to quickly generate soundtracks for YouTube Shorts. Each track is accompanied by custom cover art generated by Nano Banana. Now, previous versions of Lyria have only been available through Google Cloud's Vertex program, so this is a big expansion in access. However, there is a pretty significant limitation, which is that these are 30-second clips. The model itself isn't really capable of building on top of the initial generation, so this feature won't be useful to generate entire songs. However, and it's pretty clear that this is the use case they're imagining initially, this could be extremely useful for generating background music for YouTube Shorts or fun, interactive, personal types of song messages. Indeed, this appears to be what Google had in mind, with them writing, "The goal of these tracks isn't to create a musical masterpiece but rather to give you a fun, unique way to express yourself." And really, while it would be tempting to compare this to Suno, this is actually more of a social feature than anything else. We've talked in the past about how one of the really interesting things about Suno is the extent to which it is used not for any sort of professional or work music generation but just as a fun interactive mode, and Lyria really seems to be doubling down on that.
Google has also embedded their SynthID audio watermarks into the music so the clips are easily flagged as AI generated. A lot of the discourse around the first tries is that this is indeed not Suno, and that Suno's generations feel much more polished and musically complex. On the flip side, others point out how Google just keeps adding new arrows to its multimodal quiver. Aaron Upright comments, "People talking about OpenAI vs. Anthropic and Gemini's just over here quietly getting more powerful. People underestimate the importance of an easily accessible multimodal platform when it comes to adoption." Tian Zhao sees the future, writing, "Video to audio alignment is the real flex here. Generating lyrics and vocals that actually sync with visual cues in real time is a massive multimodal serving challenge. Lyria 3 probably relies on some crazy high-throughput infra to keep the latency low enough for creative workflows." Ultimately, I think we are just scratching the surface on what role generated music is going to play, and Google is now firmly in that game as well.
Next up, a bit of a controversy that ended up being less of a controversy than it seemed but still taught us some interesting things around the state of competition. A change in Anthropic's terms of service triggered a tinderbox of complaints from those using Claude to power their OpenClaw agents. This week, Anthropic changed their policies, now stating, "Using OAuth tokens obtained through Claude Free, Pro, or Max accounts in any other product, tool, or service, including the Agent SDK, is not permitted." Now to clarify, OAuth tokens are kind of like API keys for regular Anthropic subscriptions, allowing users to access AI models through third-party apps. And, of course, a lot of the attention is around the people who have been using their Claude Max accounts to power their OpenClaws. Indeed, Alex Finn writes, "This is going to piss off a lot of OpenClaw users paying $200 a month."
Tweets like this one were too numerous to count. Hubert Lepicki writes, "Anthropic is in an active self-destruction mode now. First they went after tokens you already paid for, blocking use in non-Claude Code apps, then they sent their lawyers after developers for supposed branding infringement, and now this. OpenCode, Gemini CLI, Codex CLI are all legitimate coding agents with comparable features and abilities, but Anthropic are behaving like they're still the only player on the block." Now, all of this caused Anthropic's Thariq Shihipar to comment, writing, "Apologies, this was a docs cleanup we rolled out that's caused some confusion. Nothing is changing about how you can use the Agent SDK and Max subscriptions." He added that the intention isn't to block personal tinkering, but rather to force third-party businesses to pay for usage through the API. Unfortunately, the confusion only continued with that unclear clarification. Podcaster Felix Javin wrote, "Brother, can you just tell us whether we can use OpenClaw or not?" And it seems like if you're using it to build your own personal agents, the answer is yes, but the incident raised a ton of discussion about how long the big AI labs will continue to support these modular AI use cases. Some tried switching providers only to find that Google had already banned OAuth for the use case. Richard Holcomb wrote, "I feel like getting banned by Google for using Antigravity OAuth with OpenClaw is a rite of passage. I was already not impressed with Gemini, preferring Anthropic and OpenAI, but now I really have a bad taste in my mouth." Colin Darling, however, pointed out, "Everyone upset about Anthropic's update to their terms would be wise to read the OpenAI and Google Gemini terms while they're at it. I'm bummed out too, but Anthropic is late to this party, not leading it." In any case, the controversy quickly faded, but there is a lingering question about walled gardens and what they're going to mean for AI going forward.
Next up, more news in the AI wearables category. Meta has revived plans to release a smartwatch as part of their AI device lineup. Rumors of a Meta smartwatch started circulating in late 2021, complete with leaked photos of a prototype. The device was given the internal code name Malibu and featured two cameras, one in the dial for video conferencing and another on the underside of the watch. The idea was that users could quickly remove the watch to take a photo. Another big part of the design brief was the ability to read nerve signals in the wrist, allowing the device to be used as a controller. This concept has since gone on to feature in Meta's app to control wristbands for their Orion smart glasses prototype, which was unveiled in late 2024. That said, by the summer of 2022, Project Malibu was killed off and Meta shifted focus to smart glasses as their big wearable play. Now The Information reports that Meta has revived the smartwatch under the code name Malibu 2. The watch is said to include health tracking features and a built-in Meta AI assistant. Sources said the revival effort came out of a project strategy meeting late last year. Executives are reportedly concerned about a bloated product lineup for augmented reality glasses, so they have delayed some products to focus on a limited number of concentrated bets. Among them is a new version of the Ray-Ban Display, which is expected later this year, as well as a pair of AR glasses which could arrive in 2027. The smartwatch is planned for release this year, putting Meta in direct competition with Apple and Google in the category. One thing to watch for will be how far each company goes in making the smartwatch an integral part of their wearable AI stack. Earlier this week, we covered rumors that Apple was working on a trio of new AI-enabled devices, namely smart glasses, a pendant, and a camera-equipped version of the AirPods.
That report mentioned that a camera-equipped version of the Apple Watch had been passed over as an AI device, with testers reportedly finding the prototype impractical due to clothing sleeves obscuring the camera. Ultimately, we don't know how Meta is thinking about the Malibu 2, but they are very clearly focused on this wearable category as a place for their AI strategy.
Next up, another follow-up in the Grok 4.2 public beta. xAI has announced a new version of Grok Heavy, and this one goes to 16. The big innovation with Grok 4.2 was the inclusion of four sub-agents to debate responses before providing a final answer. Opinions were a little mixed on whether this was actually a useful feature, but it's an interesting experiment if nothing else. Grok Heavy turns the sub-agent count all the way up to 16 in a bid to either get better answers or at least burn through a ton of tokens getting an output. xAI community promoter Ted Suo shared an output from the query, "How does chaos birth cosmic order?" The agents debated the response for a little over a minute and then delivered a 700-word report using almost 900 references. It's difficult to judge accuracy or usefulness based on such a strange, subjective question, but the output certainly has a ton of detail and is an interesting read. If nothing else, these continue to be interesting experiments and worth watching for that reason alone.
Lastly today, Chinese models. Lindy founder Flo Crivello recently shared a thread about the difference between Chinese models on benchmarks and Chinese models in the real world. He wrote, "By far our biggest cost at Lindy is inference, so believe me when I say we've looked at these models very closely and continue doing so. Them actually delivering on the claims would make a material difference to the business. But every time we've evaluated them, we've found the same thing: that their real-life performance for agentic behavior and outside of coding use cases falls extremely short of what they show on the evals." "I think the industry consensus is right," he continues. "These Chinese labs are 1) distilling frontier models, duh, which leads to a more shallow intelligence, 2) training for evals, 3) potentially stealing weights. Not saying these models will always be bad or that these labs are completely incompetent. They're doing a fine job, but it's delusional to think they're actually at Sonnet and Opus level. They're still at least one generation behind. Take the evals with a huge grain of salt." That, I think, is a lesson that is relevant not just for Chinese labs but also whenever you see a new Western model that has high benchmarks. Ultimately, you gotta just dive in and test these things out for yourself. And with that we will end today's headlines. Next up, the main episode.