June 05 2026 - Bots Outpace Humans: Pay-to-Crawl Web & Local AI

The AI Signal & The AI Noise

June 5, 2026

Cloudflare warns bot traffic now outpaces humans, pushing a 'pay-to-crawl' web. We cover on-device agent tech like OpenJarvis (near-cloud accuracy at tiny cost), why companies aren't seeing promised AI savings, and Apple approving Poke for Messages - agents go mainstream.
Speakers: Taylor
**Taylor** (0:00)
Welcome back to AI Signal & Noise. It is Friday, and we have some absolutely wild stories to get into today. Dude, the internet is changing so fast. And honestly, it is getting a bit weird out there.

**SPEAKER_2** (0:16)
Hey, everyone. Yeah, today we are looking at how bots are taking over the web, why corporate AI isn't saving companies any money, and some really cool on-device agent tech. Ready to dive in, Taylor?

**Taylor** (0:30)
So ready. First up, did you see what Cloudflare's CEO just said about the future of the internet? I saw this on the Decoder and it completely blew my mind, dude.

**SPEAKER_2** (0:41)
Oh, Matthew Prince? Yeah, he says bot traffic is officially outpacing human traffic online now.
That is crazy because it is happening years ahead of his original late 2027 forecast.

**Taylor** (0:54)
Exactly. He blames AI agents for this massive surge, and because of that, he says the future of the web is going to be pay to crawl, like no more free scraping for AI models.

**SPEAKER_2** (1:07)
Wait, pay to crawl? That is a massive shift. So search engines and AI companies will have to pay websites just to index their content? How would that even work in practice?

**Taylor** (1:19)
Right. Right now, bots just scrape everything for free.
But if they are hogging all the bandwidth and not bringing real human visitors to the site, why should publishers allow it anymore?

**SPEAKER_2** (1:32)
Honestly, I get his point. If 90% of my traffic is just LLMs training or agents fetching data, they are just draining my server resources without giving any actual value back.

**Taylor** (1:45)
Exactly. Why should creators pay for hosting just so some massive AI company can scrape their work and keep users on their own platform? It is totally unfair to the creators.

**SPEAKER_2** (1:58)
But wait, what about smaller creators?
If only big players can afford to pay to crawl, do small sites just become completely invisible to AI search engines? That is a scary thought.

**Taylor** (2:12)
Dude, that is a really good point. If you cannot afford to get crawled or if you block bots, you might just disappear from the modern web entirely.
It is wild to think about.

**SPEAKER_2** (2:25)
It feels like the open web is fracturing. We might end up with a web of toll booths where only the wealthiest AI companies can access the best data.
Not great for the open internet.

**Taylor** (2:37)
It is wild to think about. But hey, speaking of AI agents running around, there is some crazy new open source tech that lets you run them completely locally on your own devices.

**SPEAKER_2** (2:49)
You mean OpenJarvis. I saw that research from Stanford. They released a framework that runs inference, agents, memory, and learning entirely on device. But is it actually good or just hype?

**Taylor** (3:03)
Dude, yes, it is so cool. They decomposed the system into five primitives: intelligence, engine, agents, tools and memory, and learning. And it is incredibly efficient, running entirely on your local hardware.

**SPEAKER_2** (3:19)
Okay, but local models are usually way less capable than giant cloud models like GPT-4, right? How does OpenJarvis actually compare to the big players we use every day?

**Taylor** (3:31)
Get this, it lands within 3.2 points of the best cloud models. And the marginal API cost is roughly 800 times lower. Like that is an insane saving.

**SPEAKER_2** (3:45)
800 times cheaper?
Wow, I guess that makes sense since you're not paying for constant cloud API calls. But what kind of hardware do you need to actually run this?

**Taylor** (3:56)
They designed it to run on personal devices. The whole goal is to let your phone or laptop learn your habits locally without sending your private data to some corporate cloud server.

**SPEAKER_2** (4:09)
Now, that is a feature I can get behind. Having your personal memory and learning stay on device is huge for privacy.
I do not want tech giants tracking my daily routine.

**Taylor** (4:20)
Exactly. Imagine an assistant that knows everything about you, but it is completely secure. It can use tools and learn on the fly.
It is the dream, dude. Seriously.

**SPEAKER_2** (4:33)
It is definitely a step towards true personal assistants. But speaking of agents, it seems big corporations are trying to use them too, though with some pretty mixed results lately.

**Taylor** (4:45)
Oh, man. Are you talking about that Bain study?
Dude, this one is so funny. Apparently, companies are missing their AI savings targets because humans keep getting in the way.

**SPEAKER_2** (4:57)
Yes. Bain surveyed almost a thousand companies, and nearly 40% achieved less than 10% in AI cost savings, even though they targeted much higher numbers.
