Podcast transcript API for AI agents

Your agent can browse the web, read docs, and write code — but it can't listen to podcasts. All that insight, locked in audio files. Give it any episode as clean Markdown, with real speaker names, in one API call.

Try it now: search for any episode, or use the API key pt_demo with any endpoint. No signup needed.
$0.10 per transcript. Key delivered instantly — your first transcript in under 30 seconds.
Credits never expire · Errors are never charged · If it doesn't work, you don't pay

2,000+ transcripts fetched by developers building summarizers, RAG pipelines, and podcast tools

Skip the transcription pipeline

Download audio · Convert format · Run Whisper · Bolt on diarization · Chunk for LLM · Manage storage

You started building a podcast tool — and spent a week on infrastructure instead of your product.

spoken.md replaces that entire pipeline with one API call. You get speaker-labeled Markdown with real names — ready for your LLM's context window.

Real speaker names, not "Speaker 1"

Figuring out who is speaking is harder than transcribing the words. Most tools punt on this — you get "Speaker 1" and "Speaker 2" and a manual cleanup step.

Whisper + diarization

Speaker 1 (0:00)
Welcome to the podcast, where we discuss science and science-based tools for everyday life.

Speaker 2 (0:45)
Thank you for having me. Sleep is one of those things where small changes can have outsized effects.

Spoken

**Andrew Huberman** (0:00)
Welcome to the podcast, where we discuss science and science-based tools for everyday life.

**Matt Walker** (0:45)
Thank you for having me. Sleep is one of those things where small changes can have outsized effects.

No post-processing, no manual cleanup, no guessing.

How it works

Two API calls. Search by text or paste a URL from Spotify, YouTube, or any podcast app — then fetch the full transcript.

# 1. Search for an episode
curl -H "x-api-key: pt_demo" \
  "https://spoken.md/search?q=huberman+sleep"

# 2. Get the transcript
curl -H "x-api-key: pt_demo" \
  https://spoken.md/transcripts/1000651996090
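
The same two calls can be made from Python. This is a minimal sketch using only the standard library: the base URL, the paths, and the x-api-key header come from the curl commands above, while the helper names (build_request, fetch, demo) are illustrative. The shape of the search response is not documented on this page, so the sketch returns the body as raw text instead of guessing at its fields.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://spoken.md"  # taken from the curl examples above


def build_request(path, api_key, params=None):
    """Build a GET request carrying the x-api-key header (hypothetical helper)."""
    url = BASE_URL + path
    if params:
        # urlencode turns spaces into '+', which yields q=huberman+sleep
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers={"x-api-key": api_key})


def fetch(request, timeout=30):
    """Send the request and return (headers, body text).

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers, response.read().decode("utf-8")


search_request = build_request("/search", "pt_demo", {"q": "huberman sleep"})
transcript_request = build_request("/transcripts/1000651996090", "pt_demo")


def demo():
    """Run both calls against the live API (needs network access)."""
    _, search_body = fetch(search_request)
    print(search_body[:500])
    headers, markdown = fetch(transcript_request)
    print("Credits remaining:", headers.get("X-Credits-Remaining"))
    print(markdown[:200])
```

Building the requests is kept separate from sending them so the URL and header can be inspected without touching the network. Call demo() to run both requests.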

Response is text/markdown with speaker names and credit info in headers:

HTTP/1.1 200 OK
Content-Type: text/markdown; charset=utf-8
X-Credits-Remaining: 99
X-Credits-Charged: 1

**Andrew Huberman** (0:00)
Welcome to the Huberman Lab podcast,
where we discuss science and science-based
tools for everyday life. Today my guest is
Dr. Matt Walker, professor of neuroscience
at UC Berkeley and author of Why We Sleep.

**Matt Walker** (0:45)
Thank you for having me, Andrew. Sleep is
one of those things where small changes to
your routine can have outsized effects on
both mental and physical health.

...
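
Because each turn starts with a bold speaker name and a timestamp, the Markdown body is easy to split into structured turns. The sketch below assumes the header layout shown in the sample response (`**Name** (m:ss)` on its own line). Support for h:mm:ss timestamps is a guess for longer episodes, and parse_turns is an illustrative name, not part of the API.

```python
import re

# Matches a turn header such as "**Andrew Huberman** (0:00)".
# Assumption: timestamps are m:ss or h:mm:ss; only m:ss appears in the sample.
TURN_HEADER = re.compile(
    r"^\*\*(?P<speaker>.+?)\*\* \((?P<time>\d+(?::\d{2}){1,2})\)\s*$"
)


def to_seconds(stamp):
    """Convert "m:ss" or "h:mm:ss" into a number of seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def parse_turns(markdown):
    """Return a list of {"speaker", "start", "text"} dicts, one per turn."""
    turns = []
    for line in markdown.splitlines():
        match = TURN_HEADER.match(line)
        if match:
            turns.append(
                {
                    "speaker": match["speaker"],
                    "start": to_seconds(match["time"]),
                    "text": "",
                }
            )
        elif turns and line.strip():
            # Re-join hard-wrapped lines into one paragraph per turn.
            turns[-1]["text"] = (turns[-1]["text"] + " " + line.strip()).strip()
    return turns


sample = """**Andrew Huberman** (0:00)
Welcome to the Huberman Lab podcast,
where we discuss science and science-based
tools for everyday life.

**Matt Walker** (0:45)
Thank you for having me, Andrew.
"""

turns = parse_turns(sample)
for turn in turns:
    print(turn["start"], turn["speaker"], "|", turn["text"])
```

Lines that appear before the first speaker header are ignored, which keeps the parser tolerant of any preamble the body might contain.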

Seen enough? Your key is ready in 30 seconds.

Agent integration

# Agent skill (Claude Code, Cursor, Windsurf, etc.)
npx skills add https://spoken.md

# OpenAPI spec — works with any agent framework
https://spoken.md/.well-known/openapi.json

# llms.txt — automatic LLM discovery
https://spoken.md/llms.txt

Pricing

Starter
$15 / 100

$0.15 per transcript

Volume
$160 / 2,000

$0.08 per transcript (save 47%)

For scripts & back-catalogues

Credits never expire. Errors are never charged. If it doesn't work, you don't pay.

Each transcript replaces ~$1 of transcription infrastructure — Deepgram runs $0.46/hr for audio of similar length, before adding speaker identification. No subscription, no overage charges, no annual commitment.

Already have a key? Top up here — or your agent can do it automatically via the API.

FAQ

What podcasts are supported?

spoken.md works with any podcast episode. Search by text or paste a URL from Spotify, YouTube, or any podcast app to find episodes.

How are speaker names detected?

Speaker names are detected automatically by analyzing the transcript for name mentions — no manual lookup table or post-processing required. When real names cannot be determined from context, labels like "Host" or "Guest" are used as fallbacks.

What format does the transcript come in?

Transcripts are returned as clean Markdown with speaker names in bold and timestamps per turn. A typical one-hour podcast episode produces 8,000–15,000 tokens — sized to fit in most LLM context windows in a single call. No proprietary markup, no timing artifacts, no post-processing needed.
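
If your model has a smaller window, or you want to pack several episodes into one prompt, you can split a transcript on speaker-turn boundaries so that no turn is cut in half. The sketch below is one way to do that. The token count is a rough heuristic of about four characters per token (an assumption, not a real tokenizer), and the function names are illustrative.

```python
import re

# A turn header looks like "**Host** (0:00)" on its own line.
TURN_START = re.compile(r"^\*\*.+?\*\* \(\d+(?::\d{2}){1,2}\)\s*$", re.MULTILINE)


def estimate_tokens(text):
    """Rough estimate: ~4 characters per token (heuristic, not a tokenizer)."""
    return max(1, len(text) // 4)


def split_turns(markdown):
    """Split the transcript into blocks that each start at a speaker header.

    Text before the first header is dropped.
    """
    starts = [m.start() for m in TURN_START.finditer(markdown)]
    if not starts:
        return [markdown.strip()] if markdown.strip() else []
    bounds = starts + [len(markdown)]
    return [markdown[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def chunk_by_turns(markdown, max_tokens):
    """Greedily pack whole turns into chunks of at most max_tokens (estimated).

    A single turn larger than the budget becomes its own oversized chunk.
    """
    chunks, current, used = [], [], 0
    for turn in split_turns(markdown):
        cost = estimate_tokens(turn)
        if current and used + cost > max_tokens:
            chunks.append("\n\n".join(current))
            current, used = [], 0
        current.append(turn)
        used += cost
    if current:
        chunks.append("\n\n".join(current))
    return chunks


sample = (
    "**Host** (0:00)\n" + "a" * 80 + "\n\n"
    "**Guest** (0:45)\n" + "b" * 80 + "\n\n"
    "**Host** (1:30)\n" + "c" * 80 + "\n"
)
chunks = chunk_by_turns(sample, max_tokens=50)
print(len(chunks), "chunks")
```

For real budgeting, swap estimate_tokens for the tokenizer that matches your model; the packing logic stays the same.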

Can I use this with my AI agent?

Yes. Install the agent skill with npx skills add https://spoken.md, use the OpenAPI spec for any agent framework, or make a plain HTTP call — anything that can hit an API works.

Is there a free trial?

Yes. Use the demo key pt_demo with any endpoint — no signup or payment needed. The demo key returns a full transcript for a sample episode so you can evaluate the format and quality before purchasing.

How do I get more transcripts?

Paste your API key in the top-up form above, or let your agent handle it — the API response includes a top-up link when credits run out.

How much does it cost?

Transcripts start at $0.15 each (100-pack for $15), with volume discounts: 500 for $50 ($0.10 each) or 2,000 for $160 ($0.08 each). Returning customers get lower top-up rates. Errors (404, 502) are never charged. No subscription — credits never expire.
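
The per-transcript rates and the 47% saving follow directly from the pack prices. A quick check, using only the three packs named in this answer (the lower top-up rates for returning customers are not listed on this page, so they are not modeled):

```python
# Pack size (credits) -> price in USD, as quoted in the answer above.
PACKS = {100: 15, 500: 50, 2000: 160}


def per_transcript(credits):
    """Price of one transcript when buying the given pack."""
    return PACKS[credits] / credits


def savings_vs_starter(credits):
    """Percent saved per transcript relative to the 100-pack, rounded."""
    base = per_transcript(100)
    return round((1 - per_transcript(credits) / base) * 100)


for credits in PACKS:
    print(
        f"{credits:>5} credits: ${per_transcript(credits):.2f} each, "
        f"{savings_vs_starter(credits)}% below the starter rate"
    )
```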

Popular podcasts

Browse transcripts from popular shows:

Example transcripts by podcast →

From $0.08/transcript. No subscription, no expiry. Try with pt_demo first.