Build a podcast summarizer in 20 lines of code

Fetch any episode as Markdown, drop it into an LLM, get a summary back. The whole project is two API calls and a prompt.

Last updated June 2026

The hard part of building a podcast summarizer used to be getting a clean transcript with speaker names. With Spoken, that step is one API call. You search by topic or paste a URL, fetch the transcript as Markdown, and pass it to your LLM of choice with a summary prompt. A typical one-hour episode fits in one context window — no chunking, no diarization, no audio handling.

The complete summarizer

import requests
from anthropic import Anthropic

API_KEY = "YOUR_SPOKEN_KEY"
client = Anthropic()

def summarize_podcast(query):
    # 1. Find the episode
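    # Pass the query via params so spaces and symbols are URL-encoded
    r = requests.get("https://spoken.md/search",
                     params={"q": query},
                     headers={"x-api-key": API_KEY},
                     timeout=30)
    r.raise_for_status()  # stop here on any HTTP error
    results = r.json()["results"]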
    if not results:
        return "No matching episodes found."
    episode = results[0]

    # 2. Fetch the transcript
    resp = requests.get(
        f"https://spoken.md/transcripts/{episode['id']}",
        headers={"x-api-key": API_KEY},
        timeout=30,
    )
    resp.raise_for_status()  # avoid summarizing an error response
    transcript = resp.text

    # 3. Summarize
    msg = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"Summarize this podcast episode in 5 bullet points, "
                       f"attributing key points to specific speakers:\n\n{transcript}"
        }],
    )
    return f"# {episode['title']}\n{episode['podcast']}\n\n{msg.content[0].text}"

print(summarize_podcast("huberman sleep"))

That's the whole thing. Real speaker names land in the prompt as **Andrew Huberman** (0:00), so the LLM can attribute claims correctly without prompt engineering tricks.
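
If you want the speaker names and timestamps in your own code as well as in the prompt, a small parser is enough. The sketch below assumes every turn starts a line with `**Name** (timestamp)`, as in the example above; the exact layout of a real transcript may differ, `speaker_turns` is a hypothetical helper rather than part of the Spoken API, and the sample text is made up for illustration.

```python
import re

# Assumed turn header: **Speaker Name** (H:MM) or (H:MM:SS) at the start of a line.
TURN_RE = re.compile(
    r"^\*\*(?P<speaker>[^*]+)\*\*\s*\((?P<ts>\d+:\d{2}(?::\d{2})?)\)",
    re.MULTILINE,
)

def speaker_turns(markdown):
    """Return (speaker, timestamp) pairs in order of appearance."""
    return [(m["speaker"].strip(), m["ts"]) for m in TURN_RE.finditer(markdown)]

# Illustrative sample only, not real transcript text.
sample = (
    "**Andrew Huberman** (0:00) Welcome to the podcast.\n"
    "**Matt Walker** (0:42) Thanks for having me.\n"
    "**Andrew Huberman** (1:02:15) Let's wrap up.\n"
)
print(speaker_turns(sample))
```

The timestamps are kept as strings so they can be quoted back exactly as they appear in the transcript.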

What you'd otherwise be building

The same summarizer without Spoken needs:

A podcast search index (or a separate API for episode metadata)
An audio download step (locate .mp3, fetch 50–100 MB)
A transcription service (Whisper API or self-hosted)
A diarization step (Pyannote, or use a service like AssemblyAI that bundles it)
A speaker-naming pass (the diarization output is anonymous)
Storage and caching so you don't re-transcribe the same episode twice

For a side project, that's a weekend of plumbing. For a production product, it's ongoing infrastructure to maintain.

Output examples

Bullet-point summary

Prompt: "Summarize this podcast in 5 bullets, attributing claims to speakers."

Output:
• **Andrew Huberman** opens by framing the episode around the
  neuroscience of sleep architecture and its impact on next-day cognition.
• **Matt Walker** explains that REM and deep sleep play distinct roles —
  deep sleep consolidates declarative memory; REM consolidates procedural.
• They discuss the 90-minute ultradian cycle and why waking inside a
  cycle (vs at the end of one) produces worse subjective grogginess.
• Walker recommends consistency of sleep timing over total hours as
  the single most impactful intervention for most adults.
• Closing segment covers caffeine's 5–6 hour half-life and Walker's
  suggestion to cut off intake by 2 PM.

Other useful prompts on top of the same Markdown

Show notes: "Extract a list of books, papers, and people mentioned, with timestamps."
Quote extraction: "Pull 5 quotable lines suitable for social media, attributed to each speaker."
Disagreement detection: "Identify points where the speakers disagreed and what each argued."
Topical chapters: "Break the episode into chapters with timestamp ranges and one-line descriptions."
Q&A extraction: "List every question one speaker asked the other, and the substance of the answer."
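
Since only the prompt changes between these tasks, one way to organize them is a lookup table keyed by task. This is a minimal sketch: the task keys and the `build_messages` helper are hypothetical names, and the returned list is shaped to be passed as `messages` to the `client.messages.create()` call shown earlier.

```python
# Hypothetical task keys mapped to the prompts listed above.
PROMPTS = {
    "show_notes": "Extract a list of books, papers, and people mentioned, with timestamps.",
    "quotes": "Pull 5 quotable lines suitable for social media, attributed to each speaker.",
    "disagreements": "Identify points where the speakers disagreed and what each argued.",
    "chapters": "Break the episode into chapters with timestamp ranges and one-line descriptions.",
    "qa": "List every question one speaker asked the other, and the substance of the answer.",
}

def build_messages(task, transcript):
    """Build a single-turn messages list: the task prompt followed by the transcript."""
    if task not in PROMPTS:
        raise ValueError(f"unknown task: {task!r}")
    return [{"role": "user", "content": f"{PROMPTS[task]}\n\n{transcript}"}]
```

Usage would look like `client.messages.create(model=..., max_tokens=1024, messages=build_messages("chapters", transcript))`.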

Cost per summary

A one-hour podcast is roughly 8,000–15,000 tokens. With Spoken at the 500-pack rate and Claude Haiku as the summarizer:

Transcript fetch: $0.10 (Spoken)
Summary generation: ~$0.01–$0.02 (Claude Haiku, depending on output length)
Total per episode: ~$0.11–$0.12

At the 2,000-pack rate the transcript fetch drops to $0.08, so the all-in cost is around $0.09–$0.10 per episode summarized.
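
The arithmetic above can be sketched as a small helper for budgeting a batch. It uses only the figures quoted in this section, works in whole cents to avoid floating-point rounding, and treats the 1 to 2 cent summary range as an estimate; `cost_range_cents` is a hypothetical name.

```python
def cost_range_cents(fetch_cents, episodes=1, summary_cents=(1, 2)):
    """Return the (low, high) total cost in cents for a number of episodes."""
    low, high = summary_cents
    return (episodes * (fetch_cents + low), episodes * (fetch_cents + high))

print(cost_range_cents(10))               # 500-pack rate, one episode
print(cost_range_cents(8))                # 2,000-pack rate, one episode
print(cost_range_cents(8, episodes=100))  # 2,000-pack rate, 100 episodes
```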

FAQ

Does the transcript fit in one LLM context window?

Yes, for almost every episode. A one-hour podcast is roughly 8,000–15,000 tokens, so models like Claude Sonnet/Opus (200K context), GPT-4o (128K context), and Gemini (1M context) all fit a full episode comfortably, with room for the prompt and output.
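
For unusually long episodes, a rough pre-flight check can tell you whether a transcript is likely to fit. This sketch assumes about 4 characters per token, which is a common rule of thumb for English text and not an exact count; real tokenizers vary by model. Both function names and the default reserve are hypothetical.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate from character count (heuristic, not a tokenizer)."""
    return int(len(text) / chars_per_token) + 1

def fits_in_context(text, context_window, reserved=2_000):
    """True if the estimate plus a reserve for prompt and output fits the window."""
    return estimate_tokens(text) + reserved <= context_window

transcript = "a" * 60_000  # stand-in for roughly 15,000 tokens of transcript
print(fits_in_context(transcript, 128_000))
print(fits_in_context(transcript, 16_000))
```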

Which LLM should I use?

For straight summarization, Claude Haiku and GPT-4o-mini both produce strong results at low cost. For nuanced extraction or multi-step reasoning, step up to Claude Sonnet or GPT-4o.

Can I summarize a whole podcast back-catalogue?

Yes. Use /search with the podcast name to enumerate episodes, then loop through and fetch each. Repeat fetches are free, so re-running with a different prompt later doesn't cost more.
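
The loop can be sketched as below. To keep the example self-contained, the network and LLM calls are passed in as functions: `fetch_transcript` and `summarize` are hypothetical callables you would write around the transcript request and the Claude call shown earlier. The local cache is optional given that repeat fetches are free; it only saves round-trips when you re-run with a new prompt. Whether /search paginates long catalogues is not covered here, so treat `episodes` as whatever list you have collected.

```python
def summarize_catalogue(episodes, fetch_transcript, summarize, cache=None):
    """Summarize each episode once, reusing cached transcripts where available.

    episodes: iterable of dicts with an "id" key (e.g. /search results)
    fetch_transcript: callable taking an episode id, returning Markdown text
    summarize: callable taking Markdown text, returning a summary string
    cache: optional dict mapping episode id -> transcript, reused across runs
    """
    cache = {} if cache is None else cache
    summaries = {}
    for ep in episodes:
        ep_id = ep["id"]
        if ep_id not in cache:
            cache[ep_id] = fetch_transcript(ep_id)
        summaries[ep_id] = summarize(cache[ep_id])
    return summaries
```

Passing the same `cache` dict into a second call with a different `summarize` function reuses every transcript without touching the network.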

What format are the timestamps in?

Per speaker turn, in (H:MM) or (H:MM:SS) form. The LLM can quote them directly when citing specific moments.

Is there a free way to try this?

Yes. Use the demo key pt_demo with any endpoint — no signup needed. The demo key returns a full transcript so you can run the summarizer end-to-end before purchasing credits.
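
One convenient pattern is to fall back to the demo key when no key of your own is configured. In this sketch, `SPOKEN_API_KEY` is a hypothetical environment variable name and `spoken_headers` a hypothetical helper; the `x-api-key` header matches the requests shown earlier.

```python
import os

def spoken_headers(env=os.environ):
    """Build request headers, using the demo key if no key is configured."""
    key = env.get("SPOKEN_API_KEY", "pt_demo")
    return {"x-api-key": key}
```

You would then pass `headers=spoken_headers()` to each request in place of the hard-coded `API_KEY`.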

TL;DR: A working podcast summarizer is now a two-API-call project. Spoken returns clean Markdown with real speaker names; pass it to any LLM with a summary prompt. ~$0.10 per episode end-to-end.

Try the summarizer with no signup — use API key pt_demo on any endpoint.

$0.10 per transcript. Credits never expire.