Skip to content

Overview

The Data Packet is a production-ready, AI-powered podcast generator. It scrapes tech news, writes a two-host dialogue script using Claude, synthesizes professional audio with Google Cloud TTS, and uploads the finished episode to S3 — all from a single command.

Version: 2.0.0  |  License: MIT  |  Author: TheWinterShadow


What problem does it solve?

Creating a regular tech podcast manually involves:

  • Monitoring multiple news sources for relevant articles
  • Writing a coherent, natural-sounding script for two hosts
  • Recording, editing, and mixing audio
  • Generating an RSS feed and uploading everything for distribution

The Data Packet automates every step. It produces a complete, ready-to-publish episode — script, audio file, and RSS feed — without manual intervention.


The full pipeline

flowchart LR
    A["Article Sources\nWired · TechCrunch"] -->|RSS fetch + parse| B
    B["MongoDB Check\nDeduplication"] -->|fresh articles| C
    C["Script Generator\nAnthropic Claude"] -->|dialogue script| D
    D["Audio Generator\nGoogle Cloud TTS"] -->|synthesized audio| E
    E["RSS Generator"] -->|episode files| F
    F["S3 Upload\nAWS S3"]

Technology stack

  • Script generation


    Anthropic Claude writes the dialogue. Articles are fed as context; Claude produces a natural two-host conversation covering each story.

  • Audio synthesis


    Google Cloud TTS Long Audio Synthesis generates the final .wav. Studio voices at 44.1 kHz — no per-segment timeouts, no quality limits.

  • Intermediate storage


    Google Cloud Storage (GCS) holds synthesized audio during processing. A 30-day lifecycle policy keeps costs low automatically.

  • Podcast hosting


    AWS S3 stores the finished episode files with public read access, making them immediately distributable via any podcast app.

  • Episode tracking


    MongoDB (optional) tracks which articles have been used, preventing repetition across episodes.

  • Observability


    JSONL structured logging with optional S3 archival and Grafana Loki forwarding for centralised log aggregation.


What you get

After running the pipeline, your output/ directory contains:

output/
├── episode_script.txt    # Full two-host dialogue (Alex + Sam)
├── episode.wav           # Professional stereo audio, 44.1 kHz
└── feed.xml              # RSS 2.0 podcast feed (if S3 is configured)

If S3 is configured, all three files are uploaded and the public URLs are printed to the console — ready to add to Apple Podcasts, Spotify, or any podcast directory.


Key design decisions

Docker-first

The primary deployment target is Docker. A pre-built multi-platform image (amd64 + arm64) is published to GitHub Container Registry on every release. You don't need Python or any system dependencies installed locally.

Modular architecture

Each pipeline stage is an independent, swappable module. Sources, generators, and utilities follow consistent interfaces, making it straightforward to add new article sources or swap the audio backend.

Environment-variable configuration

All secrets and tunables are configured via environment variables. Nothing is hardcoded. A .env file or Docker -e flags are the primary configuration mechanism.

Retry-first reliability

Every external API call (Claude, Google Cloud TTS, S3) uses exponential backoff via tenacity. Transient failures are handled transparently without crashing the pipeline.

No ElevenLabs dependency (v2.0 migration)

Version 2.0 migrated entirely to Google Cloud TTS Long Audio Synthesis. This removes per-character pricing caps, eliminates timeout issues on long scripts, and provides Studio-quality voices at 44.1 kHz.


Next steps