Generation Module

The generation module handles AI-powered script generation, audio production, and RSS feed generation.

Script Generation

Script generation using Anthropic Claude.

class the_data_packet.generation.script.ScriptGenerator(api_key: str | None = None)

Bases: object

Generates podcast scripts from articles using Claude AI.

__init__(api_key: str | None = None)

Initialize the script generator.

Parameters:

api_key – Anthropic API key. If None, the key is taken from the configuration.

generate_script(articles: List[Article]) → str

Generate a complete podcast script from articles.

Parameters:

articles – List of articles to convert to script

Returns:

Complete podcast script

Raises:

AIGenerationError – If script generation fails
ValidationError – If no valid articles are provided

Audio Generation

Audio generation using Google Cloud Text-to-Speech Long Audio Synthesis.

class the_data_packet.generation.audio.AudioResult(output_file: Path, duration_seconds: float | None = None, file_size_bytes: int | None = None, generation_time_seconds: float | None = None)

Bases: object

Result of audio generation.

output_file: Path

duration_seconds: float | None = None

file_size_bytes: int | None = None

generation_time_seconds: float | None = None

__init__(output_file: Path, duration_seconds: float | None = None, file_size_bytes: int | None = None, generation_time_seconds: float | None = None) → None

class the_data_packet.generation.audio.AudioGenerator(credentials_path: str | None = None, male_voice: str | None = None, female_voice: str | None = None, gcs_bucket_name: str | None = None)

Bases: object

Generates podcast audio from scripts using Google Cloud Text-to-Speech Long Audio Synthesis.

AVAILABLE_VOICES = {'female': ['en-US-Studio-O'], 'male': ['en-US-Studio-Q']}

AUDIO_CONFIG = {'audio_encoding': AudioEncoding.LINEAR16, 'effects_profile_id': ['telephony-class-application'], 'sample_rate_hertz': 44100}

__init__(credentials_path: str | None = None, male_voice: str | None = None, female_voice: str | None = None, gcs_bucket_name: str | None = None)

Initialize the audio generator.

Parameters:

credentials_path – Path to Google Cloud service account JSON credentials
male_voice – Voice name for first speaker (Alex)
female_voice – Voice name for second speaker (Sam)
gcs_bucket_name – Google Cloud Storage bucket for audio output

generate_audio(script: str, output_file: Path | None = None) → AudioResult

Generate audio from a podcast script, automatically handling chunking and MP3 output.

get_available_voices() → Dict[str, List[str]]

Get available Studio Multi-speaker voices for Google Cloud TTS.

test_authentication() → bool

Test Google Cloud TTS authentication and basic functionality.

split_text_by_bytes(text: str, max_bytes: int = 4000) → List[str]

Split text into chunks under max_bytes, preserving words.
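The byte-based splitting described for split_text_by_bytes can be sketched as follows. This is a minimal standalone illustration, not the library's implementation: it assumes sizes are measured as UTF-8 bytes, that a chunk may be exactly max_bytes long, and that a single word longer than max_bytes is emitted as its own oversized chunk.

```python
from typing import List


def split_text_by_bytes(text: str, max_bytes: int = 4000) -> List[str]:
    """Split text into chunks whose UTF-8 encoding fits within max_bytes.

    Chunks break on whitespace so words stay intact. Assumption of this
    sketch: a single word longer than max_bytes becomes its own chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_bytes = 0

    for word in text.split():
        word_bytes = len(word.encode("utf-8"))
        # One extra byte for the joining space when the chunk is non-empty.
        extra = word_bytes + (1 if current else 0)
        if current and current_bytes + extra > max_bytes:
            # Adding this word would overflow the chunk, so start a new one.
            chunks.append(" ".join(current))
            current = [word]
            current_bytes = word_bytes
        else:
            current.append(word)
            current_bytes += extra

    if current:
        chunks.append(" ".join(current))
    return chunks
```

Measuring encoded bytes instead of characters matters for non-ASCII text, where one character can occupy several bytes.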

generate_audio_chunked(script: str, output_file: Path | None = None) → AudioResult

Generate audio for long scripts by splitting into chunks and merging the results into a single MP3 file.
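How the chunks are merged is not documented here. As one possible illustration of the merge step, the sketch below concatenates WAV data with the standard-library wave module. It assumes each synthesized chunk arrives as a complete WAV file with identical channel count, sample width, and sample rate; merge_wav_chunks is a hypothetical helper name, not part of the package.

```python
import io
import wave
from typing import List


def merge_wav_chunks(chunks: List[bytes]) -> bytes:
    """Concatenate WAV byte strings that share the same audio format."""
    if not chunks:
        raise ValueError("no audio chunks to merge")

    output = io.BytesIO()
    expected = None
    with wave.open(output, "wb") as merged:
        for chunk in chunks:
            with wave.open(io.BytesIO(chunk), "rb") as part:
                params = (
                    part.getnchannels(),
                    part.getsampwidth(),
                    part.getframerate(),
                )
                if expected is None:
                    # The first chunk defines the format of the merged file.
                    expected = params
                    merged.setnchannels(params[0])
                    merged.setsampwidth(params[1])
                    merged.setframerate(params[2])
                elif params != expected:
                    raise ValueError("audio chunks use different formats")
                merged.writeframes(part.readframes(part.getnframes()))
    return output.getvalue()
```

The merged WAV data could then be written to disk and passed to a conversion step such as convert_wav_to_mp3.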

convert_wav_to_mp3(wav_path: Path, mp3_path: Path) → None

Convert a WAV file to MP3 using pydub.

RSS Generation

RSS feed generation and management for podcast episodes.

class the_data_packet.generation.rss.PodcastEpisode(title: str, description: str, audio_url: str, pub_date: datetime, episode_number: int | None = None, duration: str | None = None, file_size: int | None = None, guid: str | None = None, author: str | None = None)

Bases: object

Represents a podcast episode for RSS feed.

title: str

description: str

audio_url: str

pub_date: datetime

episode_number: int | None = None

duration: str | None = None

file_size: int | None = None

guid: str | None = None

author: str | None = None

__post_init__() → None

Generate GUID if not provided.

__init__(title: str, description: str, audio_url: str, pub_date: datetime, episode_number: int | None = None, duration: str | None = None, file_size: int | None = None, guid: str | None = None, author: str | None = None) → None
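The __post_init__ hook above fills in a GUID when none is supplied. The sketch below shows that general pattern on a reduced, hypothetical dataclass (EpisodeSketch). How PodcastEpisode actually derives its GUID is not documented here; hashing the audio URL together with the publication date is an assumption made purely for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class EpisodeSketch:
    """Hypothetical stand-in for PodcastEpisode with only the fields used here."""

    title: str
    audio_url: str
    pub_date: datetime
    guid: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: derive a stable GUID from the audio URL and publication date.
        if self.guid is None:
            seed = f"{self.audio_url}|{self.pub_date.isoformat()}"
            self.guid = hashlib.sha256(seed.encode("utf-8")).hexdigest()


episode = EpisodeSketch(
    title="Episode 1",
    audio_url="https://example.com/ep1.mp3",
    pub_date=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(len(episode.guid))  # 64 hex characters from SHA-256
```

A deterministic GUID keeps the identifier stable when the same episode is regenerated, which helps podcast clients avoid treating it as new.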

class the_data_packet.generation.rss.RSSGenerationResult(success: bool = False, rss_content: str | None = None, local_path: Path | None = None, s3_url: str | None = None, error_message: str | None = None)

Bases: object

Result of RSS feed generation.

success: bool = False

rss_content: str | None = None

local_path: Path | None = None

s3_url: str | None = None

error_message: str | None = None

__init__(success: bool = False, rss_content: str | None = None, local_path: Path | None = None, s3_url: str | None = None, error_message: str | None = None) → None

class the_data_packet.generation.rss.RSSGenerator(config: Config | None = None)

Bases: object

Generates and manages RSS feeds for podcast episodes.

__init__(config: Config | None = None) → None

Initialize the RSS generator.

s3_storage: S3Storage | None

generate_episode_from_articles(articles: List[Article], audio_url: str, episode_number: int | None = None, duration: str | None = None, file_size: int | None = None, existing_episodes: List[PodcastEpisode] | None = None) → PodcastEpisode

Generate a podcast episode from articles.

generate_rss_feed(episodes: List[PodcastEpisode], channel_title: str | None = None, channel_description: str | None = None, channel_link: str | None = None, channel_image_url: str | None = None, channel_email: str | None = None) → str

Generate complete RSS feed XML.
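To illustrate the kind of XML that generate_rss_feed returns, the sketch below builds a small podcast feed with the standard-library xml.etree.ElementTree. FeedItem and build_rss are hypothetical names, and the set of tags emitted (including the itunes:duration element) is an assumption; the real method may emit more channel metadata, such as the image and email.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import List, Optional

ITUNES_NS = "http://www.itunes.com/dtds/podcast-1.0.dtd"


@dataclass
class FeedItem:
    """Hypothetical stand-in for PodcastEpisode."""

    title: str
    description: str
    audio_url: str
    pub_date: datetime
    guid: str
    duration: Optional[str] = None
    file_size: Optional[int] = None


def build_rss(channel_title: str, channel_link: str,
              channel_description: str, episodes: List[FeedItem]) -> str:
    """Return an RSS 2.0 document (without XML declaration) as a string."""
    ET.register_namespace("itunes", ITUNES_NS)
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_description

    for episode in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = episode.title
        ET.SubElement(item, "description").text = episode.description
        ET.SubElement(item, "guid", {"isPermaLink": "false"}).text = episode.guid
        # RSS 2.0 expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(episode.pub_date)
        ET.SubElement(item, "enclosure", {
            "url": episode.audio_url,
            "length": str(episode.file_size or 0),
            "type": "audio/mpeg",
        })
        if episode.duration:
            ET.SubElement(item, f"{{{ITUNES_NS}}}duration").text = episode.duration

    return ET.tostring(rss, encoding="unicode")


feed_xml = build_rss(
    "Example Podcast",
    "https://example.com",
    "An example feed",
    [FeedItem(
        title="Episode 1",
        description="First episode",
        audio_url="https://example.com/ep1.mp3",
        pub_date=datetime(2024, 1, 1, tzinfo=timezone.utc),
        guid="ep-1",
        duration="12:34",
        file_size=1024,
    )],
)
```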

load_existing_feed(rss_content: str) → List[PodcastEpisode]

Parse existing RSS feed and extract episodes.
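Parsing an existing feed, as load_existing_feed does, can be sketched with xml.etree.ElementTree. This is an illustration under stated assumptions: parse_feed_items is a hypothetical function that returns plain dictionaries instead of PodcastEpisode objects, and it reads only a few standard RSS 2.0 item fields.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from typing import Dict, List


def parse_feed_items(rss_content: str) -> List[Dict[str, object]]:
    """Extract basic episode fields from each <item> of an RSS 2.0 feed."""
    root = ET.fromstring(rss_content)
    items: List[Dict[str, object]] = []
    for item in root.iterfind("./channel/item"):
        enclosure = item.find("enclosure")
        pub_date = item.findtext("pubDate")
        items.append({
            "title": item.findtext("title", default=""),
            "description": item.findtext("description", default=""),
            # The audio URL lives in the enclosure's "url" attribute.
            "audio_url": enclosure.get("url") if enclosure is not None else None,
            # Convert the RFC 822 style date into a datetime when present.
            "pub_date": parsedate_to_datetime(pub_date) if pub_date else None,
            "guid": item.findtext("guid"),
        })
    return items
```

Missing optional elements yield None or an empty string here, so a feed with incomplete items still parses without raising.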

update_rss_feed(new_episode: PodcastEpisode) → RSSGenerationResult

Update RSS feed with new episode.