podcast-generation
AI podcast audio with Azure OpenAI Realtime API
Developer Setup
Setup & Installation
npx skills add https://github.com/microsoft/skills --skill podcast-generation
Overview
What This Skill Does
Connects a React frontend to a Python FastAPI backend over WebSocket to generate spoken audio from text using Azure OpenAI's GPT Realtime Mini model. Takes a text prompt, streams PCM audio chunks, converts them to WAV, and returns base64-encoded audio for browser playback. Includes transcript output alongside the audio.
Application
When to use this Skill
- Configuring integration settings for custom agent workflows.
- Optimizing query execution and response latency in production.
- Developing clean, standard-compliant implementations for enterprise services.
- Troubleshooting connection timeouts and authentication handshakes.
- Monitoring API rate limits and execution pipelines programmatically.
Documentation
Podcast Generation with GPT Realtime Mini
Generate real audio narratives from text content using Azure OpenAI's Realtime API.
Quick Start
- Configure environment variables for the Realtime API
- Connect via WebSocket to the Azure OpenAI Realtime endpoint
- Send a text prompt, then collect the PCM audio chunks and the transcript
- Convert the PCM audio to WAV format
- Return the base64-encoded audio to the frontend for playback
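The last two steps above (PCM to WAV, then base64) can be sketched with the Python standard library alone. This is a minimal illustration, not the skill's own implementation: the helper names `pcm_to_wav` and `wav_to_base64` are hypothetical, and the audio format (16-bit mono PCM at 24 kHz) is an assumption that should be checked against the deployment's actual output format.

```python
import base64
import io
import wave

# Assumed audio format: 16-bit mono PCM at 24 kHz. Confirm against your deployment.
SAMPLE_RATE_HZ = 24_000
SAMPLE_WIDTH_BYTES = 2
CHANNELS = 1


def pcm_to_wav(pcm: bytes, sample_rate: int = SAMPLE_RATE_HZ) -> bytes:
    """Wrap raw PCM samples in a WAV container by adding the RIFF header."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


def wav_to_base64(wav_bytes: bytes) -> str:
    """Encode WAV bytes as an ASCII base64 string suitable for a JSON response."""
    return base64.b64encode(wav_bytes).decode("ascii")


# Example: two 50 ms chunks of silence standing in for streamed PCM chunks.
chunks = [b"\x00\x00" * 1200, b"\x00\x00" * 1200]
wav_bytes = pcm_to_wav(b"".join(chunks))
audio_b64 = wav_to_base64(wav_bytes)
```

On the frontend, a string like `audio_b64` can be decoded back to bytes and played as `audio/wav`. Raw PCM has no header, so browsers generally cannot play it until it is wrapped this way.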
Environment Configuration
AZURE_OPENAI_AUDIO_API_KEY=your_realtime_api_key
AZURE_OPENAI_AUDIO_ENDPOINT=https://your-resource.cognitiveservices.azure.com
AZURE_OPENAI_AUDIO_DEPLOYMENT=gpt-realtime-mini
Note: The endpoint should NOT include /openai/v1/. Use only the base URL.
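One way to catch a misconfigured endpoint early is to validate these variables at startup. The sketch below is illustrative only: `AudioConfig` and `load_audio_config` are hypothetical names, not part of the skill, and rejecting any URL path is a stricter reading of the note above than the skill itself states.

```python
import os
from dataclasses import dataclass
from typing import Mapping
from urllib.parse import urlsplit

REQUIRED_VARS = (
    "AZURE_OPENAI_AUDIO_API_KEY",
    "AZURE_OPENAI_AUDIO_ENDPOINT",
    "AZURE_OPENAI_AUDIO_DEPLOYMENT",
)


@dataclass(frozen=True)
class AudioConfig:
    api_key: str
    endpoint: str
    deployment: str


def load_audio_config(env: Mapping[str, str] = os.environ) -> AudioConfig:
    """Read the three settings and reject endpoints that carry a URL path."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    endpoint = env["AZURE_OPENAI_AUDIO_ENDPOINT"].rstrip("/")
    # Assumption: the base URL has no path, so anything like /openai/v1 is an error.
    if urlsplit(endpoint).path:
        raise ValueError(
            "AZURE_OPENAI_AUDIO_ENDPOINT must be the base URL only, "
            "without a path such as /openai/v1/"
        )

    return AudioConfig(
        api_key=env["AZURE_OPENAI_AUDIO_API_KEY"],
        endpoint=endpoint,
        deployment=env["AZURE_OPENAI_AUDIO_DEPLOYMENT"],
    )
```

Calling `load_audio_config()` with no arguments reads from the process environment. Passing a plain dict instead makes the check easy to unit test.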
Core Workflow
Backend Audio Generation
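As a rough sketch of the collection step on the backend, the function below folds a stream of already-parsed server events into PCM bytes and a transcript. It is not the skill's code. The function name `collect_audio` is hypothetical, and the event type names and the base64 `delta` field are assumptions based on Realtime API naming, so they should be verified against the API version in use. The WebSocket connection itself is left out.

```python
import base64
from typing import Any, Iterable, Mapping, Tuple

# Assumed event names; check the Realtime API reference for the version you deploy.
AUDIO_DELTA_TYPES = {"response.audio.delta", "response.output_audio.delta"}
TRANSCRIPT_DELTA_TYPES = {
    "response.audio_transcript.delta",
    "response.output_audio_transcript.delta",
}
DONE_TYPE = "response.done"


def collect_audio(events: Iterable[Mapping[str, Any]]) -> Tuple[bytes, str]:
    """Accumulate PCM bytes and transcript text until the response completes."""
    pcm = bytearray()
    transcript_parts = []
    for event in events:
        event_type = event.get("type")
        if event_type in AUDIO_DELTA_TYPES:
            # Audio arrives as base64 text; decode each chunk to raw PCM bytes.
            pcm.extend(base64.b64decode(event["delta"]))
        elif event_type in TRANSCRIPT_DELTA_TYPES:
            transcript_parts.append(event["delta"])
        elif event_type == DONE_TYPE:
            break
    return bytes(pcm), "".join(transcript_parts)


# Example with hand-built events standing in for parsed WebSocket messages.
sample_events = [
    {"type": "response.output_audio.delta", "delta": base64.b64encode(b"\x01\x00").decode()},
    {"type": "response.output_audio_transcript.delta", "delta": "Hello, "},
    {"type": "response.output_audio.delta", "delta": base64.b64encode(b"\x02\x00").decode()},
    {"type": "response.output_audio_transcript.delta", "delta": "listeners."},
    {"type": "response.done"},
]
pcm_bytes, transcript = collect_audio(sample_events)
```

Keeping the accumulation separate from the socket handling means it can be tested without network access. The resulting `pcm_bytes` would then go through the PCM-to-WAV step from the Quick Start.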