Technical & Development · Intermediate

podcast-generation

AI podcast audio with Azure OpenAI Realtime API

Developer Setup

Setup & Installation

npx skills add https://github.com/microsoft/skills --skill podcast-generation

Or paste this URL into your assistant to install:

https://github.com/microsoft/skills/tree/main/.github/plugins/azure-skills/skills/podcast-generation

Overview

What This Skill Does

Connects a React frontend to a Python FastAPI backend over WebSocket to generate spoken audio from text using Azure OpenAI's GPT Realtime Mini model. Takes a text prompt, streams PCM audio chunks, converts them to WAV, and returns base64-encoded audio for browser playback. Includes transcript output alongside the audio.

Application

When to use this Skill

Configuring integration settings for custom agent workflows.
Optimizing query execution and response latency in production.
Developing clean, standard-compliant implementations for enterprise services.
Troubleshooting connection timeouts and authentication handshakes.
Monitoring API rate limits and execution pipelines programmatically.

Documentation

Podcast Generation with GPT Realtime Mini

Generate real audio narratives from text content using Azure OpenAI's Realtime API.

Quick Start

Configure environment variables for Realtime API
Connect via WebSocket to Azure OpenAI Realtime endpoint
Send text prompt, collect PCM audio chunks + transcript
Convert PCM to WAV format
Return base64-encoded audio to frontend for playback
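
The last two steps above (PCM to WAV, then base64 for the browser) can be sketched with the Python standard library alone. This is a minimal sketch, not the skill's actual backend code: the helper names `pcm_to_wav` and `chunks_to_base64_wav` are hypothetical, and the audio format (16-bit mono PCM at 24 kHz) is an assumption that should be checked against your deployment's output settings.

```python
import base64
import io
import wave

# ASSUMPTION: the streamed audio is 16-bit mono PCM at 24 kHz.
# Verify these values against your Realtime deployment before relying on them.
SAMPLE_RATE_HZ = 24_000
SAMPLE_WIDTH_BYTES = 2
CHANNELS = 1


def pcm_to_wav(pcm: bytes, sample_rate: int = SAMPLE_RATE_HZ) -> bytes:
    """Wrap raw PCM samples in a WAV container (hypothetical helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    # wave leaves a caller-supplied buffer open, so it can still be read here.
    return buffer.getvalue()


def chunks_to_base64_wav(chunks: list[bytes]) -> str:
    """Join collected PCM chunks and return base64-encoded WAV text."""
    pcm = b"".join(chunks)
    return base64.b64encode(pcm_to_wav(pcm)).decode("ascii")
```

On the frontend, the returned string can be decoded back into bytes or placed in a `data:audio/wav;base64,...` URL for playback. Buffering all chunks before conversion keeps the sketch simple; a long narrative may call for writing frames incrementally instead.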

Environment Configuration

AZURE_OPENAI_AUDIO_API_KEY=your_realtime_api_key
AZURE_OPENAI_AUDIO_ENDPOINT=https://your-resource.cognitiveservices.azure.com
AZURE_OPENAI_AUDIO_DEPLOYMENT=gpt-realtime-mini

Note: The endpoint must NOT include the /openai/v1/ path. Set it to the base URL only.
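
One way to guard against the endpoint mistake described in the note is to validate the three variables at startup. The sketch below is illustrative only: `RealtimeSettings`, `normalize_endpoint`, and `load_settings` are hypothetical names, not part of the skill, and stripping the suffix silently (rather than raising an error) is a design assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_VARS = (
    "AZURE_OPENAI_AUDIO_API_KEY",
    "AZURE_OPENAI_AUDIO_ENDPOINT",
    "AZURE_OPENAI_AUDIO_DEPLOYMENT",
)


@dataclass(frozen=True)
class RealtimeSettings:
    """Hypothetical container for the three environment values."""
    api_key: str
    endpoint: str
    deployment: str


def normalize_endpoint(raw: str) -> str:
    """Reduce the endpoint to its base URL by dropping a trailing /openai/v1."""
    endpoint = raw.strip().rstrip("/")
    suffix = "/openai/v1"
    if endpoint.endswith(suffix):
        endpoint = endpoint[: -len(suffix)]
    return endpoint


def load_settings(env: Optional[Mapping[str, str]] = None) -> RealtimeSettings:
    """Read and validate the settings, failing fast if any value is missing."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return RealtimeSettings(
        api_key=source["AZURE_OPENAI_AUDIO_API_KEY"],
        endpoint=normalize_endpoint(source["AZURE_OPENAI_AUDIO_ENDPOINT"]),
        deployment=source["AZURE_OPENAI_AUDIO_DEPLOYMENT"],
    )
```

Calling `load_settings()` once at application startup surfaces a missing key immediately instead of at the first WebSocket connection attempt. Passing a plain dict as `env` makes the function easy to exercise in tests.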

Core Workflow

Backend Audio Generation
