Creative & Design · Intermediate
speech
Developer Setup
Setup & Installation
bash
npx skills add https://github.com/openai/skills --skill speech
Or paste this URL into your assistant to install:
Overview
What This Skill Does
Generate spoken audio from text using OpenAI's API with built-in voices
Application
When to use this Skill
- Integrating speech into your development workflow.
- Following best practices for generating spoken audio from text using OpenAI's API with built-in voices.
- Automating repetitive tasks with AI-assisted tooling.
- Building production-grade applications with proper standards.
- Debugging and troubleshooting common implementation issues.
Documentation
Speech Generation Skill
Generate spoken audio for the current project (narration, product demo voiceover, IVR prompts, accessibility reads). Defaults to gpt-4o-mini-tts-2025-12-15 and built-in voices, and prefers the bundled CLI for deterministic, reproducible runs.
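For orientation, a direct API call using those defaults might look like the sketch below. It assumes the official openai Python package and an OPENAI_API_KEY in the environment. The helper names build_request and synthesize, the voice "alloy", and the mp3 format are illustrative choices, not part of this skill, and the bundled CLI may work differently internally.

```python
from pathlib import Path

DEFAULT_MODEL = "gpt-4o-mini-tts-2025-12-15"  # default model named by this skill


def build_request(text, voice="alloy", instructions=None, response_format="mp3"):
    """Assemble keyword arguments for a speech request (hypothetical helper)."""
    request = {
        "model": DEFAULT_MODEL,
        "voice": voice,   # assumed: one of the built-in voices
        "input": text,    # the text is passed through verbatim
        "response_format": response_format,
    }
    if instructions:
        request["instructions"] = instructions  # delivery style notes
    return request


def synthesize(text, out_path, **options):
    """Send the request and stream the audio to out_path."""
    from openai import OpenAI  # lazy import; needs `pip install openai`

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with client.audio.speech.with_streaming_response.create(
        **build_request(text, **options)
    ) as response:
        response.stream_to_file(out_path)
    return out_path
```

Keeping the request assembly separate from the network call makes the defaults easy to inspect without spending API credits.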
When to use
- Generate a single spoken clip from text
- Generate a batch of prompts (many lines, many files)
Decision tree (single vs batch)
- If the user provides multiple lines/prompts or wants many outputs -> batch
- Else -> single
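The decision tree is small enough to express as one function. This is a minimal sketch; the name choose_mode and the wants_many_outputs flag are hypothetical and not part of the bundled CLI.

```python
def choose_mode(prompts, wants_many_outputs=False):
    """Return "batch" or "single" following the decision tree above."""
    # Ignore blank lines so stray whitespace does not trigger batch mode.
    lines = [p for p in prompts if p.strip()]
    if len(lines) > 1 or wants_many_outputs:
        return "batch"
    return "single"
```

For example, choose_mode(["Welcome to the demo."]) selects single mode, while passing two or more non-empty prompts selects batch mode.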
Workflow
- Decide intent: single vs batch (see decision tree above).
- Collect inputs up front: exact text (verbatim), desired voice, delivery style, format, and any constraints.
- If batch: write a temporary JSONL under tmp/ (one job per line), run once, then delete the JSONL.
- Augment instructions into a short labeled spec without rewriting the input text.
- Run the bundled CLI (scripts/text_to_speech.py) with sensible defaults (see references/cli.md).
- For important clips, validate: intelligibility, pacing, pronunciation, and adherence to constraints.
- Iterate with a single targeted change (voice, speed, or instructions), then re-check.
- Save/return final outputs and note the final text + instructions + flags used.
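The "labeled spec" and "one job per line" steps can be sketched as follows. The labels (Tone, Pacing) and the JSONL field names (input, voice, instructions, out) are assumptions made for illustration; the actual schema expected by scripts/text_to_speech.py is documented in references/cli.md.

```python
import json


def build_instructions(spec):
    """Render delivery notes as short 'Label: value' lines, skipping empty ones."""
    return "\n".join(f"{label}: {value}" for label, value in spec.items() if value)


def make_job(text, voice, spec, out_name):
    """Build one batch job; the input text is stored verbatim, never rewritten."""
    return {
        "input": text,  # hypothetical field names throughout
        "voice": voice,
        "instructions": build_instructions(spec),
        "out": out_name,
    }


def to_jsonl(jobs):
    """Serialize jobs as JSONL: one JSON object per line."""
    return "".join(json.dumps(job, ensure_ascii=False) + "\n" for job in jobs)
```

Keeping the delivery notes in a separate instructions field is what lets the spoken text stay untouched while the style is iterated on one change at a time.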
Temp and output conventions
- Use tmp/speech/ for intermediate files (for example JSONL batches); delete when done.
- Write final artifacts under output/speech/ when working in this repo.
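One way to follow these conventions from Python is a context manager that cleans up the batch file automatically. This is a sketch under stated assumptions: the helper names, the root parameter, and the default file name batch.jsonl are hypothetical.

```python
import contextlib
from pathlib import Path


@contextlib.contextmanager
def temp_batch_file(jsonl_text, root=".", name="batch.jsonl"):
    """Write a batch file under tmp/speech/ and delete it on exit."""
    tmp_dir = Path(root) / "tmp" / "speech"
    tmp_dir.mkdir(parents=True, exist_ok=True)
    path = tmp_dir / name
    path.write_text(jsonl_text, encoding="utf-8")
    try:
        yield path
    finally:
        # Remove the intermediate file even if the run fails (Python 3.8+).
        path.unlink(missing_ok=True)


def output_path(filename, root="."):
    """Return the final artifact path under output/speech/, creating the folder."""
    out_dir = Path(root) / "output" / "speech"
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / filename
```

The batch file exists only inside the with block, so a crashed run does not leave stale JSONL behind.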