Technical & Development | Intermediate

transcribe

Transcribe audio files to text with optional speaker diarization

Developer Setup

Setup & Installation

bash

npx skills add https://github.com/openai/skills --skill transcribe

Overview

Transcribe audio files to text with optional speaker diarization

Application

Integrating transcribe into your development workflow.
Following best practices for transcribing audio files to text with optional speaker diarization.
Automating repetitive tasks with AI-assisted tooling.
Building production-grade applications with proper standards.
Debugging and troubleshooting common implementation issues.

Documentation

Transcribe audio using OpenAI, with optional speaker diarization when requested. Prefer the bundled CLI for deterministic, repeatable runs.

Collect inputs: audio file path(s), desired response format (text/json/diarized_json), optional language hint, and any known speaker references.
Verify OPENAI_API_KEY is set. If missing, ask the user to set it locally (do not ask them to paste the key).
Run the bundled transcribe_diarize.py CLI with sensible defaults (fast text transcription).
Validate the output: transcription quality, speaker labels, and segment boundaries; iterate with a single targeted change if needed.
Save outputs under output/transcribe/ when working in this repo.
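
The key check and the output location from the steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of the bundled CLI: the helper names api_key_is_set and output_path_for and the mapping from response format to file extension are assumptions made for the example. Only the OPENAI_API_KEY variable, the three response formats, and the output/transcribe/ directory come from the steps themselves.

```python
import os
from pathlib import Path

# Directory named in the workflow steps.
OUTPUT_DIR = Path("output/transcribe")

# Assumed mapping from response format to file extension (not from the skill).
EXTENSIONS = {"text": ".txt", "json": ".json", "diarized_json": ".json"}


def api_key_is_set(env=None):
    """Return True if OPENAI_API_KEY is present and non-empty.

    The value itself is never returned or printed.
    """
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


def output_path_for(audio_path, response_format="text"):
    """Map an audio file to a transcript path under output/transcribe/."""
    suffix = EXTENSIONS[response_format]
    return OUTPUT_DIR / (Path(audio_path).stem + suffix)


if __name__ == "__main__":
    if not api_key_is_set():
        # Ask the user to set the key locally rather than paste it.
        print("OPENAI_API_KEY is not set; export it in your shell and retry.")
    print(output_path_for("recordings/meeting.mp3"))
```

Accepting an optional env mapping keeps the check testable without touching the real environment, and it avoids ever echoing the key.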

Default to gpt-4o-mini-transcribe with --response-format text for fast transcription.
If the user wants speaker labels or diarization, use --model gpt-4o-transcribe-diarize --response-format diarized_json.
If audio is longer than ~30 seconds, keep --chunking-strategy auto.
Prompting is not supported for gpt-4o-transcribe-diarize.
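
The rules above amount to a small decision table, which might be encoded as follows. This is a sketch under stated assumptions: the function name build_cli_options is hypothetical, and the example command line assumes the script takes the audio path as a positional argument, which the text above does not confirm. The model names and the --model, --response-format, and --chunking-strategy flags are taken from the rules.

```python
import shlex


def build_cli_options(diarize=False):
    """Return option flags for transcribe_diarize.py per the rules above."""
    if diarize:
        # Speaker labels requested: diarization model and diarized output.
        # Note: prompting is not supported for this model, so none is passed.
        model, response_format = "gpt-4o-transcribe-diarize", "diarized_json"
    else:
        # Default: fast plain-text transcription.
        model, response_format = "gpt-4o-mini-transcribe", "text"
    return [
        "--model", model,
        "--response-format", response_format,
        # Kept at "auto" so audio longer than ~30 seconds is chunked.
        "--chunking-strategy", "auto",
    ]


if __name__ == "__main__":
    # Assumed invocation shape: script name and positional audio path
    # are illustrative only.
    command = ["python", "transcribe_diarize.py", "meeting.mp3"]
    command += build_cli_options(diarize=True)
    print(shlex.join(command))
```

Returning a list of arguments rather than a single string keeps the options safe to pass to subprocess.run without shell quoting issues.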

Prefer uv for dependency management.

uv pip install openai
