version: "1.0.1"
name: openai-whisper-api description: Transcribe audio via OpenAI Audio Transcriptions API (Whisper). homepage: https://platform.openai.com/docs/guides/speech-to-text metadata: { "kabot": { "emoji": "â˜ï¸", "requires": { "bins": ["curl"], "env": ["OPENAI_API_KEY"] }, "primaryEnv": "OPENAI_API_KEY", }, }
OpenAI Whisper API (curl)
Transcribe an audio file via OpenAI’s /v1/audio/transcriptions endpoint.
Quick start
bash
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a
Defaults:
- Model:
whisper-1 - Output:
<input>.txt
Useful flags
bash
{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --model whisper-1 --out /tmp/transcript.txt
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --language en
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --prompt "Speaker names: Peter, Daniel"
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --json --out /tmp/transcript.json
API key
Set OPENAI_API_KEY, or configure it in ~/.kabot/kabot.json:
json5
{
skills: {
"openai-whisper-api": {
apiKey: "OPENAI_KEY_HERE",
},
},
}