Speechall CLI
Install and use the speechall CLI tool for speech-to-text transcription.
- Rating: 4.8 (25 reviews)
- Downloads: 23,690
- Version: 1.0.0
speechall-cli
CLI for speech-to-text transcription via the Speechall API. Supports multiple providers (OpenAI, Deepgram, AssemblyAI, Google, Gemini, Groq, ElevenLabs, Cloudflare, and more).
Installation
Homebrew (macOS and Linux)
```
brew install Speechall/tap/speechall
```
Without Homebrew: Download the binary for your platform from https://github.com/Speechall/speechall-cli/releases and place it on your PATH.
Verify
```
speechall --version
```
Authentication
An API key is required. Provide it via environment variable (preferred) or flag:

```
export SPEECHALL_API_KEY="your-key-here"
# or
speechall --api-key "your-key-here" audio.wav
```

You can create an API key at https://speechall.com/console/api-keys.
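To avoid re-exporting the key in each new session, the export line can be appended to a shell startup file. This is a bash-specific sketch; the `~/.bashrc` path is an assumption, so adjust it for your shell:

```shell
# Persist the API key across sessions (bash example; replace the
# placeholder with your real key from the Speechall console).
echo 'export SPEECHALL_API_KEY="your-key-here"' >> ~/.bashrc
```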
Commands
transcribe (default)
Transcribe an audio or video file. This is the default subcommand — speechall audio.wav is equivalent to speechall transcribe audio.wav.
```
speechall <file> [options]
```
Options:
| Flag | Description | Default |
|---|---|---|
| `--model <provider.model>` | STT model identifier | `openai.gpt-4o-mini-transcribe` |
| `--language <code>` | Language code (e.g. `en`, `tr`, `de`) | API default (auto-detect) |
| `--output-format <format>` | Output format (`text`, `json`, `verbose_json`, `srt`, `vtt`) | API default |
| `--diarization` | Enable speaker diarization | off |
| `--speakers-expected <n>` | Expected number of speakers (use with `--diarization`) | — |
| `--no-punctuation` | Disable automatic punctuation | — |
| `--temperature <0.0-1.0>` | Model temperature | — |
| `--initial-prompt <text>` | Text prompt to guide model style | — |
| `--custom-vocabulary <term>` | Terms to boost recognition (repeatable) | — |
| `--ruleset-id <uuid>` | Replacement ruleset UUID | — |
| `--api-key <key>` | API key (overrides `SPEECHALL_API_KEY` env var) | — |
```
# Basic transcription
speechall interview.mp3

# Specific model and language
speechall call.wav --model deepgram.nova-2 --language en

# Speaker diarization with SRT output
speechall meeting.wav --diarization --speakers-expected 3 --output-format srt

# Custom vocabulary for domain-specific terms
speechall medical.wav --custom-vocabulary "myocardial" --custom-vocabulary "infarction"

# Transcribe a video file (macOS extracts audio automatically)
speechall presentation.mp4
```
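Because the transcript is written to stdout, the command composes naturally in scripts. A minimal batch-processing sketch, assuming a directory of `.wav` files and `SPEECHALL_API_KEY` set in the environment:

```shell
# Transcribe every .wav in the current directory, writing one
# .txt transcript per input file (assumes SPEECHALL_API_KEY is set).
for f in *.wav; do
  speechall "$f" > "${f%.wav}.txt"
done
```

The `${f%.wav}` parameter expansion strips the extension, so `call.wav` produces `call.txt`.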
models
List available speech-to-text models. Outputs JSON to stdout. Filters combine with AND logic.
```
speechall models [options]
```
Filter flags:
| Flag | Description |
|---|---|
| `--provider <name>` | Filter by provider (e.g. `openai`, `deepgram`) |
| `--language <code>` | Filter by supported language (`tr` matches `tr`, `tr-TR`, `tr-CY`) |
| `--diarization` | Only models supporting speaker diarization |
| `--srt` | Only models supporting SRT output |
| `--vtt` | Only models supporting VTT output |
| `--punctuation` | Only models supporting automatic punctuation |
| `--streamable` | Only models supporting real-time streaming |
| `--vocabulary` | Only models supporting custom vocabulary |
```
# List all available models
speechall models

# Models from a specific provider
speechall models --provider deepgram

# Models that support Turkish and diarization
speechall models --language tr --diarization

# Pipe to jq for specific fields
speechall models --provider openai | jq '.[].identifier'
```
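Since `models` emits JSON, its output can drive `transcribe` in a pipeline. A sketch, assuming `jq` is installed and that each model object exposes an `identifier` field as in the `jq` example above:

```shell
# Pick the first diarization-capable model for Turkish, then use it
# to transcribe a meeting recording with speaker labels.
model=$(speechall models --language tr --diarization | jq -r '.[0].identifier')
speechall meeting.wav --model "$model" --language tr --diarization
```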
Tips
- On macOS, video files (`.mp4`, `.mov`, etc.) are automatically converted to audio before upload.
- On Linux, pass audio files directly (`.wav`, `.mp3`, `.m4a`, `.flac`, etc.).
- Output goes to stdout. Redirect to save: `speechall audio.wav > transcript.txt`
- Errors go to stderr, so piping stdout is safe.
- Run `speechall --help`, `speechall transcribe --help`, or `speechall models --help` to see all valid enum values for model identifiers, language codes, and output formats.
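The stdout/stderr split noted above means the transcript and any diagnostics can be captured independently:

```shell
# Keep the transcript and any error output in separate files.
speechall audio.wav > transcript.txt 2> errors.log
```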
Installation
```
openclaw install speechall-cli
```