Speechall CLI
Install and use the speechall CLI tool for speech-to-text transcription.
- Rating: 4.8 (25 reviews)
- Downloads: 23,690
- Version: 1.0.0
speechall-cli
CLI for speech-to-text transcription via the Speechall API. Supports multiple providers (OpenAI, Deepgram, AssemblyAI, Google, Gemini, Groq, ElevenLabs, Cloudflare, and more).
Installation
Homebrew (macOS and Linux)
```
brew install Speechall/tap/speechall
```
Without Homebrew: Download the binary for your platform from https://github.com/Speechall/speechall-cli/releases and place it on your PATH.
Verify
```
speechall --version
```
Authentication
An API key is required. Provide it via environment variable (preferred) or flag:

```
export SPEECHALL_API_KEY="your-key-here"
# or
speechall --api-key "your-key-here" audio.wav
```

You can create an API key at https://speechall.com/console/api-keys.
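To avoid re-exporting the key in each new session, the export line can be appended to a shell startup file. This is a bash-specific sketch; the `~/.bashrc` path is an assumption, so adjust it for your shell:

```shell
# Persist the API key across sessions (bash example; replace the
# placeholder with your real key from the Speechall console).
echo 'export SPEECHALL_API_KEY="your-key-here"' >> ~/.bashrc
```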
Commands
transcribe (default)
Transcribe an audio or video file. This is the default subcommand — speechall audio.wav is equivalent to speechall transcribe audio.wav.
```
speechall <file> [options]
```
Options:
| Flag | Description | Default |
|---|---|---|
| `--model <provider.model>` | STT model identifier | `openai.gpt-4o-mini-transcribe` |
| `--language <code>` | Language code (e.g. `en`, `tr`, `de`) | API default (auto-detect) |
| `--output-format <format>` | Output format (`text`, `json`, `verbose_json`, `srt`, `vtt`) | API default |
| `--diarization` | Enable speaker diarization | off |
| `--speakers-expected <n>` | Expected number of speakers (use with `--diarization`) | — |
| `--no-punctuation` | Disable automatic punctuation | — |
| `--temperature <0.0-1.0>` | Model temperature | — |
| `--initial-prompt <text>` | Text prompt to guide model style | — |
| `--custom-vocabulary <term>` | Terms to boost recognition (repeatable) | — |
| `--ruleset-id <uuid>` | Replacement ruleset UUID | — |
| `--api-key <key>` | API key (overrides `SPEECHALL_API_KEY` env var) | — |
```
# Basic transcription
speechall interview.mp3

# Specific model and language
speechall call.wav --model deepgram.nova-2 --language en

# Speaker diarization with SRT output
speechall meeting.wav --diarization --speakers-expected 3 --output-format srt

# Custom vocabulary for domain-specific terms
speechall medical.wav --custom-vocabulary "myocardial" --custom-vocabulary "infarction"

# Transcribe a video file (macOS extracts audio automatically)
speechall presentation.mp4
```
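Because the transcript is written to stdout, the command composes naturally in scripts. A minimal batch-processing sketch, assuming a directory of `.wav` files and `SPEECHALL_API_KEY` set in the environment:

```shell
# Transcribe every .wav in the current directory, writing one
# .txt transcript per input file (assumes SPEECHALL_API_KEY is set).
for f in *.wav; do
  speechall "$f" > "${f%.wav}.txt"
done
```

The `${f%.wav}` parameter expansion strips the extension, so `call.wav` produces `call.txt`.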
models
List available speech-to-text models. Outputs JSON to stdout. Filters combine with AND logic.
```
speechall models [options]
```
Filter flags:
| Flag | Description |
|---|---|
| `--provider <name>` | Filter by provider (e.g. `openai`, `deepgram`) |
| `--language <code>` | Filter by supported language (`tr` matches `tr`, `tr-TR`, `tr-CY`) |
| `--diarization` | Only models supporting speaker diarization |
| `--srt` | Only models supporting SRT output |
| `--vtt` | Only models supporting VTT output |
| `--punctuation` | Only models supporting automatic punctuation |
| `--streamable` | Only models supporting real-time streaming |
| `--vocabulary` | Only models supporting custom vocabulary |
```
# List all available models
speechall models

# Models from a specific provider
speechall models --provider deepgram

# Models that support Turkish and diarization
speechall models --language tr --diarization

# Pipe to jq for specific fields
speechall models --provider openai | jq '.[].identifier'
```
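Since `models` emits JSON, its output can drive `transcribe` in a pipeline. A sketch, assuming `jq` is installed and that each model object exposes an `identifier` field as in the `jq` example above:

```shell
# Pick the first diarization-capable model for Turkish, then use it
# to transcribe a meeting recording with speaker labels.
model=$(speechall models --language tr --diarization | jq -r '.[0].identifier')
speechall meeting.wav --model "$model" --language tr --diarization
```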
Tips
- On macOS, video files (`.mp4`, `.mov`, etc.) are automatically converted to audio before upload.
- On Linux, pass audio files directly (`.wav`, `.mp3`, `.m4a`, `.flac`, etc.).
- Output goes to stdout. Redirect to save: `speechall audio.wav > transcript.txt`
- Errors go to stderr, so piping stdout is safe.
- Run `speechall --help`, `speechall transcribe --help`, or `speechall models --help` to see all valid enum values for model identifiers, language codes, and output formats.
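The stdout/stderr split noted above means the transcript and any diagnostics can be captured independently:

```shell
# Keep the transcript and any error output in separate files.
speechall audio.wav > transcript.txt 2> errors.log
```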
Installation
```
openclaw install speechall-cli
```