Sam Tts
- Rating: 4.4 (86 reviews)
- Downloads: 9,632
- Version: 1.0.0
Overview
Generate retro robotic speech audio using SAM (Software Automatic Mouth), the classic C64 text-to-speech synthesizer.
SAM TTS - Software Automatic Mouth
Generate WAV audio files using the classic SAM text-to-speech engine -- the iconic robotic voice from the Commodore 64 era.
Requirements
- Node.js 18+
- Run npm install in the skill directory to install dependencies
SAM Mode Toggle
State file: memory/sam-mode.json
/sam on -- Enable SAM Mode
When SAM mode is enabled, ALL text responses are converted to SAM voice messages.
Implementation:
- Set enabled: true in memory/sam-mode.json
- Confirm with voice message: "SAM mode enabled. I will now speak in robotic voice."
/sam off -- Disable SAM Mode
Return to normal text-to-text communication.
Implementation:
- Set enabled: false in memory/sam-mode.json
- Confirm with text: "SAM mode disabled. Back to text."
Check current mode
Read memory/sam-mode.json at session start to know the current state.
Response Behavior
When SAM mode is ON:
- Generate response text as normal
- Convert to SAM TTS: node scripts/sam-tts-wrapper.js "response" --output=/tmp/sam-XXX.wav --quiet
- Send the generated WAV file as audio output
- Include brief text caption if helpful
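The ON/OFF branching described here could be sketched as a small pure function. This is a minimal sketch, not the skill's actual code: `planResponse`, the returned object shape, and the `id` argument (standing in for the `XXX` part of the output path) are all hypothetical.

```javascript
// Hypothetical sketch: decide how a response should be delivered.
// `state` mirrors memory/sam-mode.json; `id` is any unique token the caller
// picks for the temporary file name (an assumption, not part of the skill).
function planResponse(text, state, id) {
  if (!state.enabled) {
    // SAM mode OFF: plain text, the default behavior.
    return { kind: "text", text };
  }
  // SAM mode ON: the caller runs the wrapper with this output path and then
  // sends the resulting WAV, optionally with `text` as a short caption.
  return { kind: "audio", text, outputPath: `/tmp/sam-${id}.wav` };
}
```

Keeping the decision separate from the synthesis call makes it easy to check the branching without generating audio.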
When SAM mode is OFF:
Respond with normal text (default behavior).
Chat Commands
/sam
Generate a one-time voice message using SAM TTS (works regardless of SAM mode state).
Implementation:
- Extract text after /sam
- Generate WAV: node scripts/sam-tts-wrapper.js "text" --output=/tmp/sam-XXX.wav --quiet
- Return the WAV file as audio output
/sam on
Enable SAM mode for all responses.
/sam off
Disable SAM mode.
/sam status
Report current SAM mode state (text response).
Voice Parameters
All parameters take integer values in the 0-255 range, except speed, whose command accepts 1-255. Store defaults in memory/sam-mode.json:
| Parameter | Default | Effect |
|---|---|---|
| pitch | 64 | Voice pitch (higher = higher pitch) |
| speed | 72 | Speech speed (lower = faster) |
| mouth | 128 | Mouth cavity size (affects resonance) |
| throat | 128 | Throat size (affects timbre) |
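A value supplied by the user could be checked against these ranges before it is stored. The following is a minimal sketch under stated assumptions: `parseParam` and the `MINIMUMS` table are hypothetical, the lower bound of 1 for speed is taken from the `/sam speed` command description, and how the real scripts treat out-of-range values is not documented here.

```javascript
// Defaults copied from the parameter table.
const DEFAULTS = { pitch: 64, speed: 72, mouth: 128, throat: 128 };
// Assumption: speed starts at 1 (per the /sam speed command), the rest at 0.
const MINIMUMS = { pitch: 0, speed: 1, mouth: 0, throat: 0 };

// Parse one parameter value. Returns the integer, or throws a RangeError when
// the name is unknown, the input is not a whole number, or it is out of range.
function parseParam(name, raw) {
  if (!Object.hasOwn(DEFAULTS, name)) {
    throw new RangeError(`unknown parameter: ${name}`);
  }
  const text = String(raw).trim();
  if (!/^\d+$/.test(text)) {
    throw new RangeError(`${name} must be a whole number`);
  }
  const value = Number(text);
  if (value < MINIMUMS[name] || value > 255) {
    throw new RangeError(`${name} must be between ${MINIMUMS[name]} and 255`);
  }
  return value;
}
```

Rejecting bad input, rather than clamping it, keeps the stored state identical to what the user asked for; clamping would be an equally reasonable choice.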
/sam pitch
Set pitch parameter (0-255).
/sam speed
Set speed parameter (1-255, lower is faster).
/sam mouth
Set mouth parameter (0-255).
/sam throat
Set throat parameter (0-255).
Scripts
scripts/sam-tts-wrapper.js
Primary wrapper script. Outputs JSON metadata for automation.
node scripts/sam-tts-wrapper.js "Hello world" --output=/tmp/out.wav --quiet
node scripts/sam-tts-wrapper.js "Hello world" --output=/tmp/out.wav --quiet --pitch=80 --speed=60
Options:
- --output=PATH (required) - Output WAV file path
- --quiet - Suppress debug output, output only JSON
- --pitch=N, --speed=N, --mouth=N, --throat=N - Voice parameters
- --phonetic - Input is phonetic notation
Output format:
{"success":true,"outputPath":"/tmp/sam.wav","duration":1.44,"size":31741}
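For automation, the wrapper could be driven from Node.js roughly as follows. This is a sketch, not the skill's own integration code: `buildWrapperArgs`, `parseWrapperOutput`, and `synthesize` are hypothetical helpers, and two things are assumed rather than documented: that `--quiet` leaves exactly one JSON object on stdout, and that a failed run reports `"success": false`.

```javascript
const { execFileSync } = require("node:child_process");

// Build the argument list from the documented flags only.
function buildWrapperArgs(text, outputPath, params = {}) {
  const args = ["scripts/sam-tts-wrapper.js", text, `--output=${outputPath}`, "--quiet"];
  for (const name of ["pitch", "speed", "mouth", "throat"]) {
    if (params[name] !== undefined) args.push(`--${name}=${params[name]}`);
  }
  if (params.phonetic) args.push("--phonetic");
  return args;
}

// Parse the JSON metadata; assumes stdout holds a single JSON object.
function parseWrapperOutput(stdout) {
  const result = JSON.parse(stdout);
  if (!result.success) {
    // Assumption: failures are signalled through the `success` field.
    throw new Error("SAM TTS wrapper reported failure");
  }
  return result;
}

// Run the wrapper from the skill directory. execFileSync passes the text as a
// single argv entry, so no shell quoting or escaping is involved.
function synthesize(text, outputPath, params) {
  const stdout = execFileSync("node", buildWrapperArgs(text, outputPath, params), {
    encoding: "utf8",
  });
  return parseWrapperOutput(stdout);
}
```

A call such as `synthesize("Hello world", "/tmp/out.wav", { pitch: 80, speed: 60 })` would then return the metadata object with `outputPath`, `duration`, and `size`.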
scripts/sam-tts.js
Standalone CLI tool with human-readable output.
node scripts/sam-tts.js "Hello world" output.wav --pitch=80 --speed=60
State Management
File: memory/sam-mode.json
{
  "enabled": false,
  "pitch": 64,
  "speed": 72,
  "mouth": 128,
  "throat": 128
}
Read at session start. Update when user toggles mode or changes parameters. Create the memory/ directory if it doesn't exist.
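Reading and updating the state file could look like the sketch below. `loadState` and `saveState` are hypothetical names; falling back to the documented defaults when the file is missing is an assumption about the desired behavior, not something the skill specifies.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Defaults as documented for memory/sam-mode.json.
const DEFAULT_STATE = { enabled: false, pitch: 64, speed: 72, mouth: 128, throat: 128 };

// Load the state, filling in any missing keys from the defaults.
// A missing file is treated as "never configured" (assumption).
function loadState(file) {
  try {
    return { ...DEFAULT_STATE, ...JSON.parse(fs.readFileSync(file, "utf8")) };
  } catch (err) {
    if (err.code === "ENOENT") return { ...DEFAULT_STATE };
    throw err; // malformed JSON or permission problems should surface
  }
}

// Write the state, creating the memory/ directory if it does not exist.
function saveState(file, state) {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, JSON.stringify(state, null, 2) + "\n");
}
```

Toggling the mode is then a read-modify-write: `saveState(file, { ...loadState(file), enabled: true })`.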
Examples
Enable SAM mode
User: /sam on
Agent: [Voice: "SAM mode enabled. I will now speak in robotic voice."]
Normal conversation in SAM mode
User: "What's the weather?"
Agent: [Voice: "Current temperature is 72 degrees with partly cloudy skies."]
Disable SAM mode
User: /sam off
Agent: SAM mode disabled. Back to text.
One-time voice (even when mode is off)
User: /sam Hello there
Agent: [Voice: "Hello there"]
Custom voice parameters
User: /sam pitch 100
Agent: Pitch set to 100.
User: /sam Testing higher pitch
Agent: [Voice with pitch=100: "Testing higher pitch"]
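The commands used in these examples could be routed by a small parser like the one below. It is a hypothetical sketch: `parseSamCommand` and the action names are illustrative, and the precedence rules (`on`, `off`, `status`, and `<parameter> <number>` win over free text) are an assumption about how ambiguous input should be read.

```javascript
const PARAMS = ["pitch", "speed", "mouth", "throat"];

// Turn a chat message into an action object, or null if it is not a /sam command.
function parseSamCommand(message) {
  const match = /^\/sam(?:\s+([\s\S]*))?$/.exec(message.trim());
  if (!match) return null;
  const rest = (match[1] || "").trim();
  if (rest === "") return { action: "invalid", reason: "no text given" };
  if (rest === "on") return { action: "enable" };
  if (rest === "off") return { action: "disable" };
  if (rest === "status") return { action: "status" };

  // "<parameter> <number>" sets a voice parameter; range checks happen elsewhere.
  const [first, second, ...extra] = rest.split(/\s+/);
  if (PARAMS.includes(first) && extra.length === 0 && /^\d+$/.test(second || "")) {
    return { action: "set", param: first, value: Number(second) };
  }

  // Anything else is text for a one-time voice message.
  return { action: "speak", text: rest };
}
```

With these rules, `/sam pitch 100` sets the pitch while `/sam pitch perfect` is spoken aloud, and a message such as `/samurai` is not treated as a command at all.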
Phonetic Notation
For precise pronunciation, use --phonetic flag:
- Vowels: IY (bee), IH (bit), EY (bay), AE (bat), AA (father), AH (bought), AO (hot), OW (boat), UH (book), UW (boot), ER (bird), AX (about)
- Numbers 1-8 indicate stress: HEH4LOW (the digit follows the stressed vowel, so the first syllable is emphasized)
See references/phonemes.md for the full phoneme chart.
Output Format
- Format: WAV (RIFF/WAVE PCM)
- Sample rate: 22050 Hz
- Bit depth: 8-bit
- Channels: Mono
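With 8-bit mono samples at 22050 Hz, each second of speech takes 22050 bytes, so a clip's duration can be estimated from its file size. The sketch below assumes a canonical 44-byte PCM WAV header with no extra chunks; that header size is an assumption about the files the skill writes, and `estimateDurationSeconds` is a hypothetical helper.

```javascript
const SAMPLE_RATE = 22050;   // samples per second
const BYTES_PER_SAMPLE = 1;  // 8-bit, mono
const HEADER_BYTES = 44;     // assumption: canonical PCM WAV header

// Estimate playback duration in seconds from the WAV file size in bytes.
function estimateDurationSeconds(fileSizeBytes) {
  const dataBytes = Math.max(0, fileSizeBytes - HEADER_BYTES);
  return dataBytes / (SAMPLE_RATE * BYTES_PER_SAMPLE);
}
```

For the wrapper's sample output (`size` 31741), this gives about 1.44 seconds, which agrees with the reported `duration`.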
Directory Structure
sam-tts/
+-- SKILL.md
+-- package.json
+-- scripts/
|   +-- sam-tts-wrapper.js
|   +-- sam-tts.js
+-- references/
|   +-- phonemes.md
+-- memory/
    +-- sam-mode.json
Platform Integration
The skill outputs WAV format by default. Some messaging platforms require different audio formats.
Format Conversion (Optional)
Warning: Before installing ffmpeg or any conversion tools, the agent MUST ask the user for confirmation. Do not auto-install dependencies.
Example agent prompt:
"This platform requires OGG/OPUS format. I need to install ffmpeg for audio conversion. Would you like me to proceed with the installation?"
Only proceed with installation after explicit user approval.
Telegram / WhatsApp (OGG/OPUS)
ffmpeg -i input.wav -c:a libopus -b:a 24k output.ogg
Discord (MP3/OGG)
ffmpeg -i input.wav -c:a libmp3lame -b:a 64k output.mp3
Web / Direct Playback
WAV works directly -- no conversion needed.
Note: The agent should detect the target platform and handle format conversion as needed, but always with user consent for any new dependencies.
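The per-platform commands above could be selected with a lookup table. This is a minimal sketch: `planConversion` and the platform keys (`telegram`, `whatsapp`, `discord`) are hypothetical, and the function only plans the ffmpeg invocation. It never installs or runs anything, so the consent step stays with the agent.

```javascript
// Codec settings copied from the ffmpeg commands in this section.
const TARGETS = {
  telegram: { ext: ".ogg", codec: "libopus", bitrate: "24k" },
  whatsapp: { ext: ".ogg", codec: "libopus", bitrate: "24k" },
  discord: { ext: ".mp3", codec: "libmp3lame", bitrate: "64k" },
};

// Return the ffmpeg invocation for a platform, or null when the WAV can be
// sent as-is (web / direct playback, or a platform this table does not know).
function planConversion(platform, wavPath) {
  const key = String(platform).toLowerCase();
  if (!Object.hasOwn(TARGETS, key)) return null;
  const { ext, codec, bitrate } = TARGETS[key];
  const outputPath = wavPath.replace(/\.wav$/i, "") + ext;
  return {
    command: "ffmpeg",
    args: ["-i", wavPath, "-c:a", codec, "-b:a", bitrate, outputPath],
    outputPath,
  };
}
```

Returning a plan instead of running ffmpeg directly lets the agent show the user exactly what would be executed before asking for approval.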
Credits
Skill by: fourthdensity
Active Dependency: sam-js by discordier
- The npm package used for TTS synthesis (JavaScript/Node.js port)
- SAM by Stefan Macke (C adaptation)
- SAM by Vidar Hokstad (refactoring)
- SAM by 8BitPimp (refactoring)
License Note: The original SAM software is considered abandonware. The JavaScript adaptation is provided as-is. See the sam-js repository for full license details.
Installation
openclaw install sam-tts