EachLabs Voice Audio
TTS, STT, voice conversion using ElevenLabs, Whisper, RVC.
- Rating: 4.4 (361 reviews)
- Downloads: 1,537
- Version: 1.0.0
EachLabs Voice & Audio
Text-to-speech, speech-to-text transcription, voice conversion, and audio utilities via the EachLabs Predictions API.
Authentication
Header: X-API-Key: <your-api-key>
Set the EACHLABS_API_KEY environment variable. Get your key at eachlabs.ai.
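A minimal sketch of a guard that fails early when the key is missing, so a request is never sent with an empty header. The function name `eachlabs_auth_header` is hypothetical and not part of any EachLabs tooling; only the header name and the environment variable come from the text above.

```shell
# Print the auth header line, or fail with a message if the key is missing.
# eachlabs_auth_header is an illustrative name, not an official helper.
eachlabs_auth_header() {
  if [ -z "${EACHLABS_API_KEY:-}" ]; then
    echo "EACHLABS_API_KEY is not set" >&2
    return 1
  fi
  printf 'X-API-Key: %s\n' "$EACHLABS_API_KEY"
}
```

It can be passed straight to curl, for example `curl -H "$(eachlabs_auth_header)" ...`.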
Available Models
Text-to-Speech
| Model | Slug | Best For |
|---|---|---|
| ElevenLabs TTS | elevenlabs-text-to-speech | High quality TTS |
| ElevenLabs TTS w/ Timestamps | elevenlabs-text-to-speech-with-timestamp | TTS with word timing |
| ElevenLabs Text to Dialogue | elevenlabs-text-to-dialogue | Multi-speaker dialogue |
| ElevenLabs Sound Effects | elevenlabs-sound-effects | Sound effect generation |
| ElevenLabs Voice Design v2 | elevenlabs-voice-design-v2 | Custom voice design |
| Kling V1 TTS | kling-v1-tts | Kling text-to-speech |
| Kokoro 82M | kokoro-82m | Lightweight TTS |
| Play AI Dialog | play-ai-text-to-speech-dialog | Dialog TTS |
| Stable Audio 2.5 | stable-audio-2-5-text-to-audio | Text to audio |
Speech-to-Text
| Model | Slug | Best For |
|---|---|---|
| ElevenLabs Scribe v2 | elevenlabs-speech-to-text-scribe-v2 | Best quality transcription |
| ElevenLabs STT | elevenlabs-speech-to-text | Standard transcription |
| Wizper with Timestamp | wizper-with-timestamp | Timestamped transcription |
| Wizper | wizper | Basic transcription |
| Whisper | whisper | Open-source transcription |
| Whisper Diarization | whisper-diarization | Speaker identification |
| Incredibly Fast Whisper | incredibly-fast-whisper | Fastest transcription |
Voice Conversion & Cloning
| Model | Slug | Best For |
|---|---|---|
| RVC v2 | rvc-v2 | Voice conversion |
| Train RVC | train-rvc | Train custom voice model |
| ElevenLabs Voice Clone | elevenlabs-voice-clone | Voice cloning |
| ElevenLabs Voice Changer | elevenlabs-voice-changer | Voice transformation |
| ElevenLabs Voice Design v3 | elevenlabs-voice-design-v3 | Advanced voice design |
| ElevenLabs Dubbing | elevenlabs-dubbing | Video dubbing |
| Chatterbox S2S | chatterbox-speech-to-speech | Speech to speech |
| Open Voice | openvoice | Open-source voice clone |
| XTTS v2 | xtts-v2 | Multi-language voice clone |
| Stable Audio 2.5 Inpaint | stable-audio-2-5-inpaint | Audio inpainting |
| Stable Audio 2.5 A2A | stable-audio-2-5-audio-to-audio | Audio transformation |
| Audio Trimmer | audio-trimmer-with-fade | Audio trimming with fade |
Audio Utilities
| Model | Slug | Best For |
|---|---|---|
| FFmpeg Merge Audio Video | ffmpeg-api-merge-audio-video | Merge audio with video |
| Toolkit Video Convert | toolkit | Video/audio conversion |
Prediction Flow
1. Check model: GET https://api.eachlabs.ai/v1/model?slug=<slug> validates that the model exists and returns the request_schema with exact input parameters. Always do this before creating a prediction to ensure correct inputs.
2. POST https://api.eachlabs.ai/v1/prediction with the model slug, version "0.0.1", and input matching the schema.
3. Poll GET https://api.eachlabs.ai/v1/prediction/{id} until status is "success" or "failed".
4. Extract the output from the response.
Examples
Text-to-Speech with ElevenLabs
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-text-to-speech",
    "version": "0.0.1",
    "input": {
      "text": "Welcome to our product demo. Today we will walk through the key features.",
      "voice_id": "EXAVITQu4vr4xnSDxMaL",
      "model_id": "eleven_v3",
      "stability": 0.5,
      "similarity_boost": 0.7
    }
  }'
Transcription with ElevenLabs Scribe
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-speech-to-text-scribe-v2",
    "version": "0.0.1",
    "input": {
      "media_url": "https://example.com/recording.mp3",
      "diarize": true,
      "timestamps_granularity": "word"
    }
  }'
Transcription with Wizper (Whisper)
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "wizper-with-timestamp",
    "version": "0.0.1",
    "input": {
      "audio_url": "https://example.com/audio.mp3",
      "language": "en",
      "task": "transcribe",
      "chunk_level": "segment"
    }
  }'
Speaker Diarization with Whisper
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "whisper-diarization",
    "version": "0.0.1",
    "input": {
      "file_url": "https://example.com/meeting.mp3",
      "num_speakers": 3,
      "language": "en",
      "group_segments": true
    }
  }'
Voice Conversion with RVC v2
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "rvc-v2",
    "version": "0.0.1",
    "input": {
      "input_audio": "https://example.com/vocals.wav",
      "rvc_model": "CUSTOM",
      "custom_rvc_model_download_url": "https://example.com/my-voice-model.zip",
      "pitch_change": 0,
      "output_format": "wav"
    }
  }'
Merge Audio with Video
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "ffmpeg-api-merge-audio-video",
    "version": "0.0.1",
    "input": {
      "video_url": "https://example.com/video.mp4",
      "audio_url": "https://example.com/narration.mp3",
      "start_offset": 0
    }
  }'
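The requests above differ only in the model slug and the input object, so the shared envelope can be factored out. The following is a sketch under that assumption; `eachlabs_body` and `eachlabs_predict` are hypothetical names, and the slug is inserted without JSON escaping, which is only safe for plain slugs like those in the model tables.

```shell
# Wrap a model slug and an input JSON object in the request envelope
# used by the examples: {"model": ..., "version": "0.0.1", "input": ...}.
# $1 = model slug, $2 = input object as a JSON string (passed through as-is).
eachlabs_body() {
  printf '{"model":"%s","version":"%s","input":%s}\n' "$1" "0.0.1" "$2"
}

# Send the envelope to the prediction endpoint; prints the raw response.
eachlabs_predict() {
  eachlabs_body "$1" "$2" |
    curl -sS -X POST "https://api.eachlabs.ai/v1/prediction" \
      -H "Content-Type: application/json" \
      -H "X-API-Key: $EACHLABS_API_KEY" \
      --data-binary @-
}
```

For instance, `eachlabs_predict wizper '{"audio_url": "https://example.com/audio.mp3"}'` would send a request shaped like the ones above; the `audio_url` parameter for `wizper` is a guess here and should be confirmed against that model's request_schema.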
ElevenLabs Voice IDs
The elevenlabs-text-to-speech model supports these voice IDs. Pass the raw ID string:
| Voice ID | Notes |
|---|---|
| EXAVITQu4vr4xnSDxMaL | Default voice |
| 9BWtsMINqrJLrRacOk9x | — |
| CwhRBWXzGAHq8TQ4Fs17 | — |
| FGY2WhTYpPnrIDTdsKH5 | — |
| JBFqnCBsd6RMkjVDRZzb | — |
| N2lVS1w4EtoT3dr4eOWO | — |
| TX3LPaxmHKxFdv7VOQHJ | — |
| XB0fDUnXU5powFXDhCwa | — |
| onwK4e9ZLuTAKqWW03F9 | — |
| pFZP5JQG7iQjIQuC4Bku | — |
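A small sketch of a local check that a voice ID is one of those in the table before a request is sent. The function name `is_listed_voice_id` is hypothetical, and the list is copied from the table above, so it would need updating if the supported set changes.

```shell
# Return 0 if $1 is one of the voice IDs listed in the table, 1 otherwise.
is_listed_voice_id() {
  case $1 in
    EXAVITQu4vr4xnSDxMaL|9BWtsMINqrJLrRacOk9x|CwhRBWXzGAHq8TQ4Fs17|\
    FGY2WhTYpPnrIDTdsKH5|JBFqnCBsd6RMkjVDRZzb|N2lVS1w4EtoT3dr4eOWO|\
    TX3LPaxmHKxFdv7VOQHJ|XB0fDUnXU5powFXDhCwa|onwK4e9ZLuTAKqWW03F9|\
    pFZP5JQG7iQjIQuC4Bku)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

Usage: `is_listed_voice_id "$VOICE_ID" || echo "unknown voice ID" >&2`.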
Parameter Reference
See references/MODELS.md for complete parameter details for each model.
Installation
openclaw install eachlabs-voice-audio