✓ Verified ✍️ Content Creation ✓ Enhanced Data

Clonev

Clone any voice and generate speech using Coqui XTTS v2.

Rating
3.9 (71 reviews)
Downloads
2,219 downloads
Version
1.0.0

Overview

Clone any voice and generate speech using Coqui XTTS v2.

Key Features

1

Get the required inputs

2

Run the script

3

Use the output

Complete Documentation

View Source →

CloneV Skill - Voice Cloning Made Simple

⚠️ CRITICAL INSTRUCTIONS FOR AI MODELS

DO NOT try to use Docker containers directly. DO NOT try to interact with coqui-xtts container - it is broken and restarting. DO NOT try to use APIs or servers.

ONLY USE THE SCRIPT: scripts/clonev.sh

The script handles everything automatically. Just call it with text, voice sample, and language.


What This Skill Does

Clones any voice from a short audio sample and generates new speech in that voice.

Input:

  • Text to speak
  • Voice sample (WAV file, 6-30 seconds)
  • Language code
Output: OGG voice file (cloned voice speaking the text)

Works with: Any voice! Yours, a celebrity, a character, etc.


The ONE Command You Need

bash
$(scripts/clonev.sh "Your text here" /path/to/voice_sample.wav language)

That's it! Nothing else needed.


Step-by-Step Usage (FOR AI MODELS)

Step 1: Get the required inputs

  • Text to speak (from user)
  • Path to voice sample WAV file (from user)
  • Language code (from user or default to en)

Step 2: Run the script

bash
VOICE_FILE=$(scripts/clonev.sh "TEXT_HERE" "/path/to/sample.wav" LANGUAGE)

Step 3: Use the output

The variable $VOICE_FILE now contains the path to the generated OGG file.


Complete Working Examples

Example 1: Clone voice and send to Telegram

bash
# Generate cloned voice
VOICE=$(/home/bernie/clawd/skills/clonev/scripts/clonev.sh "Hello, this is my cloned voice!" "/mnt/c/TEMP/Recording 25.wav" en)

# Send to Telegram (as voice message)
message action=send channel=telegram asVoice=true filePath="$VOICE"

Example 2: Clone voice in Czech

bash
# Generate Czech voice
VOICE=$(/home/bernie/clawd/skills/clonev/scripts/clonev.sh "Ahoj, tohle je můj hlas" "/mnt/c/TEMP/Recording 25.wav" cs)

# Send
message action=send channel=telegram asVoice=true filePath="$VOICE"

Example 3: Full workflow with check

bash
#!/bin/bash

# Generate voice
VOICE=$(/home/bernie/clawd/skills/clonev/scripts/clonev.sh "Task completed!" "/path/to/sample.wav" en)

# Verify file was created
if [ -f "$VOICE" ]; then
    echo "Success! Voice file: $VOICE"
    ls -lh "$VOICE"
else
    echo "Error: Voice file not created"
fi


Common Language Codes

CodeLanguageExample Usage
enEnglishscripts/clonev.sh "Hello" sample.wav en
csCzechscripts/clonev.sh "Ahoj" sample.wav cs
deGermanscripts/clonev.sh "Hallo" sample.wav de
frFrenchscripts/clonev.sh "Bonjour" sample.wav fr
esSpanishscripts/clonev.sh "Hola" sample.wav es
Full list: en, cs, de, fr, es, it, pl, pt, tr, ru, nl, ar, zh, ja, hu, ko


Voice Sample Requirements

  • Format: WAV file
  • Length: 6-30 seconds (optimal: 10-15 seconds)
  • Quality: Clear audio, no background noise
  • Content: Any speech (the actual words don't matter)
Good samples:
  • ✅ Recording of someone speaking clearly
  • ✅ No music or noise in background
  • ✅ Consistent volume
Bad samples:
  • ❌ Music or songs
  • ❌ Heavy background noise
  • ❌ Very short (< 6 seconds)
  • ❌ Very long (> 30 seconds)

⚠️ Important Notes

Model Download

  • First use downloads ~1.87GB model (one-time)
  • Model is stored at: /mnt/c/TEMP/Docker-containers/coqui-tts/models-xtts/
  • Status: ✅ Already downloaded

Processing Time

  • Takes 20-40 seconds depending on text length
  • This is normal - voice cloning is computationally intensive

Troubleshooting

"Command not found"

Make sure you're in the skill directory or use full path:
bash
/home/bernie/clawd/skills/clonev/scripts/clonev.sh "text" sample.wav en

"Voice sample not found"

  • Check the path to the WAV file
  • Use absolute paths (starting with /)
  • Ensure file exists: ls -la /path/to/sample.wav

"Model not found"

The model should auto-download. If not:
bash
cd /mnt/c/TEMP/Docker-containers/coqui-tts
docker run --rm --entrypoint "" \
  -v $(pwd)/models-xtts:/root/.local/share/tts \
  ghcr.io/coqui-ai/tts:latest \
  python3 -c "from TTS.api import TTS; TTS('tts_models/multilingual/multi-dataset/xtts_v2')"

Poor voice quality

  • Use clearer voice sample
  • Ensure no background noise
  • Try different sample (some voices clone better)

Quick Reference Card (FOR AI MODELS)

text
USER: "Clone my voice and say 'hello'"
→ Get: sample path, text="hello", language="en"
→ Run: VOICE=$(/home/bernie/clawd/skills/clonev/scripts/clonev.sh "hello" "/path/to/sample.wav" en)
→ Result: $VOICE contains path to OGG file
→ Send: message action=send channel=telegram asVoice=true filePath="$VOICE"

text
USER: "Make me speak Czech"
→ Get: sample path, text="Ahoj", language="cs"  
→ Run: VOICE=$(/home/bernie/clawd/skills/clonev/scripts/clonev.sh "Ahoj" "/path/to/sample.wav" cs)
→ Send: message action=send channel=telegram asVoice=true filePath="$VOICE"


Output Location

Generated files are saved to:

text
/mnt/c/TEMP/Docker-containers/coqui-tts/output/clonev_output.ogg

The script returns this path, so you can use it directly.


Summary

  • ONLY use the script: scripts/clonev.sh
  • NEVER try to use Docker containers directly
  • NEVER try to interact with the coqui-xtts container
  • Script handles everything automatically
  • Returns path to OGG file ready to send
Simple. Just use the script.


Clone any voice. Speak any language. Just use the script.

Installation

Terminal bash

openclaw install clonev
    
Copied!

💻Code Examples

$(scripts/clonev.sh "Your text here" /path/to/voice_sample.wav language)

scriptsclonevsh-your-text-here-pathtovoicesamplewav-language.txt
That's it! Nothing else needed.

---

## Step-by-Step Usage (FOR AI MODELS)

### Step 1: Get the required inputs
- Text to speak (from user)
- Path to voice sample WAV file (from user)
- Language code (from user or default to `en`)

### Step 2: Run the script

VOICE_FILE=$(scripts/clonev.sh "TEXT_HERE" "/path/to/sample.wav" LANGUAGE)

voicefilescriptsclonevsh-texthere-pathtosamplewav-language.txt
### Step 3: Use the output
The variable `$VOICE_FILE` now contains the path to the generated OGG file.

---

## Complete Working Examples

### Example 1: Clone voice and send to Telegram

fi

fi.txt
---

## Common Language Codes

| Code | Language | Example Usage |
|------|----------|---------------|
| `en` | English | `scripts/clonev.sh "Hello" sample.wav en` |
| `cs` | Czech | `scripts/clonev.sh "Ahoj" sample.wav cs` |
| `de` | German | `scripts/clonev.sh "Hallo" sample.wav de` |
| `fr` | French | `scripts/clonev.sh "Bonjour" sample.wav fr` |
| `es` | Spanish | `scripts/clonev.sh "Hola" sample.wav es` |

Full list: en, cs, de, fr, es, it, pl, pt, tr, ru, nl, ar, zh, ja, hu, ko

---

## Voice Sample Requirements

- **Format**: WAV file
- **Length**: 6-30 seconds (optimal: 10-15 seconds)
- **Quality**: Clear audio, no background noise
- **Content**: Any speech (the actual words don't matter)

**Good samples**:
- ✅ Recording of someone speaking clearly
- ✅ No music or noise in background
- ✅ Consistent volume

**Bad samples**:
- ❌ Music or songs
- ❌ Heavy background noise
- ❌ Very short (< 6 seconds)
- ❌ Very long (> 30 seconds)

---

## ⚠️ Important Notes

### Model Download
- First use downloads ~1.87GB model (one-time)
- Model is stored at: `/mnt/c/TEMP/Docker-containers/coqui-tts/models-xtts/`
- Status: ✅ Already downloaded

### Processing Time
- Takes 20-40 seconds depending on text length
- This is normal - voice cloning is computationally intensive

---

## Troubleshooting

### "Command not found"
Make sure you're in the skill directory or use full path:

/home/bernie/clawd/skills/clonev/scripts/clonev.sh "text" sample.wav en

homebernieclawdskillsclonevscriptsclonevsh-text-samplewav-en.txt
### "Voice sample not found"
- Check the path to the WAV file
- Use absolute paths (starting with `/`)
- Ensure file exists: `ls -la /path/to/sample.wav`

### "Model not found"
The model should auto-download. If not:

python3 -c "from TTS.api import TTS; TTS('tts_models/multilingual/multi-dataset/xtts_v2')"

-python3--c-from-ttsapi-import-tts-ttsttsmodelsmultilingualmulti-datasetxttsv2.txt
### Poor voice quality
- Use clearer voice sample
- Ensure no background noise
- Try different sample (some voices clone better)

---

## Quick Reference Card (FOR AI MODELS)

→ Send: message action=send channel=telegram asVoice=true filePath="$VOICE"

-send-message-actionsend-channeltelegram-asvoicetrue-filepathvoice.txt
---

## Output Location

Generated files are saved to:
example.sh
# Generate cloned voice
VOICE=$(/home/bernie/clawd/skills/clonev/scripts/clonev.sh "Hello, this is my cloned voice!" "/mnt/c/TEMP/Recording 25.wav" en)

# Send to Telegram (as voice message)
message action=send channel=telegram asVoice=true filePath="$VOICE"
example.sh
# Generate Czech voice
VOICE=$(/home/bernie/clawd/skills/clonev/scripts/clonev.sh "Ahoj, tohle je můj hlas" "/mnt/c/TEMP/Recording 25.wav" cs)

# Send
message action=send channel=telegram asVoice=true filePath="$VOICE"
example.sh
#!/bin/bash

# Generate voice
VOICE=$(/home/bernie/clawd/skills/clonev/scripts/clonev.sh "Task completed!" "/path/to/sample.wav" en)

# Verify file was created
if [ -f "$VOICE" ]; then
    echo "Success! Voice file: $VOICE"
    ls -lh "$VOICE"
else
    echo "Error: Voice file not created"
fi
example.sh
cd /mnt/c/TEMP/Docker-containers/coqui-tts
docker run --rm --entrypoint "" \
  -v $(pwd)/models-xtts:/root/.local/share/tts \
  ghcr.io/coqui-ai/tts:latest \
  python3 -c "from TTS.api import TTS; TTS('tts_models/multilingual/multi-dataset/xtts_v2')"

Tags

#speech_and-transcription

Quick Info

Category Content Creation
Model Claude 3.5
Complexity One-Click
Author instant-picture
Last Updated 3/10/2026
🚀
Optimized for
Claude 3.5
🧠

Ready to Install?

Get started with this skill in seconds

openclaw install clonev