
Ugc Manual


Rating: 4.9 (103 reviews)
Downloads: 31,731
Version: 1.0.0

Overview

Generate a lip-sync video from an image and the user's own audio recording. Use this skill when the user provides their own audio file.

Complete Documentation


UGC-Manual

Generate lip-sync videos by combining an image with a custom audio file using ComfyDeploy's UGC-MANUAL workflow.

Overview

UGC-Manual takes:

  • An image (person/character with visible face)
  • An audio file (user's voice recording)
And produces a video where the person in the image lip-syncs to the audio.

API Details

Endpoint: https://api.comfydeploy.com/api/run/deployment/queue
Deployment ID: 075ce7d3-81a6-4e3e-ab0e-7a25edf601b5

Required Inputs

| Input | Description | Formats |
|-------|-------------|---------|
| image | Image with a visible face | JPG, PNG |
| input_audio | Audio file to lip-sync | MP3, WAV, OGG |
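
To illustrate how the endpoint, deployment ID, and input names fit together, here is a minimal sketch that builds (but does not send) a queue request. The JSON body shape (`deployment_id` plus `inputs`), the Bearer-token header, and the helper name `build_queue_request` are assumptions, not taken from this skill's source; the inputs are also assumed to be passed as URLs. Check the ComfyDeploy API documentation before relying on any of this.

```python
import json
import urllib.request

ENDPOINT = "https://api.comfydeploy.com/api/run/deployment/queue"
DEPLOYMENT_ID = "075ce7d3-81a6-4e3e-ab0e-7a25edf601b5"


def build_queue_request(image_url, audio_url, api_key):
    """Build a POST request that would queue one UGC-Manual run.

    Hypothetical helper: the body layout and auth scheme are assumptions.
    """
    payload = {
        "deployment_id": DEPLOYMENT_ID,
        # Input names as listed under Required Inputs.
        "inputs": {"image": image_url, "input_audio": audio_url},
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request object can be inspected before anything goes over the network; passing it to `urllib.request.urlopen` would actually submit it.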

Usage

bash
uv run ~/.clawdbot/skills/ugc-manual/scripts/generate.py \
  --image "path/to/image.jpg" \
  --audio "path/to/audio.mp3" \
  --output "output-video.mp4"

With URLs:

bash
uv run ~/.clawdbot/skills/ugc-manual/scripts/generate.py \
  --image "https://example.com/image.jpg" \
  --audio "https://example.com/audio.mp3" \
  --output "result.mp4"
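
When calling the script from another program rather than a shell, the same invocation can be assembled as an argument list. The sketch below uses only the flags shown above; `is_url` and `build_generate_command` are hypothetical helper names, and the URL check is one plausible way to tell remote inputs from local paths, not necessarily how `generate.py` does it.

```python
import os
from urllib.parse import urlparse

# Script location as given in the usage examples.
SCRIPT = "~/.clawdbot/skills/ugc-manual/scripts/generate.py"


def is_url(source):
    """Return True for http(s) URLs, False for anything else (e.g. local paths)."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def build_generate_command(image, audio, output):
    """Return the argv list for one run of the documented CLI."""
    return [
        "uv", "run",
        os.path.expanduser(SCRIPT),  # no shell is involved, so expand "~" here
        "--image", image,
        "--audio", audio,
        "--output", output,
    ]
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`. Expanding `~` explicitly matters because `subprocess` does not perform shell tilde expansion on list arguments.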

Workflow Integration

Typical Use Cases

  • Custom voice recordings - User records their own audio via Telegram/WhatsApp
  • Pre-generated TTS - Audio generated externally (ElevenLabs, etc.)
  • Music/sound sync - Sync mouth movements to any audio

Example Pipeline

bash
# 1. Convert Telegram voice message to MP3 (if needed)
ffmpeg -i voice.ogg -acodec libmp3lame -q:a 2 voice.mp3

# 2. Generate lip-sync video
uv run ugc-manual... --image face.jpg --audio voice.mp3 --output video.mp4
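
The "if needed" part of step 1 can be made explicit in code. This is a minimal sketch that reuses the ffmpeg options from the pipeline above; the helper name `mp3_conversion_command` and the choice to write the MP3 next to the source file are assumptions made for illustration.

```python
from pathlib import Path


def mp3_conversion_command(source):
    """Return an ffmpeg argv that converts `source` to MP3, or None if it already is one."""
    src = Path(source)
    if src.suffix.lower() == ".mp3":
        return None  # already MP3, nothing to convert
    target = src.with_suffix(".mp3")
    return [
        "ffmpeg", "-i", str(src),
        "-acodec", "libmp3lame",  # MP3 encoder, as in the pipeline above
        "-q:a", "2",              # VBR quality level, as in the pipeline above
        str(target),
    ]
```

A caller would run the returned list with `subprocess.run(cmd, check=True)` only when it is not `None`, then pass the MP3 path to the generation step.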

Difference from VEED-UGC

| Feature | UGC-Manual | VEED-UGC |
|---------|------------|----------|
| Audio source | User provides | Generated from brief |
| Script | N/A | Auto-generated |
| Voice | User's recording | ElevenLabs TTS |
| Use case | Custom audio | Automated content |

Notes

  • Image should have a clearly visible face (frontal or 3/4 view)
  • Audio quality affects output quality
  • Processing time: ~2-5 minutes depending on audio length
  • Audio auto-conversion: The script automatically converts any audio format (MP3, OGG, M4A, etc.) to WAV PCM 16-bit mono 48kHz before sending to FabricLipsync
  • Requires ffmpeg installed on the system
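
For reference, the target format in the auto-conversion note (WAV, PCM 16-bit, mono, 48 kHz) corresponds to a small set of standard ffmpeg options. The sketch below shows one way to express that conversion and the ffmpeg prerequisite check; it is not the script's actual code, and `require_ffmpeg` and `wav_conversion_command` are hypothetical names.

```python
import shutil


def require_ffmpeg():
    """Return the path to ffmpeg, or raise if it is not on PATH."""
    path = shutil.which("ffmpeg")
    if path is None:
        raise RuntimeError("ffmpeg not found on PATH; install it first")
    return path


def wav_conversion_command(source, target):
    """Return an ffmpeg argv that writes `source` as 16-bit mono 48 kHz PCM WAV."""
    return [
        "ffmpeg", "-y",        # overwrite the target if it exists
        "-i", source,
        "-ac", "1",            # one audio channel (mono)
        "-ar", "48000",        # 48 kHz sample rate
        "-c:a", "pcm_s16le",   # signed 16-bit little-endian PCM
        target,
    ]
```

Calling `require_ffmpeg()` before running the command gives a clear error up front instead of a failed subprocess later.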

Installation

Terminal bash

openclaw install ugc-manual
    


Tags

#coding_agents-and-ides

Quick Info

Category Development
Model Claude 3.5
Complexity One-Click
Author pauldelavallaz
Last Updated 3/10/2026