Audiomind
The ultimate AI audio generation skill.
- Rating
- 4.7 (366 reviews)
- Downloads
- 1,463 downloads
- Version
- 1.0.0
Overview
The ultimate AI audio generation skill.
Complete Documentation
View Source →
AudioMind v3: The AI Podcast Studio
AudioMind turns a single sentence into a fully-produced podcast. It handles scripting, ElevenLabs voice narration, AI background music, and server-side audio mixing — all from one Manus command.
No setup required. The public shared backend works out of the box. Just install and start creating.
Quick Start
Install:
clawhub install audiomind
Use immediately (no configuration needed):
"Use AudioMind to create a 3-minute podcast about the future of AI agents."
That's it. AudioMind uses the public shared backend by default — 20 free generations per month, no API key required.
Configuration
| Variable | Required | Description |
|---|---|---|
| AUDIOMIND_BACKEND_URL | Optional | Your own Vercel backend URL. Defaults to the public shared backend. |
| AUDIOMIND_API_KEY | Optional | Pro API key for unlimited generations. Get one at the landing page. |
Pro Tier: Set AUDIOMIND_API_KEY with your Pro key for unlimited access.
Self-hosted: Deploy your own backend from github.com/wells1137/audiomind-backend and set AUDIOMIND_BACKEND_URL to your instance.
How It Works
When you ask Manus to create a podcast, the agent performs these steps automatically:
- Write Script — The agent uses its built-in LLM to write a structured podcast script based on your topic and desired length.
- Generate Narration —
POST {BACKEND_URL}/api/workflow/generate_ttswith the script. Returns MP3 audio narrated by an ElevenLabs voice. - Generate Music —
POST {BACKEND_URL}/api/workflow/generate_musicwith a mood/style prompt. Returns a background music MP3. - Upload Audio — The agent uploads both MP3 files using
manus-upload-fileto obtain public URLs for the mixing step. - Mix Final Audio —
POST {BACKEND_URL}/api/workflow/mix_audiowith{ narration_url, music_url }. The backend mixes them with proper levels using ffmpeg and returns the final podcast MP3. - Deliver — The agent saves and presents the finished podcast to you.
Example Prompts
- "Create a 5-minute podcast about the history of jazz with a smooth jazz background."
- "Make a daily news briefing about AI developments, formal tone, upbeat intro music."
- "Generate a meditation podcast, 10 minutes, calm narration, ambient soundscape."
- "Produce a tech explainer on quantum computing for a general audience."
Security
All API keys (ElevenLabs) are stored server-side. The skill file contains zero credentials. This architecture passes VirusTotal and ClawHub security scans. See the GitHub repo for the full backend source code.
Changelog
v3.3.0 — Removed local tools/start_server.sh entirely (not needed in v3 architecture). Declared FAL_KEY as optional env. Resolves all OpenClaw metadata inconsistency warnings.
v3.1.0 — Zero-config install. Public shared backend is now the default. No AUDIOMIND_BACKEND_URL setup required for free tier users.
v3.0.1 — Added openclaw.requires metadata to declare env vars and trusted network endpoints. Resolves OpenClaw security scanner warning.
v3.0.0 — Full architecture rewrite. All commercial logic moved to Vercel backend. ElevenLabs API keys are now server-side only. Passes VirusTotal security scan.
Installation
openclaw install audiomind
⚙️Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
AUDIOMIND_BACKEND_URL | string | Optional | Your own Vercel backend URL. Defaults to the public shared backend. |
AUDIOMIND_API_KEY | string | Optional | Pro API key for unlimited generations. Get one at the landing page. |
Tags
Quick Info
Ready to Install?
Get started with this skill in seconds
Related Skills
4claw
4claw — a moderated imageboard for AI agents.
Aap Passport
Agent Attestation Protocol - The Reverse Turing Test.
Adaptive Suite
A continuously adaptive skill suite that empowers Clawdbot.
Adversarial Prompting
Adversarial analysis to critique, fix.