Audiomind

The ultimate AI audio generation skill.

Rating: 4.7 (366 reviews)
Downloads: 1,463
Version: 1.0.0

Complete Documentation

AudioMind v3: The AI Podcast Studio

AudioMind turns a single sentence into a fully-produced podcast. It handles scripting, ElevenLabs voice narration, AI background music, and server-side audio mixing — all from one Manus command.

No setup required. The public shared backend works out of the box. Just install and start creating.

Quick Start

Install:

clawhub install audiomind

Use immediately (no configuration needed):

"Use AudioMind to create a 3-minute podcast about the future of AI agents."

That's it. AudioMind uses the public shared backend by default — 20 free generations per month, no API key required.

Configuration

Variable	Required	Description
AUDIOMIND_BACKEND_URL	Optional	Your own Vercel backend URL. Defaults to the public shared backend.
AUDIOMIND_API_KEY	Optional	Pro API key for unlimited generations. Get one at the landing page.

Free Tier (default): 20 generations/month tracked by IP. No configuration needed.

Pro Tier: Set AUDIOMIND_API_KEY with your Pro key for unlimited access.

Self-hosted: Deploy your own backend from github.com/wells1137/audiomind-backend and set AUDIOMIND_BACKEND_URL to your instance.
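As a rough illustration of the three modes above, the following Python sketch resolves the two environment variables into a configuration. It is not taken from the AudioMind source. The URL of the public shared backend is not stated here, so PLACEHOLDER_SHARED_BACKEND is a made-up stand-in, and the names AudioMindConfig, load_config, and describe_tier are hypothetical. Classifying any custom backend URL as "self-hosted" is also an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical placeholder: the real public backend URL is not given in this document.
PLACEHOLDER_SHARED_BACKEND = "https://shared-backend.example.invalid"


@dataclass(frozen=True)
class AudioMindConfig:
    backend_url: str
    api_key: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> AudioMindConfig:
    """Read the two optional variables, falling back to the shared backend."""
    backend_url = env.get("AUDIOMIND_BACKEND_URL", "").strip() or PLACEHOLDER_SHARED_BACKEND
    api_key = env.get("AUDIOMIND_API_KEY", "").strip() or None
    # Drop a trailing slash so endpoint paths can be appended uniformly.
    return AudioMindConfig(backend_url.rstrip("/"), api_key)


def describe_tier(cfg: AudioMindConfig) -> str:
    """Assumed classification: custom URL = self-hosted, else key = pro, else free."""
    if cfg.backend_url != PLACEHOLDER_SHARED_BACKEND:
        return "self-hosted"
    return "pro" if cfg.api_key else "free"
```

With neither variable set, describe_tier(load_config({})) returns "free", which matches the zero-configuration default described above.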

How It Works

When you ask Manus to create a podcast, the agent performs these steps automatically:

1. Write Script: The agent uses its built-in LLM to write a structured podcast script based on your topic and desired length.
2. Generate Narration: POST {BACKEND_URL}/api/workflow/generate_tts with the script. Returns MP3 audio narrated by an ElevenLabs voice.
3. Generate Music: POST {BACKEND_URL}/api/workflow/generate_music with a mood/style prompt. Returns a background music MP3.
4. Upload Audio: The agent uploads both MP3 files using manus-upload-file to obtain public URLs for the mixing step.
5. Mix Final Audio: POST {BACKEND_URL}/api/workflow/mix_audio with { narration_url, music_url }. The backend mixes them with proper levels using ffmpeg and returns the final podcast MP3.
6. Deliver: The agent saves and presents the finished podcast to you.
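The three backend calls in the steps above could be constructed roughly as in the Python sketch below. It only builds the requests and does not send them. The endpoint paths and the narration_url and music_url fields come from the description above. The "script" and "prompt" field names, the JSON content type, and all function names are assumptions, since the request and response formats are not documented here.

```python
import json
import urllib.request


def build_request(backend_url: str, step: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST for {BACKEND_URL}/api/workflow/<step> without sending it."""
    url = f"{backend_url.rstrip('/')}/api/workflow/{step}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumed content type
    )


def tts_request(backend_url: str, script: str) -> urllib.request.Request:
    # "script" is an assumed field name for the narration text.
    return build_request(backend_url, "generate_tts", {"script": script})


def music_request(backend_url: str, mood_prompt: str) -> urllib.request.Request:
    # "prompt" is an assumed field name for the mood/style description.
    return build_request(backend_url, "generate_music", {"prompt": mood_prompt})


def mix_request(backend_url: str, narration_url: str, music_url: str) -> urllib.request.Request:
    # Field names narration_url and music_url are the ones listed in the mixing step.
    return build_request(
        backend_url,
        "mix_audio",
        {"narration_url": narration_url, "music_url": music_url},
    )
```

Each request could then be passed to urllib.request.urlopen. How the MP3 comes back (raw bytes, a URL, or a JSON wrapper) is not specified here, so response handling is left out of the sketch.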

Example Prompts

"Create a 5-minute podcast about the history of jazz with a smooth jazz background."
"Make a daily news briefing about AI developments, formal tone, upbeat intro music."
"Generate a meditation podcast, 10 minutes, calm narration, ambient soundscape."
"Produce a tech explainer on quantum computing for a general audience."

Security

All API keys (ElevenLabs) are stored server-side. The skill file contains zero credentials. This architecture passes VirusTotal and ClawHub security scans. See the GitHub repo for the full backend source code.

Changelog

v3.3.0 — Removed local tools/start_server.sh entirely (not needed in v3 architecture). Declared FAL_KEY as optional env. Resolves all OpenClaw metadata inconsistency warnings.

v3.1.0 — Zero-config install. Public shared backend is now the default. No AUDIOMIND_BACKEND_URL setup required for free tier users.

v3.0.1 — Added openclaw.requires metadata to declare env vars and trusted network endpoints. Resolves OpenClaw security scanner warning.

v3.0.0 — Full architecture rewrite. All commercial logic moved to Vercel backend. ElevenLabs API keys are now server-side only. Passes VirusTotal security scan.

Installation

openclaw install audiomind

Configuration Options

Option	Type	Required	Description
AUDIOMIND_BACKEND_URL	string	Optional	Your own Vercel backend URL. Defaults to the public shared backend.
AUDIOMIND_API_KEY	string	Optional	Pro API key for unlimited generations. Get one at the landing page.