✓ Verified 💻 Development ✓ Enhanced Data

Livekit

Build voice AI agents with LiveKit.

Rating: 4.9 (125 reviews)
Downloads: 4,359 downloads
Version: 1.0.0

Overview

Build voice AI agents with LiveKit.

Complete Documentation

View Source →

LiveKit Voice AI Skill

Build production voice agents with LiveKit's open-source platform.

Quick Start

bash

# Install SDK
pip install livekit-agents livekit-plugins-openai livekit-plugins-deepgram livekit-plugins-cartesia

# Or Node.js
npm install @livekit/agents @livekit/agents-plugin-openai

Minimal Agent (Python)

python

from livekit.agents import AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import deepgram, openai, cartesia

async def entrypoint(ctx: JobContext):
    await ctx.connect()
    
    session = AgentSession(
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4.1-mini"),
        tts=cartesia.TTS(),
    )
    
    session.start(ctx.room)
    await session.say("Hello! How can I help?")

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))

Provider Selection

Component	Budget	Quality	Low Latency
STT	Deepgram Nova-3	AssemblyAI	Deepgram Keychain
LLM	GPT-4.1 mini	Claude Sonnet	GPT-4.1 mini
TTS	Cartesia Sonic-3	ElevenLabs	Cartesia Sonic-3

Voice Pipeline vs Realtime

STT-LLM-TTS Pipeline:

More control, mix providers
Generally cheaper
Easier to debug

OpenAI Realtime API:

Speech-to-speech, more expressive
Higher cost (~$0.10/min)
Less control

Environment Variables

bash

LIVEKIT_URL=wss://your-app.livekit.cloud
LIVEKIT_API_KEY=your-api-key
LIVEKIT_API_SECRET=your-api-secret

# Provider keys (if not using LiveKit Inference)
OPENAI_API_KEY=
DEEPGRAM_API_KEY=
CARTESIA_API_KEY=
ELEVENLABS_API_KEY=

Tool Use

python

from livekit.agents import function_tool

@function_tool()
async def get_weather(location: str) -> str:
    """Get current weather for a location."""
    # Your implementation
    return f"Weather in {location}: 72°F, sunny"

session = AgentSession(
    stt=deepgram.STT(),
    llm=openai.LLM(),
    tts=cartesia.TTS(),
    tools=[get_weather],
)

Telephony (SIP)

python

from livekit import api

# Outbound call
await lk_api.sip.create_sip_participant(
    api.CreateSIPParticipantRequest(
        sip_trunk_id="trunk-id",
        sip_call_to="+15551234567",
        room_name="my-room",
    )
)

Deployment

LiveKit Cloud: livekit-server-cli deploy --project my-project

Self-hosted:

bash

docker run -d \
  -p 7880:7880 -p 7881:7881 -p 7882:7882/udp \
  -e LIVEKIT_KEYS="api-key: api-secret" \
  livekit/livekit-server

Cost Estimates

Scenario	Monthly Cost
Dev/testing	Free tier
100 hrs/mo voice	~$150-250
Production B2B	~$300-500
High volume	Self-host

Common Patterns

Turn Detection

python

session = AgentSession(
    turn_detection=openai.TurnDetection(
        threshold=0.5,
        silence_duration_ms=500,
    ),
    ...
)

Interruption Handling

python

@session.on("user_speech_started")
async def handle_interruption():
    session.stop_speaking()

Multi-Agent Handoff

python

await session.transfer_to(specialist_agent)

References

Docs: https://docs.livekit.io/agents/
Examples: https://github.com/livekit/agents/tree/main/examples
Playground: https://agents-playground.livekit.io

Installation

Terminal bash


openclaw install livekit

Copied!

💻Code Examples

cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))

-clirunappworkeroptionsentrypointfncentrypoint.txt

## Provider Selection

| Component | Budget | Quality | Low Latency |
|-----------|--------|---------|-------------|
| **STT** | Deepgram Nova-3 | AssemblyAI | Deepgram Keychain |
| **LLM** | GPT-4.1 mini | Claude Sonnet | GPT-4.1 mini |
| **TTS** | Cartesia Sonic-3 | ElevenLabs | Cartesia Sonic-3 |

## Voice Pipeline vs Realtime

**STT-LLM-TTS Pipeline:**
- More control, mix providers
- Generally cheaper
- Easier to debug

**OpenAI Realtime API:**
- Speech-to-speech, more expressive
- Higher cost (~$0.10/min)
- Less control

## Environment Variables

)

.txt

## Deployment

**LiveKit Cloud:** `livekit-server-cli deploy --project my-project`

**Self-hosted:**

livekit/livekit-server

-livekitlivekit-server.txt

## Cost Estimates

| Scenario | Monthly Cost |
|----------|--------------|
| Dev/testing | Free tier |
| 100 hrs/mo voice | ~$150-250 |
| Production B2B | ~$300-500 |
| High volume | Self-host |

## Common Patterns

### Turn Detection

example.sh

# Install SDK
pip install livekit-agents livekit-plugins-openai livekit-plugins-deepgram livekit-plugins-cartesia

# Or Node.js
npm install @livekit/agents @livekit/agents-plugin-openai

example.py

from livekit.agents import AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import deepgram, openai, cartesia

async def entrypoint(ctx: JobContext):
    await ctx.connect()
    
    session = AgentSession(
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4.1-mini"),
        tts=cartesia.TTS(),
    )
    
    session.start(ctx.room)
    await session.say("Hello! How can I help?")

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))

example.sh

LIVEKIT_URL=wss://your-app.livekit.cloud
LIVEKIT_API_KEY=your-api-key
LIVEKIT_API_SECRET=your-api-secret

# Provider keys (if not using LiveKit Inference)
OPENAI_API_KEY=
DEEPGRAM_API_KEY=
CARTESIA_API_KEY=
ELEVENLABS_API_KEY=

example.py

from livekit.agents import function_tool

@function_tool()
async def get_weather(location: str) -> str:
    """Get current weather for a location."""
    # Your implementation
    return f"Weather in {location}: 72°F, sunny"

session = AgentSession(
    stt=deepgram.STT(),
    llm=openai.LLM(),
    tts=cartesia.TTS(),
    tools=[get_weather],
)

example.py

from livekit import api

# Outbound call
await lk_api.sip.create_sip_participant(
    api.CreateSIPParticipantRequest(
        sip_trunk_id="trunk-id",
        sip_call_to="+15551234567",
        room_name="my-room",
    )
)

example.sh

docker run -d \
  -p 7880:7880 -p 7881:7881 -p 7882:7882/udp \
  -e LIVEKIT_KEYS="api-key: api-secret" \
  livekit/livekit-server

example.py

session = AgentSession(
    turn_detection=openai.TurnDetection(
        threshold=0.5,
        silence_duration_ms=500,
    ),
    ...
)

Related Skills

✓ Verified 💻 Development

4claw

4claw — a moderated imageboard for AI agents.

🧠 Claude-Ready #ai_and-llms

✓ Verified 💻 Development

Aap Passport

Agent Attestation Protocol - The Reverse Turing Test.

🧠 Claude-Ready #ai_and-llms

✓ Verified 💻 Development

Acestep Lyrics Transcription

Transcribe audio to timestamped lyrics using OpenAI Whisper or ElevenLabs Scribe API.

⚡ GPT-Optimized #ai_and-llms #api #script

✓ Verified 💻 Development

Adaptive Suite

A continuously adaptive skill suite that empowers Clawdbot.

🧠 Claude-Ready #ai_and-llms #bot

Livekit

Overview

Complete Documentation

LiveKit Voice AI Skill

Quick Start

Minimal Agent (Python)

Provider Selection

Voice Pipeline vs Realtime

Environment Variables

Tool Use

Telephony (SIP)

Deployment

Cost Estimates

Common Patterns

Turn Detection

Interruption Handling

Multi-Agent Handoff

References

Installation

💻Code Examples

cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))

)

livekit/livekit-server

Tags

Quick Info

Ready to Install?

Resources

Related Skills

4claw

Aap Passport

Acestep Lyrics Transcription

Adaptive Suite