
Venice Router

Supreme model router for Venice.ai — the privacy-first, uncensored AI platform.

Rating: 3.9 (137 reviews)
Downloads: 527
Version: 1.0.0


Venice.ai Supreme Router

Smart, cost-optimized model routing for Venice.ai — the AI platform for people who don't want Big Tech watching over their shoulder.

Unlike OpenAI, Anthropic, and Google — where every prompt is logged, analyzed, and potentially used to train future models — Venice offers true privacy with zero data retention on private models. Your conversations stay yours. Venice is also uncensored: no content filters, no refusals, no "I can't help with that."

Setup

bash
export VENICE_API_KEY="your-key-here"

Or configure in ~/.openclaw/openclaw.json:

json
{
  "skills": {
    "entries": {
      "venice-router": {
        "enabled": true,
        "apiKey": "YOUR_VENICE_API_KEY"
      }
    }
  }
}

Usage

Route a prompt (auto-selects model)

bash
python3 {baseDir}/scripts/venice-router.py --prompt "What is 2+2?"

Force a specific tier

bash
python3 {baseDir}/scripts/venice-router.py --tier cheap --prompt "Tell me a joke"
python3 {baseDir}/scripts/venice-router.py --tier budget-medium --prompt "Write a Python function"
python3 {baseDir}/scripts/venice-router.py --tier mid --prompt "Explain quantum computing"
python3 {baseDir}/scripts/venice-router.py --tier premium --prompt "Write a distributed systems architecture"

Stream output

bash
python3 {baseDir}/scripts/venice-router.py --stream --prompt "Write a poem about lobsters"

Web search (LLM searches the web and cites sources)

bash
python3 {baseDir}/scripts/venice-router.py --web-search --prompt "Latest news on AI regulation"

Uncensored mode (prefer models with no content filters)

bash
python3 {baseDir}/scripts/venice-router.py --uncensored --prompt "Write edgy creative fiction"

Private-only mode (zero data retention, no Big Tech proxying)

bash
python3 {baseDir}/scripts/venice-router.py --private-only --prompt "Analyze this confidential contract"

Conversation-aware routing (multi-turn context)

bash
# Save conversation history as JSON, then route follow-ups with context
python3 {baseDir}/scripts/venice-router.py --conversation history.json --prompt "Can you add tests too?"

The router analyzes conversation history to keep context: trivial follow-ups ("thanks") go cheap, while follow-ups in complex code discussions stay at the right tier.
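The exact schema `--conversation` expects is not documented here; a reasonable assumption is the standard OpenAI chat-messages format. A minimal sketch of building a `history.json` under that assumption (the message contents are illustrative):

```python
import json

# Hypothetical history.json in the standard OpenAI chat-messages format,
# which --conversation is assumed to accept.
history = [
    {"role": "user", "content": "Write a Python function that parses CSV rows."},
    {"role": "assistant", "content": "Here is parse_rows(text): ..."},
]

with open("history.json", "w") as f:
    json.dump(history, f, indent=2)
```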

Function calling (tool use)

bash
# Define tools in a JSON file (OpenAI tools format)
python3 {baseDir}/scripts/venice-router.py --tools tools.json --prompt "What's the weather in NYC?"
python3 {baseDir}/scripts/venice-router.py --tools tools.json --tool-choice auto --prompt "Search for latest AI news"

Tool definitions use the standard OpenAI format. The router auto-bumps to mid tier minimum for function calling since it requires capable models.
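A minimal sketch of a `tools.json` in the standard OpenAI tools format. The `get_weather` function name and its parameters are illustrative placeholders, not part of the router:

```python
import json

# One hypothetical function tool in the OpenAI tools format.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

with open("tools.json", "w") as f:
    json.dump(tools, f, indent=2)
```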

Cost budget tracking

bash
# Show current spending
python3 {baseDir}/scripts/venice-router.py --budget-status

# Track per-session costs
python3 {baseDir}/scripts/venice-router.py --session-id my-project --prompt "help me code"

Set VENICE_DAILY_BUDGET and/or VENICE_SESSION_BUDGET to enforce spending limits. The router auto-downgrades tiers as you approach budget limits.
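The progressive downgrade can be sketched as a cap on the requested tier, using the thresholds listed under Routing Strategy (40% of budget caps at high, 60% at mid, 80% at budget, 95% at cheap). This is an illustration of the behavior, not the router's actual code:

```python
TIERS = ["cheap", "budget", "budget-medium", "mid", "high", "premium"]

def cap_tier(requested: str, spent: float, budget: float) -> str:
    """Downgrade the requested tier as spending approaches the budget."""
    if budget <= 0:          # 0 means unlimited
        return requested
    used = spent / budget
    if used >= 0.95:
        cap = "cheap"
    elif used >= 0.80:
        cap = "budget"
    elif used >= 0.60:
        cap = "mid"
    elif used >= 0.40:
        cap = "high"
    else:
        cap = "premium"
    # Never upgrade; only cap the requested tier.
    return cap if TIERS.index(requested) > TIERS.index(cap) else requested
```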

Classify only (no API call)

bash
python3 {baseDir}/scripts/venice-router.py --classify "Explain the Riemann hypothesis"

List available models and tiers

bash
python3 {baseDir}/scripts/venice-router.py --list-models

Override model directly

bash
python3 {baseDir}/scripts/venice-router.py --model deepseek-v3.2 --prompt "Hello"

Tiers

| Tier | Models | Cost (input/output per 1M tokens) | Best For |
|------|--------|-----------------------------------|----------|
| cheap | Venice Small (qwen3-4b), GLM 4.7 Flash, GPT OSS 120B, Llama 3.2 3B | $0.05–$0.15 / $0.15–$0.60 | Simple Q&A, greetings, math, lookups |
| budget | Qwen 3 235B, Venice Uncensored, GLM 4.7 Flash Heretic | $0.14–$0.20 / $0.75–$0.90 | Moderate questions, summaries, translations |
| budget-medium | Grok Code Fast, DeepSeek V3.2, MiniMax M2.1 | $0.25–$0.40 / $1.00–$1.87 | Moderate-to-complex tasks, code snippets, structured output |
| mid | DeepSeek V3.2, MiniMax M2.1/M2.5, Qwen3 Thinking 235B, Venice Medium, Llama 3.3 70B | $0.25–$0.70 / $1.00–$3.50 | Code generation, analysis, longer writing, reasoning |
| high | GLM 5, Kimi K2 Thinking, Kimi K2.5, Grok 4.1 Fast, Hermes 3 405B, Gemini 3 Flash | $0.50–$1.10 / $1.25–$3.75 | Complex reasoning, multi-step tasks, code review |
| premium | GPT-5.2, GPT-5.2 Codex, Gemini 3 Pro, Gemini 3.1 Pro (1M ctx), Claude Opus/Sonnet 4.5/4.6 | $2.19–$6.00 / $15.00–$30.00 | Expert-level analysis, architecture, research papers |

Routing Strategy

The router classifies each prompt using keyword + heuristic analysis:

  • Length — longer prompts suggest more complex tasks
  • Keywords — domain-specific terms (e.g., "architecture", "optimize", "prove") signal complexity
  • Code markers — presence of code blocks, function names, or technical syntax
  • Instruction depth — multi-step instructions, comparisons, or "explain in detail" bump the tier
  • Conversational simplicity — greetings, yes/no, small talk stay on the cheapest tier
  • Conversation history — when --conversation is provided, analyzes full chat context: code in history boosts tier, trivial follow-ups ("thanks") downgrade, tool calls in history signal complexity
  • Function calling — --tools auto-bumps to at least mid tier (capable models required)
  • Thinking/reasoning mode — --thinking prefers chain-of-thought reasoning models (Qwen3 Thinking, Kimi K2) and bumps to at least mid tier
  • Budget constraints — progressive tier downgrade as spending approaches daily/session limits (95% → cheap, 80% → budget, 60% → mid, 40% → high)
The classifier errs on the side of cheaper models — it only escalates when there's strong signal for complexity.
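The heuristics above can be sketched as a toy classifier. The keyword lists, length thresholds, and score-to-tier mapping below are assumptions for demonstration; the router's actual values are not published here:

```python
# Toy illustration of keyword + heuristic prompt classification.
# All keyword sets and thresholds are assumed, not the router's real ones.
COMPLEX_KEYWORDS = {"architecture", "optimize", "prove", "distributed", "analyze"}
GREETINGS = {"hi", "hello", "thanks", "thank you"}

def classify(prompt: str) -> str:
    text = prompt.lower().strip()
    # Conversational simplicity: greetings and very short prompts stay cheap.
    if text in GREETINGS or len(text) < 15:
        return "cheap"
    # Keywords: domain-specific terms signal complexity.
    score = sum(1 for kw in COMPLEX_KEYWORDS if kw in text)
    # Code markers: code blocks or function syntax bump the score.
    if "```" in prompt or "def " in prompt:
        score += 1
    # Length: long prompts suggest more complex tasks.
    if len(text) > 400:
        score += 1
    if score >= 3:
        return "premium"
    if score == 2:
        return "high"
    if score == 1:
        return "mid"
    return "budget"
```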

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| VENICE_API_KEY | Venice.ai API key (required) | — |
| VENICE_DEFAULT_TIER | Minimum floor tier — auto-classification never goes below this. Valid: cheap, budget, budget-medium, mid, high, premium | budget |
| VENICE_MAX_TIER | Maximum tier to ever use (cost cap) | premium |
| VENICE_TEMPERATURE | Default temperature | 0.7 |
| VENICE_MAX_TOKENS | Default max tokens | 4096 |
| VENICE_STREAM | Enable streaming by default | false |
| VENICE_UNCENSORED | Always prefer uncensored models | false |
| VENICE_PRIVATE_ONLY | Only use private models (zero data retention) | false |
| VENICE_WEB_SEARCH | Enable web search by default ($10/1K calls) | false |
| VENICE_THINKING | Always prefer thinking/reasoning models | false |
| VENICE_DAILY_BUDGET | Max daily spend in USD (0 = unlimited) | 0 |
| VENICE_SESSION_BUDGET | Max per-session spend in USD (0 = unlimited) | 0 |
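The floor/ceiling pair can be sketched as a simple clamp on the classified tier. How the router actually reads these variables is assumed here; only the documented defaults (budget floor, premium ceiling) are taken from the table:

```python
import os

# Sketch: VENICE_DEFAULT_TIER acts as a floor and VENICE_MAX_TIER as a
# ceiling on whatever tier the classifier picked (assumed behavior).
TIERS = ["cheap", "budget", "budget-medium", "mid", "high", "premium"]

def clamp_tier(classified: str) -> str:
    floor = os.environ.get("VENICE_DEFAULT_TIER", "budget")
    ceiling = os.environ.get("VENICE_MAX_TIER", "premium")
    idx = max(TIERS.index(classified), TIERS.index(floor))
    idx = min(idx, TIERS.index(ceiling))
    return TIERS[idx]
```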

Why Venice.ai?

  • 🔒 Private inference — Models marked "Private" have zero data retention. Your data never trains anyone's model.
  • 🔓 Uncensored — No guardrails blocking legitimate use cases. No refusals, no filters.
  • 🔌 OpenAI-compatible — Same API format, just change the base URL. Drop-in replacement.
  • 📦 30+ models — From tiny efficient models ($0.05/M) to Claude Opus 4.6 and GPT-5.2.
  • 🌐 Built-in web search — LLMs can search the web and cite sources in a single API call.
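Because the API is OpenAI-compatible, an existing OpenAI-style request body works unchanged; only the base URL and key differ. A sketch of building (not sending) such a request — the base URL below is assumed from Venice's documentation, so verify it before use:

```python
import json
import os

BASE_URL = "https://api.venice.ai/api/v1"   # assumed Venice endpoint

def build_chat_request(model: str, prompt: str) -> dict:
    """Assemble an OpenAI-format chat completion request for Venice."""
    return {
        "url": f"{BASE_URL}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {os.environ.get('VENICE_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }
```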

Tips

  • Use --classify to preview which tier a prompt would hit before spending tokens
  • Set VENICE_MAX_TIER=mid to cap costs and never hit premium models
  • Use --uncensored for creative, security research, or other content mainstream AI won't touch
  • Use --private-only when processing sensitive/confidential data — zero retention guaranteed
  • Use --web-search when you need up-to-date information with cited sources
  • Use --conversation with a JSON message history for smarter multi-turn routing
  • Use --tools to enable function calling — the router auto-bumps to capable models
  • Set VENICE_DAILY_BUDGET=1.00 to cap daily spend at $1 — the router auto-downgrades tiers as you approach the limit
  • Use --budget-status to see a detailed breakdown of your spending by tier
  • Use --thinking for math proofs, logic puzzles, and multi-step reasoning — routes to Qwen3 Thinking or Kimi K2 models
  • The router prefers private (self-hosted) Venice models over anonymized ones when available at the same tier
  • When --uncensored is active, the router auto-bumps to the nearest tier with uncensored models
  • Combine with OpenClaw WebChat for a seamless chat experience routed through Venice.ai

Installation

bash

openclaw install venice-router
    


Tags

#web-and-frontend-development

Quick Info

Category: Development
Model: Claude 3.5
Complexity: One-Click
Author: plusone
Last Updated: 3/10/2026
