Dify Kb Search
Search Dify Knowledge Base (Dataset) to get accurate context for RAG-enhanced answers.
- Rating
- 5 (76 reviews)
- Downloads
- 7,706 downloads
- Version
- 1.0.0
Overview
Search Dify Knowledge Base (Dataset) to get accurate context for RAG-enhanced answers.
Complete Documentation
View Source →
Dify Knowledge Base Search Skill
🔍 Search your Dify Knowledge Base to get accurate, contextual answers
This skill enables AI agents to query Dify datasets for RAG (Retrieval-Augmented Generation) context retrieval. Perfect for knowledge base Q&A, documentation search, and contextual AI responses.
✨ Features
- List Knowledge Bases - Discover all available Dify datasets
- Smart Search - Query datasets with hybrid, semantic, or keyword search
- Auto-Discovery - Automatically find available datasets if ID not provided
- Configurable Results - Adjust top-k, search method, and reranking
- Error Handling - Graceful error messages for debugging
- Zero Hardcoding - All configuration via environment variables
🚀 Quick Start
1. Configure Environment Variables
Set up in openclaw.json:
{
"env": {
"vars": {
"DIFY_API_KEY": "${DIFY_API_KEY}",
"DIFY_BASE_URL": "https://dify.example.com/v1"
}
}
}
Environment Variables:
| Variable | Required | Default | Description |
|---|---|---|---|
| DIFY_API_KEY | ✅ Yes | - | Your Dify API Key (from Settings → API) |
| DIFY_BASE_URL | ❌ No | http://localhost/v1 | Your Dify instance base URL |
2. Install Dependencies
pip3 install requests
🛠️ Tools
dify_list
Lists all available knowledge bases (datasets) in your Dify instance.
Invocation: dify_list tool
Example Response:
{
"status": "success",
"count": 2,
"datasets": [
{
"id": "dataset-abc123",
"name": "Product Documentation",
"doc_count": 42,
"description": "All product guides and tutorials"
},
{
"id": "dataset-xyz789",
"name": "API Reference",
"doc_count": 156,
"description": "REST API documentation"
}
]
}
Usage:
{}
dify_search
Searches a Dify Dataset for relevant context chunks.
Invocation: dify_search tool (mapped to python3 scripts/search.py)
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | ✅ Yes | - | Search query or question |
| dataset_id | string | ❌ No | Auto-discover | Specific dataset ID to search |
| top_k | integer | ❌ No | 3 | Number of results to return |
| search_method | string | ❌ No | hybrid_search | Search strategy |
| reranking_enable | boolean | ❌ No | false | Enable reranking for better results |
hybrid_search- Combine semantic + keyword search (recommended)semantic_search- Meaning-based similarity searchkeyword_search- Exact keyword matching
{
"query": "How do I configure OpenClaw?",
"top_k": 5
}
{
"query": "API authentication methods",
"dataset_id": "dataset-xyz789",
"search_method": "semantic_search",
"reranking_enable": true
}
Example Response:
{
"status": "success",
"query": "How do I configure OpenClaw?",
"dataset_id": "dataset-abc123",
"count": 3,
"results": [
{
"content": "To configure OpenClaw, edit the openclaw.json file...",
"score": 0.8923,
"title": "Installation Guide",
"document_id": "doc-001"
},
{
"content": "OpenClaw supports environment variables via...",
"score": 0.8451,
"title": "Configuration Options",
"document_id": "doc-002"
}
]
}
📋 Complete Workflow Example
[
{
"tool": "dify_list",
"parameters": {}
},
{
"tool": "dify_search",
"parameters": {
"query": "What are the system requirements?",
"top_k": 5,
"search_method": "hybrid_search"
}
}
]
🔧 Troubleshooting
Common Errors
| Error | Solution |
|---|---|
| Missing DIFY_API_KEY | Set DIFY_API_KEY in environment variables |
| Connection refused | Check DIFY_BASE_URL is correct and accessible |
| No datasets found | Verify dataset exists in your Dify workspace |
| API request failed | Check network connectivity and API key permissions |
Debug Mode
Run manually to see detailed errors:
DIFY_API_KEY=your-key python3 scripts/search.py <<< '{"query":"test"}'
📚 Integration Tips
RAG Pipeline Integration
# Example: Use search results in AI response
results = dify_search(query, top_k=5)
context = "\n".join([r["content"] for r in results["results"]])
final_prompt = f"Answer based on context:\n\n{context}\n\nQuestion: {query}"
Multiple Datasets
For searching across multiple datasets, loop through them:
{
"query": "Find information about authentication",
"dataset_id": "dataset-api-docs"
}
Then query another dataset separately.
🔒 Security
- Never commit API keys - Use environment variables or
.envfiles - Rotate keys regularly - Generate new keys in Dify Settings
- Restrict access - Limit API key permissions where possible
📖 Implementation Details
This skill uses the Dify Dataset API:
- List Datasets:
GET /v1/datasets - Search:
POST /v1/datasets/{id}/retrieve
📝 Changelog
v1.1.0 (2026-02-08):
- ✅ Added search method selection (hybrid/semantic/keyword)
- ✅ Added reranking support
- ✅ Auto-discovery of datasets
- ✅ Improved error handling
- ✅ Removed hardcoded URLs (fully configurable)
- ✅ Added detailed logging
- Initial release
- Basic list and search functionality
Installation
openclaw install dify-kb-search
💻Code Examples
}
**Environment Variables:**
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `DIFY_API_KEY` | ✅ Yes | - | Your Dify API Key (from Settings → API) |
| `DIFY_BASE_URL` | ❌ No | `http://localhost/v1` | Your Dify instance base URL |
### 2. Install Dependenciespip3 install requests
## 🛠️ Tools
### dify_list
Lists all available knowledge bases (datasets) in your Dify instance.
**Invocation:** `dify_list` tool
**Example Response:**{}
### dify_search
Searches a Dify Dataset for relevant context chunks.
**Invocation:** `dify_search` tool (mapped to `python3 scripts/search.py`)
**Parameters:**
| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `query` | string | ✅ Yes | - | Search query or question |
| `dataset_id` | string | ❌ No | Auto-discover | Specific dataset ID to search |
| `top_k` | integer | ❌ No | 3 | Number of results to return |
| `search_method` | string | ❌ No | `hybrid_search` | Search strategy |
| `reranking_enable` | boolean | ❌ No | `false` | Enable reranking for better results |
**Search Methods:**
- `hybrid_search` - Combine semantic + keyword search (recommended)
- `semantic_search` - Meaning-based similarity search
- `keyword_search` - Exact keyword matching
**Example Usage:**]
## 🔧 Troubleshooting
### Common Errors
| Error | Solution |
|-------|----------|
| `Missing DIFY_API_KEY` | Set `DIFY_API_KEY` in environment variables |
| `Connection refused` | Check `DIFY_BASE_URL` is correct and accessible |
| `No datasets found` | Verify dataset exists in your Dify workspace |
| `API request failed` | Check network connectivity and API key permissions |
### Debug Mode
Run manually to see detailed errors:DIFY_API_KEY=your-key python3 scripts/search.py <<< '{"query":"test"}'
## 📚 Integration Tips
### RAG Pipeline Integrationfinal_prompt = f"Answer based on context:\n\n{context}\n\nQuestion: {query}"
### Multiple Datasets
For searching across multiple datasets, loop through them:{
"env": {
"vars": {
"DIFY_API_KEY": "${DIFY_API_KEY}",
"DIFY_BASE_URL": "https://dify.example.com/v1"
}
}
}{
"status": "success",
"count": 2,
"datasets": [
{
"id": "dataset-abc123",
"name": "Product Documentation",
"doc_count": 42,
"description": "All product guides and tutorials"
},
{
"id": "dataset-xyz789",
"name": "API Reference",
"doc_count": 156,
"description": "REST API documentation"
}
]
}{
"query": "How do I configure OpenClaw?",
"top_k": 5
}{
"query": "API authentication methods",
"dataset_id": "dataset-xyz789",
"search_method": "semantic_search",
"reranking_enable": true
}Tags
Quick Info
Ready to Install?
Get started with this skill in seconds
Related Skills
4claw
4claw — a moderated imageboard for AI agents.
Aap Passport
Agent Attestation Protocol - The Reverse Turing Test.
Adaptive Suite
A continuously adaptive skill suite that empowers Clawdbot.
Adversarial Prompting
Adversarial analysis to critique, fix.