Curated Search

Domain-restricted full-text search over curated technical documentation.

Rating: 4.7 (404 reviews)
Downloads: 12,562
Version: 1.0.0

Curated Search Skill

Summary

Domain-restricted full-text search over a curated whitelist of technical documentation (MDN, Python docs, etc.). Provides clean, authoritative results without web spam.

External Endpoints

This skill makes no external network calls during search operations. The crawler makes outbound HTTP requests only while building the index, which is typically a one‑time setup step; those requests are user‑initiated (npm run crawl) and restricted to the configured domain whitelist.

Security & Privacy

  • Search is fully local – After the index is built, all queries run offline; no data leaves your machine.
  • Crawling is optional and whitelist‑scoped – The crawler only accesses domains you explicitly list in config.yaml. It respects robots.txt and configurable delays.
  • No telemetry – No usage data is transmitted externally.
  • Configuration is read from local config.yaml and the index file in data/.
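
To make "whitelist-scoped" concrete, here is a minimal Python sketch of a hostname check a crawler could apply before fetching a URL. It is not the skill's own crawler code. The name `is_allowed`, the sample domain set, and the exact-hostname matching rule are all assumptions for illustration, and the real crawler may treat subdomains differently.

```python
from urllib.parse import urlparse

# Hypothetical whitelist, mirroring the `domains` section of config.yaml.
ALLOWED_DOMAINS = {"docs.python.org", "developer.mozilla.org"}


def is_allowed(url: str, allowed: set[str] = ALLOWED_DOMAINS) -> bool:
    """Return True if the URL's hostname is on the whitelist (exact match, assumed)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    # urlparse lowercases the hostname and strips any port.
    return parsed.hostname in allowed


print(is_allowed("https://docs.python.org/3/tutorial/"))  # True
print(is_allowed("https://example.com/python-tutorial"))  # False
```

An exact-match rule is the most conservative reading of "only accesses domains you explicitly list"; a suffix match would also admit subdomains that were never listed.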

Model Invocation Note

The curated-search.search tool is invoked only when the user explicitly calls it. It does not run autonomously. OpenClaw calls the tool handler (scripts/search.js) when the user asks to search the curated index.

Trust Statement

By using this skill, you trust that the code operates locally and only crawls domains you approve. The skill does not send your queries or workspace data to any third party. Review the open‑source implementation before installing.


Tool: curated-search.search

Search the curated index.

Parameters

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| query | string | yes | | Search query terms |
| limit | number | no | 5 | Maximum results (capped by config.max_limit, typically 100) |
| domain | string | no | null | Filter to specific domain (e.g., docs.python.org) |
| min_score | number | no | 0.0 | Minimum relevance score (0.0–1.0); filters out low-quality matches |
| offset | number | no | 0 | Pagination offset (skip first N results) |
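
A caller may want to check parameters before invoking the tool. The Python sketch below mirrors the constraints stated in this document and returns the matching documented error code. It is an illustration only, not the tool's validation logic: `validate_params` is a hypothetical helper, the cap of 100 is the "typical" value rather than a guaranteed one, and the hostname pattern is an assumption about what counts as a well-formed domain.

```python
import re

MAX_QUERY_LENGTH = 1000  # from the query_too_long error description
MAX_LIMIT = 100          # "typically 100"; the real cap comes from the config

# Assumed shape of a valid domain filter, e.g. docs.python.org
DOMAIN_RE = re.compile(r"^[a-z0-9-]+(\.[a-z0-9-]+)+$")


def validate_params(query, limit=5, domain=None):
    """Return a documented error code for the first problem found, or None."""
    if not query or not query.strip():
        return "missing_query"
    if len(query) > MAX_QUERY_LENGTH:
        return "query_too_long"
    if limit > MAX_LIMIT:
        return "limit_exceeded"
    if domain is not None and not DOMAIN_RE.match(domain.lower()):
        return "invalid_domain"
    return None


print(validate_params("async await", limit=3, domain="developer.mozilla.org"))
print(validate_params("async await", limit=500))
```

Only checks that map to a documented error code are included; min_score and offset are passed through unchecked here.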

Response

JSON array of result objects:

json
[
  {
    "title": "Python Tutorial",
    "url": "https://docs.python.org/3/tutorial/",
    "snippet": "Python is an easy to learn, powerful programming language...",
    "domain": "docs.python.org",
    "score": 0.87,
    "crawled_at": 1707712345678
  }
]

Fields:

  • title: Document title (cleaned)
  • url: Source URL (canonical)
  • snippet: Excerpt (~200 chars) from content
  • domain: Hostname of source
  • score: BM25 relevance score (higher is better; not strictly normalized, though values typically fall between 0 and 1)
  • crawled_at: Unix timestamp of when the page was crawled (the sample value above appears to be in milliseconds)
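
As an illustration of consuming this response, the Python sketch below parses the sample array from the response example, drops results under a score threshold, and converts crawled_at to a readable date. The function name `summarize` is hypothetical, and the code assumes crawled_at is in milliseconds, which the magnitude of the sample value suggests but the document does not state.

```python
import json
from datetime import datetime, timezone

# Sample tool output, copied from the response example above.
raw = """
[
  {
    "title": "Python Tutorial",
    "url": "https://docs.python.org/3/tutorial/",
    "snippet": "Python is an easy to learn, powerful programming language...",
    "domain": "docs.python.org",
    "score": 0.87,
    "crawled_at": 1707712345678
  }
]
"""


def summarize(results_json: str, min_score: float = 0.0) -> list[dict]:
    """Parse results, keep those at or above min_score, add a readable crawl date."""
    summaries = []
    for item in json.loads(results_json):
        if item["score"] < min_score:
            continue
        # Assumption: crawled_at is milliseconds since the Unix epoch.
        crawled = datetime.fromtimestamp(item["crawled_at"] / 1000, tz=timezone.utc)
        summaries.append({
            "title": item["title"],
            "url": item["url"],
            "score": item["score"],
            "crawled": crawled.date().isoformat(),
        })
    return summaries


for row in summarize(raw, min_score=0.3):
    print(f'{row["score"]:.2f}  {row["title"]}  ({row["crawled"]})  {row["url"]}')
```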

Example Agent Calls

text
search CuratedSearch for "python tutorial"
search CuratedSearch for "async await" limit=3 domain=developer.mozilla.org
search CuratedSearch for "linux man page" min_score=0.3

Errors

If an error occurs, the tool exits non-zero and prints a JSON error object to stderr, e.g.:

json
{
  "error": "index_not_found",
  "message": "Search index not found. The index has not been built yet.",
  "suggestion": "Run the crawler first: npm run crawl",
  "details": { "path": "data/index.json" }
}

Common error codes:

| Code | Meaning | Suggested Fix |
| --- | --- | --- |
| config_missing | Configuration file not found | Specify --config path or ensure config.yaml exists |
| config_invalid | YAML parsing failed | Check syntax in config.yaml |
| config_missing_index_path | index.path not set | Add index.path to config |
| index_not_found | Index file missing | Run npm run crawl to build index |
| index_corrupted | Index file incompatible or corrupted | Rebuild index with npm run crawl |
| index_init_failed | Unexpected index initialization error | Check permissions, reinstall dependencies |
| missing_query | No query provided | Provide --query argument |
| query_too_long | Query exceeds 1000 characters | Shorten the query |
| limit_exceeded | Limit > config.max_limit | Use a smaller limit |
| invalid_domain | Domain filter malformed | Use format like docs.python.org |
| conflicting_flags | Mutually exclusive flags used (e.g., --stats with --query) | Use flags correctly |
| stats_failed | Could not retrieve index stats | Ensure index is accessible |
| search_failed | Search execution threw an error | Check query and index integrity |
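
A caller that captures the tool's stderr could turn the error object into a short hint along these lines. This Python sketch is hedged: `explain_error` and the `SUGGESTED_FIXES` mapping are hypothetical, the mapping covers only a few rows of the table, and how stderr is captured depends on the host and is not shown.

```python
import json

# Suggested fixes for a few codes, taken from the error table above.
SUGGESTED_FIXES = {
    "index_not_found": "Run npm run crawl to build index",
    "index_corrupted": "Rebuild index with npm run crawl",
    "config_invalid": "Check syntax in config.yaml",
    "query_too_long": "Shorten the query",
}


def explain_error(stderr_text: str) -> str:
    """Turn the tool's JSON error object into a one-line, human-readable hint."""
    try:
        err = json.loads(stderr_text)
    except json.JSONDecodeError:
        err = None
    if not isinstance(err, dict):
        return f"Unrecognized error output: {stderr_text.strip()}"
    code = err.get("error", "unknown")
    # Prefer the suggestion the tool sent; fall back to the table.
    hint = err.get("suggestion") or SUGGESTED_FIXES.get(code, "See README.md")
    return f"[{code}] {err.get('message', '')} -> {hint}"


sample = """
{
  "error": "index_not_found",
  "message": "Search index not found. The index has not been built yet.",
  "suggestion": "Run the crawler first: npm run crawl",
  "details": { "path": "data/index.json" }
}
"""
print(explain_error(sample))
```

Preferring the suggestion field over the local table keeps the hint accurate if the tool's wording changes between versions.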

Configuration

Edit config.yaml in the skill directory. Key sections:

  • domains — whitelist of allowed domains (required)
  • seeds — starting URLs for crawling
  • crawl — depth, delay, timeout, max_documents
  • content — min_content_length, max_content_length
  • index — path to index files
  • search — default_limit, max_limit, min_score

See README.md for full configuration docs.
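
To show how the sections fit together, the Python sketch below writes a configuration as the dictionary a YAML parser would produce and runs a few sanity checks on it. Section and key names follow the list above. Every value is illustrative, the units for delay and timeout are not specified in this document, and `check_config` is a hypothetical helper, not part of the skill.

```python
from urllib.parse import urlparse

# Hypothetical configuration; all values are placeholders, not defaults.
config = {
    "domains": ["docs.python.org", "developer.mozilla.org"],
    "seeds": ["https://docs.python.org/3/"],
    "crawl": {"depth": 2, "delay": 1000, "timeout": 30000, "max_documents": 5000},
    "content": {"min_content_length": 200, "max_content_length": 100000},
    "index": {"path": "data/index.json"},
    "search": {"default_limit": 5, "max_limit": 100, "min_score": 0.0},
}


def check_config(cfg: dict) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    domains = cfg.get("domains") or []
    if not domains:
        problems.append("domains: at least one whitelisted domain is required")
    if not (cfg.get("index") or {}).get("path"):
        problems.append("index.path is not set (compare config_missing_index_path)")
    # Every seed should sit on a whitelisted domain (exact hostname match, assumed).
    for seed in cfg.get("seeds") or []:
        if urlparse(seed).hostname not in domains:
            problems.append(f"seed outside whitelist: {seed}")
    search = cfg.get("search") or {}
    if search.get("default_limit", 5) > search.get("max_limit", 100):
        problems.append("search.default_limit exceeds search.max_limit")
    return problems


print(check_config(config))  # []
```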

Support

  • Full documentation: README.md
  • Technical specs: specs/
  • Build plan: PLAN.md
  • Contributor guide: CONTRIBUTING.md
  • Issues: Report on GitHub (or via OpenClaw maintainers)

Installation

bash
openclaw install curated-search

Tags

#web_and-frontend-development

Quick Info

Category Development
Model Claude 3.5
Complexity One-Click
Author qsmtco
Last Updated 3/10/2026