Desearch Crawl

Crawl/scrape and extract content from any webpage URL.

Rating: 4.4 (136 reviews)
Downloads: 1,435
Version: 1.0.0

Complete Documentation

Crawl Webpage By Desearch

Extract content from any webpage URL. Returns clean text or raw HTML.

Quick Start

1. Get an API key from https://console.desearch.ai
2. Set the environment variable: `export DESEARCH_API_KEY='your-key-here'`
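
Before invoking the script, it can help to confirm that the key is actually visible to the process. The following is a minimal sketch, not part of the skill itself: the helper name `require_api_key` is hypothetical, and only the variable name `DESEARCH_API_KEY` comes from the steps above.

```python
import os


def require_api_key(env=None):
    """Return the Desearch API key from the environment, or raise if unset.

    `require_api_key` is a hypothetical helper used for illustration only.
    """
    env = os.environ if env is None else env
    key = env.get("DESEARCH_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DESEARCH_API_KEY is not set; get a key from "
            "https://console.desearch.ai and export it first."
        )
    return key
```

Calling `require_api_key()` with no arguments reads the real process environment; passing a dict makes the check easy to exercise in isolation.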

Usage

```bash
# Crawl a webpage (returns clean text by default)
scripts/desearch.py crawl "https://en.wikipedia.org/wiki/Artificial_intelligence"

# Get raw HTML
scripts/desearch.py crawl "https://example.com" --crawl-format html
```

Options

| Option | Description |
|--------|-------------|
| `--crawl-format` | Output content format: `text` (default) or `html` |
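
If the crawl is driven from Python rather than a shell, the command line above could be assembled as in this sketch. The function names `build_crawl_command` and `crawl` are hypothetical, and the sketch assumes the script prints the page content to standard output, which the documentation above does not state explicitly.

```python
import subprocess

# The two values documented for --crawl-format.
ALLOWED_FORMATS = ("text", "html")


def build_crawl_command(url, crawl_format="text", script="scripts/desearch.py"):
    """Build the argv list for a crawl call (hypothetical helper)."""
    if crawl_format not in ALLOWED_FORMATS:
        raise ValueError(f"crawl_format must be one of {ALLOWED_FORMATS}")
    command = [script, "crawl", url]
    # text is the documented default, so the flag is only added for html.
    if crawl_format != "text":
        command += ["--crawl-format", crawl_format]
    return command


def crawl(url, crawl_format="text"):
    """Run the script and return whatever it wrote to stdout.

    Assumption: the content is written to stdout; adjust if it is not.
    """
    result = subprocess.run(
        build_crawl_command(url, crawl_format),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain `&` or `?`.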

Examples

Read a documentation page

```bash
scripts/desearch.py crawl "https://docs.python.org/3/tutorial/index.html"
```

Get raw HTML for analysis

```bash
scripts/desearch.py crawl "https://example.com/page" --crawl-format html
```

Response

Example (`format=text`, truncated, default)

```text
Artificial intelligence (AI) is the capability of computational systems to perform tasks that typically require human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making...
```

Example (`format=html`, truncated)

```html
<!DOCTYPE html>
<html>
  <head><title>Artificial intelligence - Wikipedia</title></head>
  <body>
    <p>Artificial intelligence (AI) is the capability of computational systems...</p>
  </body>
</html>
```

Notes

- A successful response is plain text or raw HTML, not JSON. Only error responses (see Errors) are JSON.
- The default format is `text`. Use `--crawl-format html` only when you need to inspect page structure.
- Prefer `text` format to avoid bloating the agent context with markup.
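
To get a rough sense of how much of an HTML response is markup rather than readable content, the share of non-text characters can be estimated with the standard library. This is an illustrative sketch only: `markup_overhead` is a hypothetical helper, and it is not how Desearch produces its `text` output.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the non-empty text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.parts.append(stripped)


def markup_overhead(html):
    """Return the fraction of characters in `html` that are not text content."""
    if not html:
        return 0.0
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    visible = " ".join(collector.parts)
    return 1 - len(visible) / len(html)


sample = (
    "<!DOCTYPE html><html><head><title>Artificial intelligence - Wikipedia"
    "</title></head><body><p>Artificial intelligence (AI) is the capability "
    "of computational systems...</p></body></html>"
)
print(f"markup share: {markup_overhead(sample):.0%}")
```

On real pages with navigation, scripts, and styling, the markup share is typically much higher than in this small sample, which is the reason for preferring `text`.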

Errors

Status 401, Unauthorized (e.g., missing/invalid API key)

```json
{
  "detail": "Invalid or missing API key"
}
```

Status 402, Payment Required (e.g., balance depleted)

```json
{
  "detail": "Insufficient balance, please add funds to your account to continue using the service."
}
```
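
Because successful responses are plain text or HTML while the errors above are JSON objects with a `detail` field, a caller can tell them apart by attempting to parse the body. The sketch below assumes the error body reaches the caller unchanged; how the script surfaces the HTTP status code is not documented here, and `extract_error_detail` is a hypothetical name.

```python
import json


def extract_error_detail(body):
    """Return the `detail` message if `body` looks like an error, else None."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # Not JSON, so treat it as ordinary crawled content.
        return None
    if isinstance(payload, dict) and isinstance(payload.get("detail"), str):
        return payload["detail"]
    return None


print(extract_error_detail('{"detail": "Invalid or missing API key"}'))
print(extract_error_detail("Artificial intelligence (AI) is the capability..."))
```

One caveat of this heuristic: a crawled page whose text is itself a JSON object with a `detail` key would be misread as an error, so checking the status code is preferable wherever it is available.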

Installation

```bash
openclaw install desearch-crawl
```
