
Bybit Order Book

Download, process, and backtest ByBit derivatives historical order book data.

Rating: 4.7 (92 reviews)
Downloads: 7,565
Version: 1.0.0


ByBit Order Book Backtester

End-to-end pipeline: download → process → backtest → report.

Dependencies

bash
pip install undetected-chromedriver selenium pandas numpy pyarrow --break-system-packages

Chrome/Chromium must be installed for Selenium.
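
Before running the pipeline, it can help to confirm the environment is ready. The sketch below is not part of the skill. It checks that the packages from the install command are importable and looks for a browser binary on PATH. The import names and the candidate executable names are assumptions and may differ on your system.

```python
import importlib.util
import shutil

# Import names assumed for the packages in the pip command above.
REQUIRED_MODULES = ["undetected_chromedriver", "selenium", "pandas", "numpy", "pyarrow"]
# Common executable names; your platform may use a different one.
CHROME_CANDIDATES = ["google-chrome", "google-chrome-stable", "chromium", "chromium-browser", "chrome"]


def missing_modules(names):
    """Return the module names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def find_chrome(candidates=CHROME_CANDIDATES):
    """Return the path of the first browser binary found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    print("Missing packages:", missing_modules(REQUIRED_MODULES) or "none")
    print("Chrome binary:", find_chrome() or "not found on PATH")
```

A browser installed outside PATH (common on macOS and Windows) will be reported as not found even though Selenium may still locate it.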

Workflow

The pipeline has 3 stages. Run them sequentially, or skip to later stages if data is already available.

Stage 1: Download Order Book Data

Prompt the user for:

  • Symbol (default: BTCUSDT)
  • Date range (default: last 30 days)

Run scripts/download_orderbook.py:

bash
python scripts/download_orderbook.py \
  --symbol BTCUSDT \
  --start 2024-06-01 --end 2024-06-30 \
  --output ./data/raw

Key details:

  • Downloads from https://www.bybit.com/derivatives/en/history-data
  • Automatically chunks into 7-day windows (ByBit's limit)
  • Uses undetected-chromedriver for Cloudflare bypass
  • Outputs: ZIP files in ./data/raw/ named {date}_{symbol}_ob500.data.zip
  • For data format details: see references/bybit_data_format.md
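
To illustrate the 7-day chunking mentioned in the list above, here is a minimal sketch of how a date range could be split into windows. The function name `chunk_date_range` is hypothetical, and treating both endpoints as inclusive is an assumption; the actual script may draw the boundaries differently.

```python
from datetime import date, timedelta


def chunk_date_range(start, end, max_days=7):
    """Split [start, end] (inclusive) into consecutive windows of at most max_days days."""
    if end < start:
        raise ValueError("end must not be before start")
    windows = []
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=max_days - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


for first, last in chunk_date_range(date(2024, 6, 1), date(2024, 6, 30)):
    print(first.isoformat(), "to", last.isoformat())
```

For June 2024 this yields five windows, the last covering only June 29 to June 30.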
If Selenium fails (Cloudflare blocks, UI changes): Instruct the user to manually download from the ByBit page and place ZIPs in ./data/raw/.
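
When falling back to manual download, a quick check of which days are still missing can save a failed processing run. The sketch below follows the `{date}_{symbol}_ob500.data.zip` pattern, but it assumes one ZIP per day and an ISO `YYYY-MM-DD` date; neither is confirmed here, so compare against a real file name first. Both function names are hypothetical.

```python
from datetime import date, timedelta
from pathlib import Path


def expected_zip_names(symbol, start, end):
    """Build one expected file name per day (assumed ISO date, one file per day)."""
    days = (end - start).days + 1
    return [
        f"{(start + timedelta(days=i)).isoformat()}_{symbol}_ob500.data.zip"
        for i in range(days)
    ]


def missing_zips(raw_dir, symbol, start, end):
    """Return the expected file names that are not present in raw_dir."""
    root = Path(raw_dir)
    present = {p.name for p in root.glob("*.zip")} if root.is_dir() else set()
    return [name for name in expected_zip_names(symbol, start, end) if name not in present]


if __name__ == "__main__":
    gaps = missing_zips("./data/raw", "BTCUSDT", date(2024, 6, 1), date(2024, 6, 30))
    print(f"{len(gaps)} day(s) missing")
```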

Stage 2: Process & Filter to Depth 50

Run scripts/process_orderbook.py:

bash
python scripts/process_orderbook.py \
  --input ./data/raw \
  --output ./data/processed \
  --depth 50 \
  --sample-interval 1s

What it does:

  • Reads JSONL from ZIPs (each line = full 500-level L2 snapshot)
  • Filters to top 50 bid/ask levels
  • Computes derived features: mid_price, spread, volume_imbalance, microprice
  • Optionally downsamples (e.g., 1s, 5s, 1min) — recommended for faster backtests
  • Outputs: Parquet files in ./data/processed/
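
The derived features in the list above can be sketched as follows. These are the common textbook definitions, not necessarily the exact formulas in scripts/process_orderbook.py. The input layout (lists of price and size pairs, best level first) and the function name `derive_features` are assumptions for illustration.

```python
def derive_features(bids, asks, depth=50):
    """Compute top-of-book and depth features from one L2 snapshot.

    bids and asks are lists of (price, size) pairs, best level first (assumed layout).
    """
    bids, asks = bids[:depth], asks[:depth]
    best_bid, bid_size = bids[0]
    best_ask, ask_size = asks[0]
    bid_volume = sum(size for _, size in bids)
    ask_volume = sum(size for _, size in asks)
    return {
        "mid_price": (best_bid + best_ask) / 2,
        "spread": best_ask - best_bid,
        # +1 means all resting volume is on the bid side, -1 all on the ask side.
        "volume_imbalance": (bid_volume - ask_volume) / (bid_volume + ask_volume),
        # Size-weighted mid: leans toward the side with less resting size.
        "microprice": (best_bid * ask_size + best_ask * bid_size) / (bid_size + ask_size),
    }


snapshot_bids = [(100.0, 3.0), (99.5, 1.0)]
snapshot_asks = [(101.0, 1.0), (101.5, 1.0)]
print(derive_features(snapshot_bids, snapshot_asks))
```

In this toy book the best bid holds three times the size of the best ask, so the microprice (100.75) sits above the mid price (100.5).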
Without downsampling: ~860K snapshots/day, ~300 MB Parquet per day per symbol.
With 1s downsampling: ~86K snapshots/day, ~5 MB per day, which is much more practical.
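
The snapshot counts follow from simple arithmetic: a day has 86,400 seconds, so 1s sampling keeps at most about 86K rows, and roughly ten updates per second gives the ~860K raw figure. As a rough sketch of the idea, the function below keeps the last snapshot in each time bucket. Whether the script keeps the first, the last, or an aggregate per bucket is not stated, so "last" is an assumption, and `downsample_last` is a hypothetical name.

```python
def downsample_last(snapshots, interval_ms=1000):
    """Keep the last snapshot in each interval_ms bucket.

    snapshots is an iterable of (timestamp_ms, payload) pairs in time order.
    """
    kept = {}
    for ts_ms, payload in snapshots:
        kept[ts_ms // interval_ms] = (ts_ms, payload)  # later entries overwrite earlier ones
    return [kept[bucket] for bucket in sorted(kept)]


# Ten updates per second for three seconds, roughly the raw rate implied above.
raw = [(t, f"snap@{t}") for t in range(0, 3000, 100)]
sampled = downsample_last(raw)
print(len(raw), "->", len(sampled))
```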

Stage 3: Backtest Strategies

Run scripts/backtest.py:

bash
# Run all 10 strategies
python scripts/backtest.py \
  --input ./data/processed/BTCUSDT_ob50.parquet \
  --output ./reports

# Run specific strategies
python scripts/backtest.py \
  --input ./data/processed/BTCUSDT_ob50.parquet \
  --strategies imbalance,breakout,market_making \
  --output ./reports

# Quick test with limited rows
python scripts/backtest.py \
  --input ./data/processed/BTCUSDT_ob50.parquet \
  --max-rows 100000 \
  --output ./reports

Strategy keys: imbalance, breakout, false_breakout, scalping, momentum, reversal, spoofing, optimal_execution, market_making, latency_arb
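
A typo in `--strategies` is easy to make before a long run. The sketch below validates a comma-separated selection against the keys listed above. It is a standalone pre-check under the assumption that omitting the flag means "run everything"; it does not reflect how scripts/backtest.py parses its arguments, and `parse_strategies` is a hypothetical helper.

```python
STRATEGY_KEYS = [
    "imbalance", "breakout", "false_breakout", "scalping", "momentum",
    "reversal", "spoofing", "optimal_execution", "market_making", "latency_arb",
]


def parse_strategies(arg=None):
    """Turn a comma-separated --strategies value into a validated list of keys."""
    if not arg:
        return list(STRATEGY_KEYS)  # no selection means run everything
    requested = [key.strip() for key in arg.split(",") if key.strip()]
    unknown = [key for key in requested if key not in STRATEGY_KEYS]
    if unknown:
        raise ValueError(f"unknown strategy key(s): {', '.join(unknown)}")
    return requested


print(parse_strategies("imbalance,breakout,market_making"))
```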

Outputs in ./reports/:

  • {SYMBOL}_backtest_report.json — Full results with equity curves
  • {SYMBOL}_backtest_report.md — Comparison table and detailed metrics
Report metrics per strategy: total trades, winners/losers, win rate, cumulative PnL, Sharpe ratio, max drawdown (absolute and %), avg PnL per trade, avg hold time, profit factor, best/worst trade, equity curve.
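
To make a few of these metrics concrete, here is a minimal sketch that derives them from a list of per-trade PnL values. The definitions are the conventional ones and may not match the report exactly; in particular, the Sharpe ratio here is per trade and not annualized, and `summarize_trades` is a hypothetical name.

```python
import math


def summarize_trades(pnls):
    """Compute basic performance metrics from a list of per-trade PnL values."""
    if not pnls:
        raise ValueError("no trades to summarize")
    wins = [p for p in pnls if p > 0]
    losses = [p for p in pnls if p < 0]
    # Walk the equity curve to find the largest peak-to-trough decline.
    equity, peak, max_drawdown = 0.0, 0.0, 0.0
    for p in pnls:
        equity += p
        peak = max(peak, equity)
        max_drawdown = max(max_drawdown, peak - equity)
    mean = sum(pnls) / len(pnls)
    variance = sum((p - mean) ** 2 for p in pnls) / (len(pnls) - 1) if len(pnls) > 1 else 0.0
    std = math.sqrt(variance)
    gross_loss = -sum(losses)
    return {
        "total_trades": len(pnls),
        "win_rate": len(wins) / len(pnls),
        "cumulative_pnl": sum(pnls),
        "avg_pnl": mean,
        "profit_factor": sum(wins) / gross_loss if gross_loss else math.inf,
        "max_drawdown": max_drawdown,
        # Per-trade Sharpe, not annualized; the report may scale it differently.
        "sharpe": mean / std if std else 0.0,
    }


print(summarize_trades([10.0, -5.0, 20.0, -10.0, 5.0]))
```

For the sample trades, three of five are winners, gross profit is 35 against gross loss of 15, and the deepest drawdown is the drop from 25 to 15.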

For strategy logic and tunable parameters: see references/strategies.md

Customization

To modify strategy parameters, edit the __init__ method of any strategy class in scripts/backtest.py. Each strategy's self.params dict contains all tunables.

To add a new strategy:

  • Subclass Strategy in scripts/backtest.py
  • Implement on_snapshot(self, row, idx, df) with entry/exit logic
  • Register in STRATEGY_MAP
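
The steps above can be sketched as follows. The `Strategy` base class here is a stand-in so the example runs on its own; the real class in scripts/backtest.py may have a different constructor and return convention. The `ImbalanceEntry` class, its parameter names, the `"buy"`/`"sell"`/`None` return values, and the use of plain dicts for rows are all hypothetical.

```python
class Strategy:
    """Stand-in for the base class in scripts/backtest.py (real interface may differ)."""

    def __init__(self):
        self.params = {}

    def on_snapshot(self, row, idx, df):
        raise NotImplementedError


class ImbalanceEntry(Strategy):
    """Hypothetical example: go long on strong bid-side imbalance when the spread is tight."""

    def __init__(self):
        super().__init__()
        self.params = {"imbalance_entry": 0.6, "imbalance_exit": 0.0, "max_spread": 1.0}
        self.in_position = False

    def on_snapshot(self, row, idx, df):
        # Returning "buy", "sell", or None is an assumed convention for this sketch.
        if not self.in_position:
            strong = row["volume_imbalance"] >= self.params["imbalance_entry"]
            tight = row["spread"] <= self.params["max_spread"]
            if strong and tight:
                self.in_position = True
                return "buy"
        elif row["volume_imbalance"] <= self.params["imbalance_exit"]:
            self.in_position = False
            return "sell"
        return None


# In the real file, add the new entry to the existing STRATEGY_MAP instead of redefining it.
STRATEGY_MAP = {"imbalance_entry": ImbalanceEntry}

rows = [
    {"volume_imbalance": 0.7, "spread": 0.5},
    {"volume_imbalance": 0.3, "spread": 0.5},
    {"volume_imbalance": -0.1, "spread": 0.5},
]
strategy = STRATEGY_MAP["imbalance_entry"]()
signals = [strategy.on_snapshot(row, idx, rows) for idx, row in enumerate(rows)]
print(signals)
```

Keeping every threshold in `self.params` matches the convention described above, so the new strategy can be tuned the same way as the built-in ones.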

Troubleshooting

Selenium can't load ByBit page: ByBit uses Cloudflare. Ensure undetected-chromedriver is up to date. Try --no-headless to debug visually. Fall back to manual download.

Out of memory on processing: Use --sample-interval 1s or larger. Process one day at a time.

No trades generated: Strategy thresholds may be too tight for the data period. Relax parameters (lower thresholds, shorter lookbacks) in the strategy's self.params dict in scripts/backtest.py; references/strategies.md describes each tunable.

Installation

bash
openclaw install bybit-order-book


Tags

#browser_and-automation #data

Quick Info

Category: Web Scrapers
Model: Claude 3.5
Complexity: One-Click
Author: davidm413
Last Updated: 3/10/2026