Arxiv Summarizer Orchestrator
End-to-end orchestration skill for periodic arXiv collection and reporting using three sub-skills.
- Rating: 4.5 (345 reviews)
- Downloads: 2,331
- Version: 1.0.0
Complete Documentation
name: arxiv-summarizer-orchestrator
description: "End-to-end orchestration skill for periodic arXiv collection and reporting using three sub-skills: arxiv-search-collector, arxiv-paper-processor, and arxiv-batch-reporter. Supports manual language control across all markdown outputs and Stage-B processing strategy (subagent_parallel default max 5, or serial)."
ArXiv Summarizer Orchestrator
Run the full pipeline by composing three sub-skills.
Sub-skill Order
arxiv-search-collector
arxiv-paper-processor
arxiv-batch-reporter
Workflow Parameters
language: manual language parameter used by all stages. Default is English when omitted.
paper_processing_mode: subagent_parallel or serial.
max_parallel_papers: default 5 when paper_processing_mode=subagent_parallel.
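The three parameters above can be sketched as a small validated structure. This is an illustration only: the `WorkflowParams` class and its `effective_parallelism` property are hypothetical names, not part of the skill, and the defaults simply mirror the ones stated in this document.

```python
from dataclasses import dataclass

VALID_MODES = ("subagent_parallel", "serial")


@dataclass(frozen=True)
class WorkflowParams:
    """Hypothetical container for the orchestrator's workflow parameters."""

    language: str = "English"                         # default when omitted
    paper_processing_mode: str = "subagent_parallel"  # or "serial"
    max_parallel_papers: int = 5                      # used in parallel mode only

    def __post_init__(self):
        if self.paper_processing_mode not in VALID_MODES:
            raise ValueError(
                f"paper_processing_mode must be one of {VALID_MODES}"
            )
        if self.max_parallel_papers < 1:
            raise ValueError("max_parallel_papers must be at least 1")

    @property
    def effective_parallelism(self) -> int:
        # Serial mode always means one paper at a time.
        if self.paper_processing_mode == "serial":
            return 1
        return self.max_parallel_papers
```

Validating once at the start of a run keeps a mistyped mode from surfacing only when Stage B begins.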
Workflow
Stage A: Collection Setup + Query Retrieval
- Initialize one run with arxiv-search-collector/scripts/init_collection_run.py.
- Model generates multiple focused queries from the original topic and writes a minimal query_plan.json (label + query only).
- Run arxiv-search-collector/scripts/fetch_queries_batch.py with the plan file (recommended).
- (Optional fallback) Call arxiv-search-collector/scripts/fetch_query_metadata.py manually for one-by-one fetch.
- Model reads each indexed query list and decides which indexes to keep.
- Merge selected items with arxiv-search-collector/scripts/merge_selected_papers.py.
- If relevance/coverage is still not good, iterate Stage A:
  - generate another query plan with new labels,
  - fetch again,
  - re-merge with --incremental and updated selection-json,
  - set weak labels to an empty keep list ([]) to explicitly drop them.
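As a rough sketch of the plan and selection steps above: the exact schemas of query_plan.json and the selection JSON are not shown in this document, so the shapes below (a list of label/query objects, and a mapping from label to keep indexes) are assumptions. The `arxiv_id` field and both function names are hypothetical.

```python
def build_query_plan(queries):
    """Build a minimal plan: one {"label", "query"} entry per focused query."""
    labels = [label for label, _ in queries]
    if len(set(labels)) != len(labels):
        raise ValueError("labels must be unique")
    return [{"label": label, "query": query} for label, query in queries]


def select_papers(results_by_label, selection):
    """Merge kept papers across labels, dropping duplicates.

    selection maps each label to the indexes to keep; an empty list
    drops that label entirely.
    """
    merged, seen = [], set()
    for label, keep_indexes in selection.items():
        for index in keep_indexes:
            paper = results_by_label[label][index]
            if paper["arxiv_id"] not in seen:  # assumed identifier field
                seen.add(paper["arxiv_id"])
                merged.append(paper)
    return merged
```

The real merge is done by merge_selected_papers.py; this only illustrates why an explicit empty keep list is a clean way to discard a weak query.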
Pass --language to collector scripts so all generated markdown files in Stage A follow the selected language.
Use serial query fetch in Stage A with conservative controls (for example --min-interval-sec 5, --retry-max 4).
Default collector settings already include retries/backoff and run-local throttle state (/.runtime/arxiv_api_state.json), so manual tuning is usually unnecessary.
Prefer cache reuse (no --force) unless query parameters changed or data refresh is required.
Output: one run directory with per-paper metadata subdirectories.
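To illustrate what serial fetching with a minimum interval and bounded retries amounts to, here is a minimal sketch. It is not the collector's implementation: `fetch_serially` and its `fetch` callback are hypothetical, the doubling backoff is an assumed policy, and the defaults only echo the example values `--min-interval-sec 5` and `--retry-max 4` from the text.

```python
import time


def fetch_serially(queries, fetch, min_interval_sec=5.0, retry_max=4,
                   sleep=time.sleep, clock=time.monotonic):
    """Fetch (label, query) pairs one at a time with throttling and retries."""
    results = {}
    last_start = None
    for label, query in queries:
        for attempt in range(retry_max + 1):
            # Keep at least min_interval_sec between request starts.
            if last_start is not None:
                wait = min_interval_sec - (clock() - last_start)
                if wait > 0:
                    sleep(wait)
            last_start = clock()
            try:
                results[label] = fetch(query)
                break
            except Exception:
                # A real tool would catch only transient network errors.
                if attempt == retry_max:
                    raise
                # Assumed backoff: double the pause after each failure.
                sleep(min_interval_sec * 2 ** attempt)
    return results
```

Injecting `sleep` and `clock` keeps the function testable without real delays.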
Stage B: Per-paper Artifact Download + Manual Summary
For each paper directory, invoke sub-skill arxiv-paper-processor once and let that skill produce /summary.md.
Recommended pre-step for many papers:
- Run one batch artifact download before per-paper reading:
```bash
python3 arxiv-paper-processor/scripts/download_papers_batch.py \
  --run-dir /path/to/run \
  --artifact source_then_pdf \
  --max-workers 3 \
  --min-interval-sec 5 \
  --language
```
Per-paper execution steps (inside arxiv-paper-processor):
- If summary.md already exists and is complete, skip this paper.
- If usable source (source/source_extract/*.tex) or PDF (source/paper.pdf) already exists, skip download.
- If artifacts are missing, download source with arxiv-paper-processor/scripts/download_arxiv_source.py.
- If source is unusable, download PDF with arxiv-paper-processor/scripts/download_arxiv_pdf.py.
- Model reads content and manually writes summary.md following the reference format, in the selected language.
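The skip checks in the steps above can be sketched as a single decision function. This is an assumption-laden illustration: `next_action` and its return labels are hypothetical, the artifact paths are taken relative to the paper directory, and "complete" is approximated here as "non-empty", which the real skill may define more strictly.

```python
from pathlib import Path


def next_action(paper_dir):
    """Decide what a per-paper worker should do next for one directory."""
    paper_dir = Path(paper_dir)

    # Step 1: an existing, non-empty summary means the paper is done.
    summary = paper_dir / "summary.md"
    if summary.is_file() and summary.stat().st_size > 0:
        return "skip_paper"

    # Step 2: reuse artifacts that were already downloaded.
    extract_dir = paper_dir / "source" / "source_extract"
    has_tex = extract_dir.is_dir() and any(extract_dir.glob("*.tex"))
    has_pdf = (paper_dir / "source" / "paper.pdf").is_file()
    if has_tex or has_pdf:
        return "read_and_summarize"

    # Step 3: nothing usable yet, so start with the source download.
    return "download_source"
```

The PDF fallback is not modeled here because it depends on judging whether downloaded source is usable, which happens after the download.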
Parallel strategy for many papers:
- Default: paper_processing_mode=subagent_parallel with max_parallel_papers=5.
- Optional: paper_processing_mode=serial to process one paper at a time.
- In parallel mode, run multiple arxiv-paper-processor instances in batches; concurrent papers must not exceed max_parallel_papers.
- Wait for one batch to finish before starting the next batch.
- In serial mode, run exactly one arxiv-paper-processor instance at a time.
- Subagent workers should only own one paper directory each to avoid file conflicts.
- Do not use scripts to auto-compose summary text; scripts are download-only tools.
Output: all paper directories contain summary.md.
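The batching rule for Stage B (at most max_parallel_papers at once, each batch finishing before the next starts, one worker per directory) might look like the sketch below. Threads stand in for subagents purely for illustration, and `process_in_batches` and `process_one` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def process_in_batches(paper_dirs, process_one,
                       mode="subagent_parallel", max_parallel_papers=5):
    """Run process_one over paper_dirs in bounded, sequential batches."""
    if len(set(paper_dirs)) != len(paper_dirs):
        # One worker per directory: duplicates would cause file conflicts.
        raise ValueError("each paper directory may appear only once")

    batch_size = 1 if mode == "serial" else max_parallel_papers
    results = {}
    for start in range(0, len(paper_dirs), batch_size):
        batch = paper_dirs[start:start + batch_size]
        # Leaving the with-block waits for the whole batch to finish.
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            for paper_dir, outcome in zip(batch, pool.map(process_one, batch)):
                results[paper_dir] = outcome
    return results
```

Fixed batches are simpler to reason about than a rolling pool, at the cost of idle workers while the slowest paper in a batch finishes.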
Stage C: Bundle + Final Hierarchical Report
- Run arxiv-batch-reporter/scripts/collect_summaries_bundle.py --language.
- Model reads summaries_bundle.md and writes collection_report_template.md in base dir.
- In template, each paper leaf entry must include one standalone placeholder line: {{ARXIV_BRIEF:.
- Run arxiv-batch-reporter/scripts/render_collection_report.py to generate final collection_report.md.
- Do not manually paraphrase per-paper conclusion lines in final report; they must come from per-paper summary.md section 10 via script injection.
If language is non-English (for example Chinese), all intermediate markdown files and final reports should follow that language.
Periodic Scheduling
This orchestrator is suitable for cron/scheduled execution in OpenClaw:
- Frequency examples: daily, weekly, monthly.
- For rolling windows, use lookback (1d, 7d, 30d) when initializing runs.
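For intuition, a lookback value such as 7d corresponds to a date window ending at run time. The sketch below assumes the value is always a whole number of days written as `<N>d`; `lookback_window` is a hypothetical helper, and the actual parsing lives in the collector's initialization script.

```python
import re
from datetime import datetime, timedelta, timezone


def lookback_window(lookback, now=None):
    """Translate a lookback like '7d' into a (start, end) UTC window."""
    match = re.fullmatch(r"(\d+)d", lookback)
    if match is None:
        raise ValueError(f"unsupported lookback: {lookback!r}")
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=int(match.group(1)))
    return start, end
```

Matching the lookback to the schedule (1d for daily, 7d for weekly, 30d for monthly) keeps consecutive runs from leaving gaps between windows.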
Output Layout
task_meta.json, task_meta.md
query_results/, query_selection/
+ downloaded source/pdf + summary.md
summaries_bundle.md
collection_report_template.md
- final rendered collection report (e.g. collection_report.md)
Use references/workflow-checklist.md as execution checklist.
Related Skills
This is the top-level orchestration skill.
Before using it, install and enable these three sub-skills:
arxiv-search-collector
arxiv-paper-processor
arxiv-batch-reporter
Execution order inside this orchestrator:
arxiv-search-collector (Stage A)
arxiv-paper-processor (Stage B)
arxiv-batch-reporter (Stage C)
Installation
openclaw install arxiv-summarizer-orchestrator
Tags
#search_and-research
Quick Info
Category Web Scrapers
Model Claude 3.5
Complexity Advanced
Author xukp20
Last Updated 3/10/2026