PMC Harvest
Fetch articles from PubMed Central using NCBI APIs.
- Rating: 4.7 (399 reviews)
- Downloads: 2,418
- Version: 1.0.0
PMC Harvest
Fetch full-text articles from PubMed Central using official NCBI APIs.
Features
- E-utilities search — Find articles by journal, year, query
- OAI-PMH full text — Retrieve complete article XML (open access only)
- Batch harvesting — Process multiple journals at once
- Abstract fetch — Lightweight retrieval for review queues
- No API key required — Uses public NCBI APIs (rate-limited)
Usage
# Search a journal
node {baseDir}/scripts/pmc-harvest.js --search "J Stroke[journal]" --year 2025
# Fetch full text for a specific article
node {baseDir}/scripts/pmc-harvest.js --fetch PMC12345678
# Batch harvest from multiple journals
node {baseDir}/scripts/pmc-harvest.js --harvest journals.json --year 2025
# Test with known journals
node {baseDir}/scripts/pmc-harvest.js --test
Options
| Flag | Description |
|---|---|
| `--search <query>` | PMC search query (use the `"Name"[journal]` format, e.g. `"J Stroke"[journal]`) |
| `--year <year>` | Filter by publication year |
| `--max <n>` | Max results (default: 100) |
| `--fetch <pmcid>` | Fetch full text for a specific PMCID |
| `--harvest <file>` | Batch harvest from a JSON journal list |
| `--test` | Run test with sample journals |
Programmatic API
const pmc = require('{baseDir}/lib/api.js');

// CommonJS has no top-level await, so the calls run inside an async function.
async function main() {
  // Search
  const { count, pmcids } = await pmc.searchJournal('"J Stroke"[journal]', { year: 2025 });
  // Get summaries
  const summaries = await pmc.getSummaries(pmcids);
  // Fetch full text
  const { available, xml, reason } = await pmc.fetchFullText('PMC12345678');
  // Parse JATS XML
  const { title, abstract, body } = pmc.parseJATS(xml);
  // Fetch abstract only (lightweight); renamed so title/abstract are not redeclared
  const { title: abstractTitle, abstract: abstractText } = await pmc.fetchAbstract('PMC12345678');
}

main().catch(console.error);
Journal Query Examples
const queries = {
'Stroke': '"Stroke"[journal]',
'Journal of Stroke': '"J Stroke"[journal]',
'Stroke & Vascular Neurology': '"Stroke Vasc Neurol"[journal]',
'European Stroke Journal': '"Eur Stroke J"[journal]',
'BMC Neurology': '"BMC Neurol"[journal]'
};
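Every entry in the map above follows one pattern: the abbreviated journal title in double quotes, followed by the `[journal]` field tag. A minimal sketch of a helper that builds such queries is shown here; `journalQuery` is a hypothetical name used for illustration and is not part of this skill's API.

```javascript
// Hypothetical helper (not part of pmc-harvest): builds a "Name"[journal] query.
function journalQuery(abbreviation) {
  // Strip embedded double quotes so the name stays a single quoted phrase.
  const name = String(abbreviation).replace(/"/g, '').trim();
  if (!name) {
    throw new Error('journal abbreviation is required');
  }
  return `"${name}"[journal]`;
}

// Build the same kind of map as above from a list of abbreviations.
const abbreviations = ['Stroke', 'J Stroke', 'BMC Neurol'];
const built = Object.fromEntries(
  abbreviations.map((abbr) => [abbr, journalQuery(abbr)])
);

console.log(built['J Stroke']); // "J Stroke"[journal]
```

The resulting strings can be passed to `pmc.searchJournal` in the same way as the hand-written queries.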
Limitations
- OAI-PMH returns open-access articles only; restricted content is unavailable
- Rate limits: about 3 requests/second without an API key
- Peak hours: NCBI recommends running large batches on weekends or outside 5 AM to 9 PM ET on weekdays
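One way to stay under the rate limit when calling the API in a loop is to space requests out on the client side. The sketch below is an illustration only, not how the skill itself throttles: `RequestPacer` and `runPaced` are hypothetical names, and the 350 ms spacing is an assumption chosen to keep the rate just below 3 requests/second.

```javascript
// Hypothetical pacing helper (not part of pmc-harvest).
// Assumption: 350 ms between request starts keeps us under ~3 requests/second.
class RequestPacer {
  constructor(minIntervalMs = 350) {
    this.minIntervalMs = minIntervalMs;
    this.nextAllowed = 0; // earliest timestamp (ms) at which the next request may start
  }

  // Reserves the next free slot and returns how long the caller should wait (ms).
  reserve(now = Date.now()) {
    const start = Math.max(now, this.nextAllowed);
    this.nextAllowed = start + this.minIntervalMs;
    return start - now;
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs tasks sequentially, one per slot. Each task is a function returning a promise.
async function runPaced(tasks, pacer = new RequestPacer()) {
  const results = [];
  for (const task of tasks) {
    await sleep(pacer.reserve());
    results.push(await task());
  }
  return results;
}
```

With the programmatic API, this could be used as `await runPaced(pmcids.map((id) => () => pmc.fetchAbstract(id)))`, assuming each call issues a single request.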
API Reference
This skill wraps NCBI's official APIs:
- E-utilities: https://eutils.ncbi.nlm.nih.gov/entrez/eutils
  - esearch.fcgi: Search PMC
  - esummary.fcgi: Get article metadata
- OAI-PMH: https://pmc.ncbi.nlm.nih.gov/api/oai/v1/mh
  - GetRecord: Fetch full text XML
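For orientation, the requests behind these endpoints can be sketched as plain URLs. This is a rough illustration, not the skill's actual implementation: `esearchUrl` and `getRecordUrl` are hypothetical names, the base URLs are the ones listed above, and the OAI-PMH identifier format and `metadataPrefix` value are assumptions to check against the current NCBI documentation.

```javascript
// Hypothetical URL builders (not part of pmc-harvest); base URLs as listed above.
const EUTILS_BASE = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils';
const OAI_BASE = 'https://pmc.ncbi.nlm.nih.gov/api/oai/v1/mh';

// E-utilities search against the PMC database, returning JSON.
function esearchUrl(term, { retmax = 100 } = {}) {
  const params = new URLSearchParams({
    db: 'pmc',
    term,
    retmax: String(retmax),
    retmode: 'json',
  });
  return `${EUTILS_BASE}/esearch.fcgi?${params}`;
}

// OAI-PMH GetRecord request for one article.
function getRecordUrl(pmcid) {
  const numericId = String(pmcid).replace(/^PMC/i, '');
  const params = new URLSearchParams({
    verb: 'GetRecord',
    // Assumption: identifier format and metadataPrefix follow PMC's OAI-PMH conventions.
    identifier: `oai:pubmedcentral.nih.gov:${numericId}`,
    metadataPrefix: 'pmc',
  });
  return `${OAI_BASE}?${params}`;
}

console.log(esearchUrl('"J Stroke"[journal]', { retmax: 5 }));
console.log(getRecordUrl('PMC12345678'));
```

`URLSearchParams` handles the percent-encoding of quotes and brackets in the search term, which is easy to get wrong when concatenating strings by hand.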
Installation
openclaw install pmc-harvest