Tesseract OCR
Extract text from images using the Tesseract OCR engine directly via command line.
- Rating: 4.2 (43 reviews)
- Downloads: 19,927
- Version: 1.0.0
Tesseract OCR Skill
Extract text content from images using the Tesseract engine directly via command line.
Features
- Extract text from image files using native tesseract CLI
- Support multi-language recognition (Chinese, English, etc.)
- No Python dependencies required
- Simple and fast
Dependencies
Install the Tesseract OCR system package together with the language data you need:
# Ubuntu/Debian:
sudo apt-get install tesseract-ocr tesseract-ocr-chi-sim
# macOS:
brew install tesseract tesseract-lang
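Before running OCR, it can help to confirm that the language data you plan to use is actually installed. The sketch below relies on `tesseract --list-langs`, which prints a header line followed by one installed language code per line. `has_lang` is a hypothetical helper name used only for illustration; it is not part of Tesseract.

```shell
# has_lang CODE: succeed if CODE appears in tesseract's installed-language list.
# Hypothetical helper; only `tesseract --list-langs` belongs to Tesseract itself.
has_lang() {
  # 2>&1 covers builds that print the list on stderr instead of stdout.
  # grep -x matches whole lines, so "chi_sim" does not match "chi_sim_vert".
  tesseract --list-langs 2>&1 | grep -qxF -- "$1"
}
```

For example, `has_lang chi_sim || echo "chi_sim is not installed" >&2` reports a missing language pack before any image is processed.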
Usage
Basic Usage
# Use default language (English)
tesseract /path/to/image.png stdout
# Specify language (Chinese + English)
tesseract /path/to/image.png stdout -l chi_sim+eng
# Save to file (tesseract appends .txt to the output base, so this writes output.txt)
tesseract /path/to/image.png output -l chi_sim+eng
# Multiple languages
tesseract /path/to/image.png stdout -l chi_sim+eng+jpn
Common Language Codes
| Language | Code |
|---|---|
| Simplified Chinese | chi_sim |
| Traditional Chinese | chi_tra |
| English | eng |
| Japanese | jpn |
| Korean | kor |
| Chinese + English | chi_sim+eng |
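As the last row shows, several codes are combined by joining them with `+`. When the list of languages comes from a script rather than being typed by hand, a small helper can build that string. This is a minimal sketch; `join_langs` is a hypothetical name, not a Tesseract feature.

```shell
# join_langs CODE...: print the codes joined with "+", ready to pass to -l.
# Hypothetical helper; the subshell body keeps the IFS change local.
join_langs() (
  IFS=+
  # "$*" joins all arguments using the first character of IFS.
  printf '%s\n' "$*"
)
```

It could be used as `tesseract image.png stdout -l "$(join_langs chi_sim eng)"`.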
Quick Examples
# OCR with Chinese support
tesseract image.jpg stdout -l chi_sim
# OCR with mixed Chinese and English
tesseract image.png stdout -l chi_sim+eng
# Save to file instead of stdout
tesseract document.png result -l chi_sim+eng
# Creates result.txt
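The single-file commands above extend naturally to a folder of images. The following is a sketch only, assuming all inputs are `.png` files in one directory; `ocr_dir` and its argument order are hypothetical and chosen for illustration.

```shell
# ocr_dir DIR [LANGS]: OCR every .png in DIR, writing NAME.txt beside NAME.png.
# Hypothetical helper; LANGS defaults to eng. The subshell body keeps variables local.
ocr_dir() (
  dir=$1
  langs=${2:-eng}
  for img in "$dir"/*.png; do
    [ -e "$img" ] || continue   # the glob matched nothing
    # The output base has no extension because tesseract appends .txt itself.
    tesseract "$img" "${img%.png}" -l "$langs" || echo "failed: $img" >&2
  done
)
```

Calling `ocr_dir ./scans chi_sim+eng` would then leave one text file next to each image in `./scans`.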
Notes
- OCR accuracy depends on image quality; sharp, high-contrast images with clearly legible text give the best results
- Complex layouts (tables, multi-column pages) may require post-processing of the extracted text
- Simplified Chinese recognition requires the chi_sim language data (the tesseract-ocr-chi-sim package on Ubuntu/Debian)
- Language packs are installed separately from the tesseract engine itself
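One simple form of the post-processing mentioned above is whitespace cleanup of the raw text. The sketch below strips trailing whitespace (including the form feed that tesseract typically emits at the end of a page) and collapses runs of blank lines into one. It does not attempt to rebuild tables or columns, and `clean_ocr` is a hypothetical name.

```shell
# clean_ocr: read OCR text on stdin, write tidied text on stdout.
# Hypothetical helper built from standard sed and awk only.
clean_ocr() {
  # sed removes trailing whitespace; awk prints non-blank lines and
  # only the first line of each run of blank lines.
  sed 's/[[:space:]]*$//' |
    awk 'NF { blank = 0; print; next } !blank++ { print }'
}
```

It can be placed at the end of a pipeline, for example `tesseract image.png stdout -l chi_sim+eng | clean_ocr`.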
Installation
openclaw install tesseract-ocr