ocr
Extract text from images using Tesseract OCR
Packaged view
This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.
Stars
1,021
Hot score
99
Updated
March 20, 2026
Overall rating
C (6.1)
Composite score
6.1
Best-practice grade
B (81.2)
Install command
npx @skill-hub/cli install trpc-group-trpc-agent-go-ocr
Repository
trpc-group/trpc-agent-go
Skill path: examples/skill/skills/ocr
Best for
Primary workflow: Ship Full Stack.
Technical facets: Full Stack.
Target audience: everyone.
License: Unknown.
Original source
Catalog source: SkillHub Club.
Repository owner: trpc-group.
This is still a mirrored public skill entry. Review the repository before installing into production workflows.
What it helps with
- Install ocr into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/trpc-group/trpc-agent-go before adding ocr to shared team environments
- Use ocr for development workflows
Works across
Claude Code, Codex CLI, Gemini CLI, OpenCode
Favorites: 0.
Sub-skills: 0.
Aggregator: No.
Original source / Raw SKILL.md
---
name: ocr
description: Extract text from images using Tesseract OCR
---
# OCR Image Text Extraction Skill
Extract text from images using the Tesseract OCR engine.
## Capabilities
- Extract text from image files (PNG, JPG, JPEG, GIF, BMP, TIFF, WEBP)
- Support for 100+ languages
- Optional image preprocessing for better accuracy
- Output in plain text or JSON format with confidence scores
## Usage
### Basic OCR
```bash
python3 scripts/ocr.py <image_file> <output_file>
```
### With Options
```bash
# Specify language (default: eng)
python3 scripts/ocr.py image.png text.txt --lang eng
# Chinese text
python3 scripts/ocr.py image.png text.txt --lang chi_sim
# Multiple languages
python3 scripts/ocr.py image.png text.txt --lang eng+chi_sim
# With image preprocessing (improves accuracy)
python3 scripts/ocr.py image.png text.txt --preprocess
# JSON output with confidence scores
python3 scripts/ocr.py image.png output.json --format json
```
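When the script is called from another Python program rather than a shell, the flags above map onto an argument list. The following is a minimal sketch, assuming the script is reachable at `scripts/ocr.py` relative to the working directory; `build_ocr_command` and `run_ocr` are hypothetical helpers, not part of the skill.

```python
import subprocess
import sys


def build_ocr_command(image_file, output_file, langs=("eng",), preprocess=False, fmt="text"):
    """Assemble the argument list for scripts/ocr.py (hypothetical helper)."""
    cmd = [sys.executable, "scripts/ocr.py", image_file, output_file]
    # Multiple languages are joined with "+", e.g. "eng+chi_sim"
    cmd += ["--lang", "+".join(langs)]
    if preprocess:
        cmd.append("--preprocess")
    if fmt != "text":
        cmd += ["--format", fmt]
    return cmd


def run_ocr(image_file, output_file, **options):
    """Run the script; raises CalledProcessError if it exits with a non-zero status."""
    return subprocess.run(build_ocr_command(image_file, output_file, **options), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.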
### Download and OCR from URL
```bash
# OCR from remote image
python3 scripts/ocr_url.py <image_url> <output_file>
# With options
python3 scripts/ocr_url.py https://example.com/image.jpg text.txt --lang eng --preprocess
```
## Parameters
- `image_file` / `image_url` (required): Path to local image or image URL
- `output_file` (required): Path to output text/JSON file
- `--lang`: Language code (e.g., eng, chi_sim, jpn, fra, deu). Default: eng
- `--preprocess`: Apply image preprocessing (grayscale conversion, contrast enhancement, sharpening) for better accuracy
- `--format`: Output format (text/json, default: text)
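With `--format json`, `scripts/ocr.py` writes an object with `text`, `language`, `confidence`, and `image_path` keys. A minimal sketch of consuming that file follows; `load_ocr_result` is a hypothetical helper, and the threshold of 60 is an arbitrary illustration rather than a value defined by the skill.

```python
import json


def load_ocr_result(path, min_confidence=60.0):
    """Load a JSON result file and flag low-confidence extractions.

    min_confidence is an illustrative threshold, not one defined by the skill.
    """
    with open(path, encoding="utf-8") as f:
        result = json.load(f)
    # "confidence" is the mean per-word confidence reported by Tesseract (0-100)
    result["low_confidence"] = result["confidence"] < min_confidence
    return result
```

A caller might rerun the extraction with `--preprocess` when `low_confidence` is true.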
## Common Languages
| Language | Code |
|----------|------|
| English | eng |
| Chinese (Simplified) | chi_sim |
| Chinese (Traditional) | chi_tra |
| Japanese | jpn |
| Korean | kor |
| French | fra |
| German | deu |
| Spanish | spa |
| Russian | rus |
| Arabic | ara |
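
A code from the table only works if the matching Tesseract language data is installed; `tesseract --list-langs` prints what is available. The sketch below checks a `--lang` value up front. The helper names are hypothetical, and the parsing assumes the CLI prints one header line followed by one language code per line.

```python
import subprocess


def parse_list_langs(output):
    """Parse `tesseract --list-langs` output: a header line, then one code per line."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return set(lines[1:])


def installed_languages():
    """Ask the tesseract CLI which language packs are available."""
    proc = subprocess.run(["tesseract", "--list-langs"],
                          capture_output=True, text=True, check=True)
    # Fall back to stderr in case a Tesseract release prints the list there
    return parse_list_langs(proc.stdout or proc.stderr)


def missing_languages(lang_arg, installed):
    """Return the codes in a --lang value (e.g. "eng+chi_sim") that are not installed."""
    return [code for code in lang_arg.split("+") if code not in installed]
```

For example, `missing_languages("eng+jpn", installed_languages())` returns `["jpn"]` on a machine that has only the English data.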
## Supported Image Formats
PNG, JPG, JPEG, GIF, BMP, TIFF, WEBP
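
The scripts themselves do not check file extensions; Pillow identifies the format from the file contents. When collecting files for batch processing, an extension filter built from the list above can act as a cheap first pass. This is a sketch only, and `is_supported_image` and `find_images` are hypothetical helpers.

```python
from pathlib import Path

# Extensions taken from the format list above
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff", ".webp"}


def is_supported_image(path):
    """Case-insensitive extension check; it does not inspect file contents."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def find_images(directory):
    """Return supported image files directly inside a directory, sorted by name."""
    return sorted(p for p in Path(directory).iterdir()
                  if p.is_file() and is_supported_image(p))
```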
## Dependencies
- Python 3.8+
- pytesseract
- Pillow (PIL)
- requests (needed only by `scripts/ocr_url.py`)
- tesseract-ocr (system package)
## Installation
```bash
# Python packages
pip install pytesseract Pillow requests
# Tesseract OCR engine
sudo apt-get install tesseract-ocr # Ubuntu/Debian
sudo yum install tesseract # CentOS/RHEL
brew install tesseract # macOS
```
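
After installing, a quick check can confirm that both the system binary and the Python packages are visible from the current environment. The sketch below uses only the standard library; `check_dependencies` is a hypothetical helper. It checks for presence only, not for versions or language data.

```python
import importlib.util
import shutil


def check_dependencies():
    """Report which OCR dependencies are visible from this Python environment."""
    return {
        # System binary installed via apt-get, yum, or brew
        "tesseract": shutil.which("tesseract") is not None,
        # Python packages; Pillow is imported under the name "PIL"
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        "Pillow": importlib.util.find_spec("PIL") is not None,
        # Imported only by scripts/ocr_url.py
        "requests": importlib.util.find_spec("requests") is not None,
    }


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```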
---
## Referenced Files
> The following files are referenced in this skill and included for context.
### scripts/ocr.py
```python
#!/usr/bin/env python3
"""
OCR Script for Image Text Extraction
Extract text from images using Tesseract OCR
"""
import argparse
import json
import os
import sys

try:
    import pytesseract
    from PIL import Image, ImageEnhance, ImageFilter
except ImportError as e:
    print(f"Error: Required package not installed: {e}", file=sys.stderr)
    print("Run: pip install pytesseract Pillow", file=sys.stderr)
    sys.exit(1)


def preprocess_image(image):
    """Apply preprocessing to improve OCR accuracy"""
    # Convert to grayscale
    image = image.convert('L')
    # Enhance contrast
    enhancer = ImageEnhance.Contrast(image)
    image = enhancer.enhance(2.0)
    # Apply sharpening
    image = image.filter(ImageFilter.SHARPEN)
    return image


def ocr_image(image_path, output_path, lang="eng", preprocess=False, output_format="text"):
    """Extract text from image using OCR"""
    # Validate input file
    if not os.path.exists(image_path):
        print(f"Error: Image file not found: {image_path}", file=sys.stderr)
        sys.exit(1)

    try:
        # Load image
        print(f"Loading image: {image_path}...", file=sys.stderr)
        image = Image.open(image_path)

        # Preprocess if requested
        if preprocess:
            print("Applying image preprocessing...", file=sys.stderr)
            image = preprocess_image(image)

        # Perform OCR
        print(f"Extracting text (language: {lang})...", file=sys.stderr)
        if output_format == "json":
            # Get detailed data with confidence scores
            data = pytesseract.image_to_data(image, lang=lang, output_type=pytesseract.Output.DICT)

            # Keep only the non-empty words
            text_parts = [text for text in data['text'] if text.strip()]
            full_text = " ".join(text_parts)

            # A confidence of -1 marks a box without recognized text. float() also
            # covers pytesseract versions that return the values as strings.
            confs = [float(c) for c in data['conf'] if float(c) != -1]
            avg_conf = sum(confs) / max(len(confs), 1)

            output_data = {
                "text": full_text.strip(),
                "language": lang,
                "confidence": round(avg_conf, 2),
                "image_path": image_path
            }
            with open(output_path, "w", encoding="utf-8") as f:
                json.dump(output_data, f, ensure_ascii=False, indent=2)
        else:
            # Plain text output
            text = pytesseract.image_to_string(image, lang=lang)
            with open(output_path, "w", encoding="utf-8") as f:
                f.write(text.strip())

        print("✓ Text extracted successfully", file=sys.stderr)
        print(f" Output saved to: {output_path}", file=sys.stderr)
    except Exception as e:
        print(f"Error during OCR processing: {e}", file=sys.stderr)
        sys.exit(1)


def main():
    parser = argparse.ArgumentParser(description="Extract text from images using OCR")
    parser.add_argument("image_file", help="Input image file path")
    parser.add_argument("output_file", help="Output text/JSON file path")
    parser.add_argument("--lang", default="eng",
                        help="Language code (e.g., eng, chi_sim, jpn). Default: eng")
    parser.add_argument("--preprocess", action="store_true",
                        help="Apply image preprocessing for better accuracy")
    parser.add_argument("--format", default="text", choices=["text", "json"],
                        help="Output format (default: text)")
    args = parser.parse_args()

    ocr_image(
        args.image_file,
        args.output_file,
        lang=args.lang,
        preprocess=args.preprocess,
        output_format=args.format
    )


if __name__ == "__main__":
    main()
```
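
The JSON branch of the script above reports a single confidence value: the mean of Tesseract's per-box scores, skipping the `-1` entries that mark boxes without recognized text. The worked example below reproduces that arithmetic on a hand-written dictionary shaped like the `image_to_data` result. The sample values are made up for illustration, and the `float()` conversion is a defensive assumption for versions that return confidences as strings.

```python
def average_confidence(data):
    """Mean of the valid per-box confidences, or 0.0 when there are none."""
    confs = [float(c) for c in data["conf"] if float(c) != -1]
    return round(sum(confs) / max(len(confs), 1), 2)


def join_words(data):
    """Join the non-empty words in reading order, as the script does."""
    return " ".join(t for t in data["text"] if t.strip())


# Made-up sample: two layout boxes (-1) followed by three recognized words
sample = {
    "text": ["", "", "Hello", "OCR", "world"],
    "conf": [-1, -1, 96, 88, 92],
}

print(join_words(sample))          # Hello OCR world
print(average_confidence(sample))  # 92.0
```

Only the three word boxes count toward the mean: (96 + 88 + 92) / 3 = 92.0.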
### scripts/ocr_url.py
```python
#!/usr/bin/env python3
"""
OCR Script for Remote Images
Download and extract text from images hosted on the web
"""
import argparse
import json
import os
import sys
import tempfile

try:
    import pytesseract
    from PIL import Image, ImageEnhance, ImageFilter
    import requests
except ImportError as e:
    print(f"Error: Required package not installed: {e}", file=sys.stderr)
    print("Run: pip install pytesseract Pillow requests", file=sys.stderr)
    sys.exit(1)


def preprocess_image(image):
    """Apply preprocessing to improve OCR accuracy"""
    # Grayscale, then contrast boost, then sharpening
    image = image.convert('L')
    enhancer = ImageEnhance.Contrast(image)
    image = enhancer.enhance(2.0)
    image = image.filter(ImageFilter.SHARPEN)
    return image


def ocr_from_url(image_url, output_path, lang="eng", preprocess=False, output_format="text"):
    """Download image from URL and extract text using OCR"""
    tmp_path = None
    try:
        # Download image
        print(f"Downloading image from: {image_url}...", file=sys.stderr)
        response = requests.get(image_url, timeout=30, stream=True)
        response.raise_for_status()

        # Save to temporary file
        with tempfile.NamedTemporaryFile(delete=False, suffix=".img") as tmp_file:
            tmp_path = tmp_file.name
            for chunk in response.iter_content(chunk_size=8192):
                tmp_file.write(chunk)
        print(f"Image downloaded to: {tmp_path}", file=sys.stderr)

        # Load image; the with-block closes the file handle before cleanup
        with Image.open(tmp_path) as source:
            image = source
            # Preprocess if requested
            if preprocess:
                print("Applying image preprocessing...", file=sys.stderr)
                image = preprocess_image(image)

            # Perform OCR
            print(f"Extracting text (language: {lang})...", file=sys.stderr)
            if output_format == "json":
                # Same JSON layout as scripts/ocr.py, with image_url in place of image_path
                data = pytesseract.image_to_data(image, lang=lang, output_type=pytesseract.Output.DICT)
                confs = [float(c) for c in data['conf'] if float(c) != -1]
                output_data = {
                    "text": " ".join(t for t in data['text'] if t.strip()),
                    "language": lang,
                    "confidence": round(sum(confs) / max(len(confs), 1), 2),
                    "image_url": image_url
                }
            else:
                text = pytesseract.image_to_string(image, lang=lang).strip()

        # Write output
        with open(output_path, "w", encoding="utf-8") as f:
            if output_format == "json":
                json.dump(output_data, f, ensure_ascii=False, indent=2)
            else:
                f.write(text)

        print("✓ Text extracted successfully", file=sys.stderr)
        print(f" Output saved to: {output_path}", file=sys.stderr)
    except requests.RequestException as e:
        print(f"Error downloading image: {e}", file=sys.stderr)
        sys.exit(1)
    except Exception as e:
        print(f"Error during OCR processing: {e}", file=sys.stderr)
        sys.exit(1)
    finally:
        # Clean up the temp file even when the download or OCR step fails
        if tmp_path and os.path.exists(tmp_path):
            os.unlink(tmp_path)


def main():
    parser = argparse.ArgumentParser(description="Extract text from remote images using OCR")
    parser.add_argument("image_url", help="Image URL")
    parser.add_argument("output_file", help="Output text/JSON file path")
    parser.add_argument("--lang", default="eng",
                        help="Language code (e.g., eng, chi_sim, jpn). Default: eng")
    parser.add_argument("--preprocess", action="store_true",
                        help="Apply image preprocessing for better accuracy")
    parser.add_argument("--format", default="text", choices=["text", "json"],
                        help="Output format (default: text)")
    args = parser.parse_args()

    ocr_from_url(
        args.image_url,
        args.output_file,
        lang=args.lang,
        preprocess=args.preprocess,
        output_format=args.format
    )


if __name__ == "__main__":
    main()
```
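
The download step in the script above writes the response to a temporary file and deletes it once OCR is done. One way to guarantee the deletion even when a later step raises is a context manager. The sketch below uses only the standard library and accepts any iterable of byte chunks, such as the result of `response.iter_content(...)`; `temp_image_file` is a hypothetical name.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_image_file(chunks, suffix=".img"):
    """Write byte chunks to a temp file, yield its path, and always remove it."""
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=suffix)
    try:
        with tmp:
            for chunk in chunks:
                tmp.write(chunk)
        # The file is closed at this point, so other libraries can open it by path
        yield tmp.name
    finally:
        if os.path.exists(tmp.name):
            os.unlink(tmp.name)


# Stand-in chunks instead of a real HTTP response
with temp_image_file([b"abc", b"def"]) as path:
    print(os.path.getsize(path))  # 6
```

Closing the file before yielding matters on platforms that refuse to open or delete a file that is still held open by another handle.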