local-whisper
Local speech-to-text using OpenAI Whisper. Runs fully offline after the model download. High-quality transcription with multiple model sizes.
Packaged view
This page reorganizes the original catalog entry to lead with fit, installability, and workflow context. The original raw source appears below.
Install command
npx @skill-hub/cli install openclaw-skills-local-whisper
Repository
Skill path: skills/araa47/local-whisper
Best for
Primary workflow: Ship Full Stack.
Technical facets: Full Stack.
Target audience: everyone.
License: Unknown.
Original source
Catalog source: SkillHub Club.
Repository owner: openclaw.
This is a mirrored public skill entry. Review the repository before installing it into production workflows.
What it helps with
- Install local-whisper into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/openclaw/skills before adding local-whisper to shared team environments
- Use local-whisper for development workflows
Works across
Favorites: 0.
Sub-skills: 0.
Aggregator: No.
Original source / Raw SKILL.md
---
name: local-whisper
description: Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with multiple model sizes.
metadata: {"clawdbot":{"emoji":"🎙️","requires":{"bins":["ffmpeg"]}}}
---
# Local Whisper STT
Local speech-to-text using OpenAI's Whisper. **Fully offline** after initial model download.
## Usage
```bash
# Basic
~/.clawdbot/skills/local-whisper/scripts/local-whisper audio.wav
# Better model
~/.clawdbot/skills/local-whisper/scripts/local-whisper audio.wav --model turbo
# With timestamps
~/.clawdbot/skills/local-whisper/scripts/local-whisper audio.wav --timestamps --json
```
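The `--json` output can be consumed by other tooling. A minimal sketch, assuming the payload shape that `scripts/transcribe.py` (reproduced below) emits with `--json --timestamps` — the sample payload here is illustrative, not real transcription output:

```python
import json

# Illustrative payload matching the shape emitted with --json --timestamps:
# a top-level "text" and "language", plus per-segment start/end times.
payload = """{
  "text": "hello world",
  "language": "en",
  "segments": [
    {"start": 0.0, "end": 1.2, "text": " hello"},
    {"start": 1.2, "end": 2.5, "text": " world"}
  ]
}"""

result = json.loads(payload)
print(result["language"])  # detected language code, e.g. "en"
for seg in result.get("segments", []):
    duration = seg["end"] - seg["start"]
    print(f"{duration:.2f}s: {seg['text'].strip()}")
```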
## Models
| Model | Size | Notes |
|-------|------|-------|
| `tiny` | 39M | Fastest |
| `base` | 74M | **Default** |
| `small` | 244M | Good balance |
| `turbo` | 809M | Best speed/quality |
| `large-v3` | 1.5GB | Maximum accuracy |
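If you want to choose a model programmatically from the table above, a hypothetical helper (not part of the skill) might pick the largest model that fits a download budget:

```python
# Approximate download sizes in MB, taken from the models table above.
MODEL_SIZES_MB = {"tiny": 39, "base": 74, "small": 244, "turbo": 809, "large-v3": 1500}

def pick_model(budget_mb: int) -> str:
    """Return the largest model that fits the budget; fall back to 'tiny'."""
    fitting = [m for m, size in MODEL_SIZES_MB.items() if size <= budget_mb]
    return max(fitting, key=MODEL_SIZES_MB.get) if fitting else "tiny"

print(pick_model(300))   # → small
```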
## Options
- `--model/-m` – Model size (default: base)
- `--language/-l` – Language code (auto-detect if omitted)
- `--timestamps/-t` – Include word timestamps
- `--json/-j` – JSON output
- `--quiet/-q` – Suppress progress
## Setup
Uses a uv-managed venv at `.venv/`. To reinstall:
```bash
cd ~/.clawdbot/skills/local-whisper
uv venv .venv --python 3.12
uv pip install --python .venv/bin/python click openai-whisper torch --index-url https://download.pytorch.org/whl/cpu
```
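To sanity-check the reinstall, one option is a quick import probe run with `.venv/bin/python`. A sketch using only the stdlib; the package names come from the `uv pip install` line above:

```python
import importlib.util

# Report whether each package installed into the venv is importable.
# Note: openai-whisper's import name is "whisper", not "openai-whisper".
for pkg in ("click", "whisper", "torch"):
    status = "OK" if importlib.util.find_spec(pkg) else "MISSING"
    print(f"{pkg}: {status}")
```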
---
## Skill Companion Files
> Additional files collected from the skill directory layout.
### _meta.json
```json
{
"owner": "araa47",
"slug": "local-whisper",
"displayName": "Local Whisper",
"latest": {
"version": "1.0.0",
"publishedAt": 1769159934671,
"commit": "https://github.com/clawdbot/skills/commit/1e24c04d50efd58d32a114d979970065bf09bf42"
},
"history": []
}
```
### scripts/transcribe.py
```python
#!/usr/bin/env python3
"""Local speech-to-text using OpenAI Whisper (runs offline after model download)."""
import json
import sys
import warnings

import click

warnings.filterwarnings("ignore")

MODELS = ["tiny", "tiny.en", "base", "base.en", "small", "small.en",
          "medium", "medium.en", "large-v3", "turbo"]


@click.command()
@click.argument("audio_file", type=click.Path(exists=True))
@click.option("-m", "--model", default="base", type=click.Choice(MODELS), help="Whisper model size")
@click.option("-l", "--language", default=None, help="Language code (auto-detect if omitted)")
@click.option("-t", "--timestamps", is_flag=True, help="Include word-level timestamps")
@click.option("-j", "--json", "as_json", is_flag=True, help="Output as JSON")
@click.option("-q", "--quiet", is_flag=True, help="Suppress progress messages")
def main(audio_file, model, language, timestamps, as_json, quiet):
    """Transcribe audio using OpenAI Whisper (local)."""
    try:
        import whisper
    except ImportError:
        click.echo("Error: openai-whisper not installed", err=True)
        sys.exit(1)

    if not quiet:
        click.echo(f"Loading model: {model}...", err=True)
    try:
        whisper_model = whisper.load_model(model)
    except Exception as e:
        click.echo(f"Error loading model: {e}", err=True)
        sys.exit(1)

    if not quiet:
        click.echo(f"Transcribing: {audio_file}...", err=True)
    try:
        result = whisper_model.transcribe(audio_file, language=language,
                                          word_timestamps=timestamps, verbose=False)
    except Exception as e:
        click.echo(f"Error transcribing: {e}", err=True)
        sys.exit(1)

    text = result["text"].strip()
    if as_json:
        output = {"text": text, "language": result.get("language", "unknown")}
        if timestamps and "segments" in result:
            output["segments"] = [
                {"start": s["start"], "end": s["end"], "text": s["text"],
                 **({"words": s["words"]} if "words" in s else {})}
                for s in result["segments"]
            ]
        click.echo(json.dumps(output, indent=2, ensure_ascii=False))
    else:
        click.echo(text)
        if timestamps and "segments" in result:
            click.echo("\n--- Segments ---", err=True)
            for seg in result["segments"]:
                click.echo(f"  [{seg['start']:.2f}s - {seg['end']:.2f}s]: {seg['text']}", err=True)


if __name__ == "__main__":
    main()
```
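If you drive the script from other tooling, building the argv in one place keeps the flags consistent with the click options defined above. A hypothetical wrapper, not part of the skill:

```python
import shlex

def build_cmd(audio_file, model="base", language=None, timestamps=False, as_json=False):
    """Build the argv for transcribe.py, mirroring its click flags."""
    cmd = ["python", "scripts/transcribe.py", audio_file, "--model", model]
    if language:
        cmd += ["--language", language]
    if timestamps:
        cmd.append("--timestamps")
    if as_json:
        cmd.append("--json")
    return cmd

# Printable form, e.g. for logging before subprocess.run(cmd):
print(shlex.join(build_cmd("audio.wav", model="turbo", as_json=True)))
# → python scripts/transcribe.py audio.wav --model turbo --json
```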