
local-whisper

Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with multiple model sizes.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars: 3,051.

Hot score: 99.

Updated: March 20, 2026.

Overall rating: C (4.0).

Composite score: 4.0.

Best-practice grade: B (84.0).

Install command

npx @skill-hub/cli install openclaw-skills-local-whisper

Repository

openclaw/skills

Skill path: skills/araa47/local-whisper



Best for

Primary workflow: Ship Full Stack.

Technical facets: Full Stack.

Target audience: everyone.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: openclaw.

This is a mirrored public skill entry. Review the repository before installing it into production workflows.

What it helps with

  • Install local-whisper into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/openclaw/skills before adding local-whisper to shared team environments
  • Use local-whisper for development workflows

Works across

Claude Code, Codex CLI, Gemini CLI, OpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: local-whisper
description: Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with multiple model sizes.
metadata: {"clawdbot":{"emoji":"πŸŽ™οΈ","requires":{"bins":["ffmpeg"]}}}
---

# Local Whisper STT

Local speech-to-text using OpenAI's Whisper. **Fully offline** after initial model download.

## Usage

```bash
# Basic
~/.clawdbot/skills/local-whisper/scripts/local-whisper audio.wav

# Better model
~/.clawdbot/skills/local-whisper/scripts/local-whisper audio.wav --model turbo

# With timestamps
~/.clawdbot/skills/local-whisper/scripts/local-whisper audio.wav --timestamps --json
```

## Models

| Model | Size | Notes |
|-------|------|-------|
| `tiny` | 39M | Fastest |
| `base` | 74M | **Default** |
| `small` | 244M | Good balance |
| `turbo` | 809M | Best speed/quality |
| `large-v3` | 1.5GB | Maximum accuracy |
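
As a rough guide, the table above can be turned into a lookup for picking the largest model that fits a download budget. This is a sketch only: the sizes are the approximate download sizes listed above, and the helper name is made up for illustration.

```python
# Approximate download sizes in MB, taken from the model table above.
MODEL_SIZES_MB = {
    "tiny": 39,
    "base": 74,
    "small": 244,
    "turbo": 809,
    "large-v3": 1536,  # listed as 1.5GB
}


def largest_model_within(budget_mb):
    """Return the largest model whose download fits the budget, or None."""
    fitting = [(mb, name) for name, mb in MODEL_SIZES_MB.items() if mb <= budget_mb]
    return max(fitting)[1] if fitting else None
```

For example, `largest_model_within(300)` picks `small`, while anything above 1536 MB allows `large-v3`.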

## Options

- `--model/-m` β€” Model size (default: base)
- `--language/-l` β€” Language code (auto-detect if omitted)
- `--timestamps/-t` β€” Include word timestamps
- `--json/-j` β€” JSON output
- `--quiet/-q` β€” Suppress progress
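
With `--json`, the output is a single JSON object: a `text` field, the detected `language`, and a `segments` array when `--timestamps` is also set (see `scripts/transcribe.py` below). A minimal consumer sketch, using an illustrative payload rather than real transcription output:

```python
import json

# Illustrative payload in the shape produced by `--json --timestamps`.
payload = """{
  "text": "hello world",
  "language": "en",
  "segments": [{"start": 0.0, "end": 1.2, "text": "hello world"}]
}"""

result = json.loads(payload)
print(result["text"])  # full transcript
for seg in result.get("segments", []):
    print(f"[{seg['start']:.2f}s - {seg['end']:.2f}s] {seg['text']}")
```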

## Setup

Uses a uv-managed venv at `.venv/`. To reinstall:
```bash
cd ~/.clawdbot/skills/local-whisper
uv venv .venv --python 3.12
uv pip install --python .venv/bin/python click openai-whisper torch --index-url https://download.pytorch.org/whl/cpu
```


---

## Skill Companion Files

> Additional files collected from the skill directory layout.

### _meta.json

```json
{
  "owner": "araa47",
  "slug": "local-whisper",
  "displayName": "Local Whisper",
  "latest": {
    "version": "1.0.0",
    "publishedAt": 1769159934671,
    "commit": "https://github.com/clawdbot/skills/commit/1e24c04d50efd58d32a114d979970065bf09bf42"
  },
  "history": []
}

```

### scripts/transcribe.py

```python
#!/usr/bin/env python3
"""Local speech-to-text using OpenAI Whisper (runs offline after model download)."""

import json
import sys
import warnings

import click

warnings.filterwarnings("ignore")

MODELS = ["tiny", "tiny.en", "base", "base.en", "small", "small.en",
          "medium", "medium.en", "large-v3", "turbo"]


@click.command()
@click.argument("audio_file", type=click.Path(exists=True))
@click.option("-m", "--model", default="base", type=click.Choice(MODELS), help="Whisper model size")
@click.option("-l", "--language", default=None, help="Language code (auto-detect if omitted)")
@click.option("-t", "--timestamps", is_flag=True, help="Include word-level timestamps")
@click.option("-j", "--json", "as_json", is_flag=True, help="Output as JSON")
@click.option("-q", "--quiet", is_flag=True, help="Suppress progress messages")
def main(audio_file, model, language, timestamps, as_json, quiet):
    """Transcribe audio using OpenAI Whisper (local)."""
    try:
        import whisper
    except ImportError:
        click.echo("Error: openai-whisper not installed", err=True)
        sys.exit(1)

    if not quiet:
        click.echo(f"Loading model: {model}...", err=True)

    try:
        whisper_model = whisper.load_model(model)
    except Exception as e:
        click.echo(f"Error loading model: {e}", err=True)
        sys.exit(1)

    if not quiet:
        click.echo(f"Transcribing: {audio_file}...", err=True)

    try:
        result = whisper_model.transcribe(audio_file, language=language,
                                          word_timestamps=timestamps, verbose=False)
    except Exception as e:
        click.echo(f"Error transcribing: {e}", err=True)
        sys.exit(1)

    text = result["text"].strip()

    if as_json:
        output = {"text": text, "language": result.get("language", "unknown")}
        if timestamps and "segments" in result:
            output["segments"] = [
                {"start": s["start"], "end": s["end"], "text": s["text"],
                 **({"words": s["words"]} if "words" in s else {})}
                for s in result["segments"]
            ]
        click.echo(json.dumps(output, indent=2, ensure_ascii=False))
    else:
        click.echo(text)
        if timestamps and "segments" in result:
            click.echo("\n--- Segments ---", err=True)
            for seg in result["segments"]:
                click.echo(f"  [{seg['start']:.2f}s - {seg['end']:.2f}s]: {seg['text']}", err=True)


if __name__ == "__main__":
    main()

```
