SkillHub ClubShip Full StackFull Stack

skillbench

Track skill versions, benchmark performance, compare improvements, and get self-improvement signals. Integrates with tasktime and ClawVault.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars

3,126

Hot score

Updated

March 20, 2026

Overall rating

C0.0

Composite score

0.0

Best-practice grade

B84.0

Install command

npx @skill-hub/cli install openclaw-skills-skillbench

Repository

openclaw/skills

Skill path: skills/g9pedro/skillbench

Track skill versions, benchmark performance, compare improvements, and get self-improvement signals. Integrates with tasktime and ClawVault.

Open repository

Best for

Primary workflow: Ship Full Stack.

Technical facets: Full Stack.

Target audience: everyone.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: openclaw.

This is still a mirrored public skill entry. Review the repository before installing into production workflows.

What it helps with

Install skillbench into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
Review https://github.com/openclaw/skills before adding skillbench to shared team environments
Use skillbench for development workflows

Works across

Claude CodeCodex CLIGemini CLIOpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: skillbench
description: Track skill versions, benchmark performance, compare improvements, and get self-improvement signals. Integrates with tasktime and ClawVault.
metadata:
  openclaw:
    requires:
      bins: [skillbench]
    install:
      - id: node
        kind: node
        package: "@versatly/skillbench"
        bins: [skillbench]
        label: Install SkillBench CLI (npm)
---

# skillbench Skill

**Self-improving skill ecosystem for AI agents.**

Track skill versions, benchmark performance, compare improvements, and get signals on what to fix next.

**Part of the [ClawVault](https://clawvault.dev) ecosystem** | [tasktime](https://clawhub.com/skills/tasktime) | [ClawHub](https://clawhub.com)

## Installation

```bash
npm install -g @versatly/skillbench
```

## The Loop

```
1. Use a skill    → skillbench use [email protected]
2. Do the task    → tt start "Create PR" && ... && tt stop
3. Record result  → skillbench record "Create PR" --success
4. Check scores   → skillbench score github
5. Improve skill  → Update skill, bump version
6. Repeat         → Compare v1.0.0 vs v1.1.0
```

## Commands

### Track Skills
```bash
skillbench use [email protected]            # Set active skill version
skillbench skills                       # List tracked skills + signals
```

### Record Benchmarks
```bash
# Auto-pulls duration from tasktime
skillbench record "Create PR" --success

# Manual duration
skillbench record "Create PR" --duration 45s --success

# Record failures
skillbench record "Create PR" --fail --error-type "auth-error"
```

### Score & Compare
```bash
skillbench score                        # All skills with grades
skillbench score github                 # Single skill
skillbench compare [email protected] [email protected]
```

### Export & Dashboard
```bash
skillbench export --format markdown
skillbench export --format json
skillbench dashboard                    # Generate HTML dashboard
skillbench dashboard --open             # Generate and open in browser
```

### Automated Testing
```bash
skillbench test [email protected]          # Run smoke test
skillbench test [email protected] --suite full  # Run named suite
skillbench test [email protected] --dry-run     # Test without recording
```

### Sync
```bash
skillbench sync --clawhub               # Import installed skills
skillbench sync --vault                 # Sync to ClawVault
skillbench sync --all                   # Everything
```

### Health & Monitoring
```bash
skillbench health                       # Overall health report with alerts
skillbench watch --once                 # Run all test suites once
skillbench watch --interval 300         # Continuous monitoring every 5 min
```

### Analysis & Improvement
```bash
skillbench improve                      # Get suggestions for weakest skill
skillbench improve github               # Improvement plan for specific skill
skillbench trend tasktime --days 30     # Performance trend over time
skillbench leaderboard                  # Compare agents (multi-agent setups)
skillbench schedule --interval 60       # Generate cron config for auto-testing
```

### Baselines & Regression Detection
```bash
skillbench baseline tasktime --set      # Set baseline from current performance
skillbench baseline --list              # List all baselines
skillbench baseline --check             # Check all baselines (CI-friendly, exits 1 if failing)
skillbench baseline tasktime --remove   # Remove a baseline
```

### CI/CD Integration
```bash
skillbench ci                           # Run all tests + baseline checks
skillbench ci --json                    # JSON output for automation
skillbench badge                        # Generate shields.io badges for README
```

Copy `examples/github-action.yml` for ready-to-use GitHub Actions workflow.

## Grading System

| Grade | Score | Meaning |
|-------|-------|---------|
| 🏆 A+ | 95-100 | Elite performance |
| ✅ A | 85-94 | Excellent |
| 👍 B | 70-84 | Good |
| ⚠️ C | 50-69 | Needs work |
| ❌ D | <50 | Broken |

Based on: Success Rate (40%), Avg Duration (30%), Consistency (20%), Trend (10%)

## tasktime Integration

When you omit `--duration`, skillbench pulls from [tasktime](https://clawhub.com/skills/tasktime):

```bash
tt start "Create PR" -c git
# ... do work ...
tt stop
skillbench record --success   # Duration auto-pulled
```

## ClawVault Integration

Benchmarks sync to [ClawVault](https://clawvault.dev) automatically.

## Improvement Signals

`skillbench skills` shows:
- ⚠️ **needs work** — Success rate below 70%
- 🕐 **stale** — No benchmarks in 7+ days
- ↘️ **declining** — Getting worse over time

## Related

- [ClawVault](https://clawvault.dev) — Memory system for AI agents
- [tasktime](https://clawhub.com/skills/tasktime) — Task timer CLI
- [ClawHub](https://clawhub.com) — Skill marketplace


---

## Skill Companion Files

> Additional files collected from the skill directory layout.

### _meta.json

```json
{
  "owner": "g9pedro",
  "slug": "skillbench",
  "displayName": "SkillBench",
  "latest": {
    "version": "2.0.0",
    "publishedAt": 1770744521286,
    "commit": "https://github.com/openclaw/skills/commit/5c9cf5f07898e1e05229817279637e2edb2d898a"
  },
  "history": [
    {
      "version": "1.7.0",
      "publishedAt": 1770711955253,
      "commit": "https://github.com/openclaw/skills/commit/2cffc152a5e04bd128c3e0df2cc0eea6fefdff62"
    },
    {
      "version": "1.0.0",
      "publishedAt": 1770710161839,
      "commit": "https://github.com/openclaw/skills/commit/625f0da79faa2ccfe072fe0034dd3dc76c16085b"
    }
  ]
}

```