livekit-voice

LiveKit real-time voice and video infrastructure: create rooms, generate JWT access tokens, manage participants, and record sessions. Open source WebRTC for voice AI agents and real-time communication. Use for building voice agents, video rooms, or real-time audio.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars: 3,074.

Hot score: 99.

Updated: March 20, 2026.

Overall rating: C (4.0).

Composite score: 4.0.

Best-practice grade: A (92.0).

Install command

npx @skill-hub/cli install openclaw-skills-livekit-voice

Repository

openclaw/skills

Skill path: skills/aiwithabidi/livekit-voice

Best for

Primary workflow: Analyze Data & AI.

Technical facets: Full Stack, Data / AI.

Target audience: everyone.

License: MIT.

Original source

Catalog source: SkillHub Club.

Repository owner: openclaw.

This is still a mirrored public skill entry. Review the repository before installing into production workflows.

What it helps with

  • Install livekit-voice into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/openclaw/skills before adding livekit-voice to shared team environments
  • Use livekit-voice for development workflows

Works across

Claude Code, Codex CLI, Gemini CLI, OpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: livekit-voice
description: LiveKit real-time voice and video infrastructure - create rooms, generate JWT access tokens, manage participants, and record sessions. Open source WebRTC for voice AI agents and real-time communication. Use for building voice agents, video rooms, or real-time audio.
homepage: https://www.agxntsix.ai
license: MIT
compatibility: Python 3.10+, LiveKit Cloud or self-hosted
metadata: {"openclaw": {"emoji": "\ud83c\udfa7", "requires": {"env": ["LIVEKIT_API_KEY", "LIVEKIT_API_SECRET"]}, "primaryEnv": "LIVEKIT_API_KEY", "homepage": "https://www.agxntsix.ai"}}
---

# 🎧 LiveKit Voice

LiveKit real-time voice/video infrastructure for OpenClaw agents. Create rooms, generate tokens, manage participants, and integrate with voice AI platforms.

## What is LiveKit?

[LiveKit](https://livekit.io) is an open-source WebRTC infrastructure platform for building real-time audio/video applications. It powers voice AI agents, video conferencing, live streaming, and more.

**Self-hosted vs Cloud:**
- **LiveKit Cloud**: Managed service, no infrastructure to maintain
- **Self-hosted**: Deploy on your own servers via Docker/Kubernetes

## Requirements

| Variable | Required | Description |
|----------|----------|-------------|
| `LIVEKIT_API_KEY` | ✅ | LiveKit API key |
| `LIVEKIT_API_SECRET` | ✅ | LiveKit API secret |
| `LIVEKIT_URL` | ✅ | LiveKit server URL (e.g. `wss://your-project.livekit.cloud`) |
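
The bundled script exits early when any of these variables is unset, and it rewrites the WebSocket scheme of `LIVEKIT_URL` before making HTTP API calls. A minimal sketch of both checks follows; the helper names `missing_vars` and `to_http_url` are hypothetical and not part of the skill.

```python
import os

REQUIRED_VARS = ("LIVEKIT_API_KEY", "LIVEKIT_API_SECRET", "LIVEKIT_URL")


def missing_vars(env) -> list:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def to_http_url(livekit_url: str) -> str:
    """Map wss:// to https:// and ws:// to http://; leave other URLs unchanged."""
    if livekit_url.startswith("wss://"):
        return "https://" + livekit_url[len("wss://"):]
    if livekit_url.startswith("ws://"):
        return "http://" + livekit_url[len("ws://"):]
    return livekit_url


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("API base URL:", to_http_url(os.environ["LIVEKIT_URL"]))
```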

## Quick Start

```bash
# Create a room
python3 {baseDir}/scripts/livekit_api.py create-room my-room

# Create room with options
python3 {baseDir}/scripts/livekit_api.py create-room my-room --max-participants 10 --empty-timeout 300

# Generate access token for a participant
python3 {baseDir}/scripts/livekit_api.py token my-room --identity user123 --name "John"

# Generate token with specific grants
python3 {baseDir}/scripts/livekit_api.py token my-room --identity agent --can-publish --can-subscribe

# List active rooms
python3 {baseDir}/scripts/livekit_api.py list-rooms

# List participants in a room
python3 {baseDir}/scripts/livekit_api.py participants my-room

# Delete a room
python3 {baseDir}/scripts/livekit_api.py delete-room my-room

# Start recording (Egress)
python3 {baseDir}/scripts/livekit_api.py record my-room --output s3://bucket/recording.mp4
```

## Commands

### `create-room <name>`
Create a new LiveKit room.
- `--max-participants N`: limit participants
- `--empty-timeout N`: seconds before an empty room auto-closes (default 300)
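
In the bundled script these options become a JSON body posted to the `RoomService/CreateRoom` Twirp endpoint. The sketch below shows how that body and endpoint path are assembled; the function names are hypothetical, and the field names are taken from `scripts/livekit_api.py`.

```python
import json


def build_create_room_body(name, max_participants=None, empty_timeout=300):
    """Assemble the CreateRoom request body, omitting unset options."""
    body = {"name": name}
    if max_participants:
        body["maxParticipants"] = max_participants
    if empty_timeout:
        body["emptyTimeout"] = empty_timeout
    return body


def twirp_endpoint(http_url, service, method):
    """Build the Twirp path used by the script for a given service and method."""
    return f"{http_url}/twirp/livekit.{service}/{method}"


if __name__ == "__main__":
    # Placeholder host; substitute your own server URL.
    print(twirp_endpoint("https://your-project.livekit.cloud", "RoomService", "CreateRoom"))
    print(json.dumps(build_create_room_body("my-room", max_participants=10)))
```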

### `token <room>`
Generate a JWT access token for a participant.
- `--identity ID`: participant identity (required)
- `--name NAME`: display name
- `--can-publish`: allow publishing audio/video
- `--can-subscribe`: allow subscribing to others
- `--ttl N`: token TTL in seconds (default 3600)
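
To see what such a token carries, it can help to decode one locally. The sketch below signs a token the same way the bundled script does (HS256 over the base64url header and payload) and then verifies and decodes it. The key, secret, and the `make_token` and `verify_and_decode` names are placeholders for illustration; the check covers the signature only and does not validate `exp` or `nbf`.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(segment: str) -> bytes:
    # Restore the padding that JWT encoding strips.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def make_token(api_key, api_secret, identity, room, ttl=3600):
    """Sign a join token with the claim names used in scripts/livekit_api.py."""
    now = int(time.time())
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"iss": api_key, "nbf": now, "exp": now + ttl, "sub": identity,
               "video": {"room": room, "roomJoin": True}}
    signing_input = ".".join(
        b64url_encode(json.dumps(part).encode()) for part in (header, payload)
    )
    signature = hmac.new(api_secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{b64url_encode(signature)}"


def verify_and_decode(token, api_secret):
    """Check the HS256 signature and return the payload claims as a dict."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(api_secret.encode(), signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("signature does not match the given secret")
    return json.loads(b64url_decode(payload_b64))


if __name__ == "__main__":
    token = make_token("devkey", "devsecret", identity="user123", room="my-room")
    claims = verify_and_decode(token, "devsecret")
    print(claims["sub"], claims["video"])
```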

### `list-rooms`
List all active rooms with participant counts.

### `participants <room>`
List participants in a room with their connection state and tracks.
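
The script prints one line per participant from the `ListParticipants` response. A small sketch of that formatting is shown below, run against a hand-written sample; the sample values (including the `ACTIVE` state and the track entry) are illustrative assumptions, not captured server output.

```python
def summarize_participants(response: dict) -> list:
    """Format each participant as 'identity (name) | State: ... | Tracks: n'."""
    lines = []
    for p in response.get("participants", []):
        identity = p.get("identity", "?")
        name = p.get("name", "?")
        state = p.get("state", "?")
        track_count = len(p.get("tracks", []))
        lines.append(f"{identity} ({name}) | State: {state} | Tracks: {track_count}")
    return lines


# Illustrative sample only; real responses come from the LiveKit server.
sample = {
    "participants": [
        {"identity": "user123", "name": "John", "state": "ACTIVE",
         "tracks": [{"sid": "TR_example"}]},
        {"identity": "agent"},
    ]
}

if __name__ == "__main__":
    for line in summarize_participants(sample):
        print(line)
```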

### `delete-room <name>`
Delete/close a room and disconnect all participants.

### `record <room>`
Start an Egress recording of a room.
- `--output URL`: output destination (an `s3://bucket/key` URL or a file path; the bundled script has no GCS-specific handling)
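
An `s3://` destination has to be split into a bucket name and an object key before it can go into an Egress request. One way to do that with the standard library is sketched below; `split_s3_url` is a hypothetical helper, not a function from the skill.

```python
from urllib.parse import urlparse


def split_s3_url(url: str):
    """Split 's3://bucket/path/to/file' into (bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {url!r}")
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    print(split_s3_url("s3://bucket/recording.mp4"))
```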

## Voice AI Integration

LiveKit is the backbone for many voice AI platforms:

- **Vapi**: Uses LiveKit for real-time voice AI agent calls
- **ElevenLabs**: Stream TTS audio into LiveKit rooms
- **OpenAI Realtime**: Connect GPT-4o voice to LiveKit participants

### Agent Pattern
1. Create a LiveKit room
2. Generate tokens for both human and AI agent
3. AI agent joins, subscribes to human audio
4. Process audio → STT → LLM → TTS → publish back
5. Result: real-time voice conversation with AI
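
Step 4 of the pattern above is a simple chain of three stages. The sketch below shows that chain with stand-in functions; `process_turn` and the `fake_*` stages are hypothetical placeholders, and a real agent would call actual STT, LLM, and TTS services and publish the resulting audio back into the room.

```python
from typing import Callable


def process_turn(
    audio: bytes,
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> bytes:
    """Run one conversational turn: audio in, synthesized reply audio out."""
    transcript = stt(audio)   # speech-to-text
    reply = llm(transcript)   # generate a text reply
    return tts(reply)         # text-to-speech


# Stand-in stages so the sketch runs without any external service.
def fake_stt(audio: bytes) -> str:
    return audio.decode()


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode()


if __name__ == "__main__":
    print(process_turn(b"hello", fake_stt, fake_llm, fake_tts))
```

Passing the stages in as callables keeps the turn logic independent of any particular provider.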

## Credits
Built by [M. Abidi](https://www.linkedin.com/in/mohammad-ali-abidi) | [agxntsix.ai](https://www.agxntsix.ai)
[YouTube](https://youtube.com/@aiwithabidi) | [GitHub](https://github.com/aiwithabidi)
Part of the **AgxntSix Skill Suite** for OpenClaw agents.

📅 **Need help setting up OpenClaw for your business?** [Book a free consultation](https://cal.com/agxntsix/abidi-openclaw)


---

## Skill Companion Files

> Additional files collected from the skill directory layout.

### _meta.json

```json
{
  "owner": "aiwithabidi",
  "slug": "livekit-voice",
  "displayName": "Livekit Voice",
  "latest": {
    "version": "1.0.0",
    "publishedAt": 1772672040956,
    "commit": "https://github.com/openclaw/skills/commit/a0eac186e12d05fa63839dca155543b3052fda1b"
  },
  "history": []
}

```

### scripts/livekit_api.py

```python
#!/usr/bin/env python3
"""LiveKit real-time voice/video API integration for OpenClaw agents."""

import argparse
import base64
import hashlib
import hmac
import json
import os
import sys
import time
from urllib.request import Request, urlopen
from urllib.error import HTTPError


def get_config():
    api_key = os.environ.get("LIVEKIT_API_KEY")
    api_secret = os.environ.get("LIVEKIT_API_SECRET")
    url = os.environ.get("LIVEKIT_URL")
    if not all([api_key, api_secret, url]):
        print("Error: LIVEKIT_API_KEY, LIVEKIT_API_SECRET, and LIVEKIT_URL must be set", file=sys.stderr)
        sys.exit(1)
    return api_key, api_secret, url


def b64url_encode(data):
    if isinstance(data, str):
        data = data.encode()
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def generate_jwt(api_key, api_secret, grants, ttl=3600):
    """Generate a LiveKit-compatible JWT access token."""
    header = {"alg": "HS256", "typ": "JWT"}
    now = int(time.time())
    payload = {
        "iss": api_key,
        "nbf": now,
        "exp": now + ttl,
        "sub": grants.get("identity", ""),
        "name": grants.get("name", ""),
        "video": grants.get("video", {}),
    }
    segments = [b64url_encode(json.dumps(header)), b64url_encode(json.dumps(payload))]
    signing_input = ".".join(segments).encode()
    signature = hmac.new(api_secret.encode(), signing_input, hashlib.sha256).digest()
    segments.append(b64url_encode(signature))
    return ".".join(segments)


def twirp_request(url, api_key, api_secret, service, method, body=None):
    """Make a Twirp RPC request to LiveKit server."""
    # Convert wss:// to https:// for API calls
    http_url = url.replace("wss://", "https://").replace("ws://", "http://")
    endpoint = f"{http_url}/twirp/livekit.{service}/{method}"

    token = generate_jwt(api_key, api_secret, {
        "video": {"roomCreate": True, "roomList": True, "roomAdmin": True, "roomRecord": True}
    })

    data = json.dumps(body or {}).encode()
    req = Request(endpoint, data=data, headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    })
    try:
        with urlopen(req) as resp:
            return json.loads(resp.read().decode())
    except HTTPError as e:
        body_text = e.read().decode() if e.fp else ""
        print(f"API Error {e.code}: {body_text}", file=sys.stderr)
        sys.exit(1)


def cmd_create_room(args):
    api_key, api_secret, url = get_config()
    body = {"name": args.name}
    if args.max_participants:
        body["maxParticipants"] = args.max_participants
    if args.empty_timeout:
        body["emptyTimeout"] = args.empty_timeout

    result = twirp_request(url, api_key, api_secret, "RoomService", "CreateRoom", body)
    print(f"  Room created: {result.get('name', args.name)}")
    print(f"  SID: {result.get('sid', 'N/A')}")
    print(f"  Max Participants: {result.get('maxParticipants', 'unlimited')}")


def cmd_token(args):
    api_key, api_secret, _ = get_config()
    video_grants = {"room": args.room, "roomJoin": True}
    if args.can_publish:
        video_grants["canPublish"] = True
    if args.can_subscribe:
        video_grants["canSubscribe"] = True

    grants = {
        "identity": args.identity,
        "name": args.name or args.identity,
        "video": video_grants,
    }
    token = generate_jwt(api_key, api_secret, grants, ttl=args.ttl)
    print(f"  Token for '{args.identity}' in room '{args.room}':")
    print(f"  {token}")


def cmd_list_rooms(args):
    api_key, api_secret, url = get_config()
    result = twirp_request(url, api_key, api_secret, "RoomService", "ListRooms", {})
    rooms = result.get("rooms", [])
    if not rooms:
        print("  No active rooms.")
        return
    for r in rooms:
        print(f"  {r.get('name','?')} | SID: {r.get('sid','?')} | Participants: {r.get('numParticipants',0)} | Created: {r.get('creationTime','?')}")


def cmd_participants(args):
    api_key, api_secret, url = get_config()
    result = twirp_request(url, api_key, api_secret, "RoomService", "ListParticipants", {"room": args.room})
    participants = result.get("participants", [])
    if not participants:
        print(f"  No participants in '{args.room}'.")
        return
    for p in participants:
        tracks = len(p.get("tracks", []))
        print(f"  {p.get('identity','?')} ({p.get('name','?')}) | State: {p.get('state','?')} | Tracks: {tracks}")


def cmd_delete_room(args):
    api_key, api_secret, url = get_config()
    twirp_request(url, api_key, api_secret, "RoomService", "DeleteRoom", {"room": args.name})
    print(f"  Room '{args.name}' deleted.")


def cmd_record(args):
    api_key, api_secret, url = get_config()
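    # The Twirp body is the RoomCompositeEgressRequest itself, so its
    # fields sit at the top level rather than under a wrapper key.
    body = {
        "roomName": args.room,
        "layout": "speaker-dark",
    }
    if args.output:
        if args.output.startswith("s3://"):
            # s3://bucket/key -> bucket name plus object key (sent as filepath).
            # No S3 credentials or region are set here; this assumes the
            # egress service is already configured with them.
            bucket, _, key = args.output[len("s3://"):].partition("/")
            body["fileOutputs"] = [{"filepath": key, "s3": {"bucket": bucket}}]
        else:
            body["fileOutputs"] = [{"filepath": args.output}]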

    result = twirp_request(url, api_key, api_secret, "Egress", "StartRoomCompositeEgress", body)
    print(f"  Recording started: {result.get('egressId', 'N/A')}")
    print(f"  Status: {result.get('status', 'N/A')}")


def main():
    parser = argparse.ArgumentParser(description="LiveKit Voice/Video API")
    sub = parser.add_subparsers(dest="command", required=True)

    p_cr = sub.add_parser("create-room", help="Create a room")
    p_cr.add_argument("name")
    p_cr.add_argument("--max-participants", type=int)
    p_cr.add_argument("--empty-timeout", type=int, default=300)

    p_tok = sub.add_parser("token", help="Generate access token")
    p_tok.add_argument("room")
    p_tok.add_argument("--identity", required=True)
    p_tok.add_argument("--name")
    p_tok.add_argument("--can-publish", action="store_true")
    p_tok.add_argument("--can-subscribe", action="store_true")
    p_tok.add_argument("--ttl", type=int, default=3600)

    sub.add_parser("list-rooms", help="List active rooms")

    p_part = sub.add_parser("participants", help="List participants")
    p_part.add_argument("room")

    p_del = sub.add_parser("delete-room", help="Delete a room")
    p_del.add_argument("name")

    p_rec = sub.add_parser("record", help="Start recording")
    p_rec.add_argument("room")
    p_rec.add_argument("--output")

    args = parser.parse_args()
    cmds = {
        "create-room": cmd_create_room, "token": cmd_token, "list-rooms": cmd_list_rooms,
        "participants": cmd_participants, "delete-room": cmd_delete_room, "record": cmd_record,
    }
    cmds[args.command](args)


if __name__ == "__main__":
    main()

```
