
listenhub

Explain anything — turn ideas into podcasts, explainer videos, or voice narration. Use when the user wants to "make a podcast", "create an explainer video", "read this aloud", "generate an image", or share knowledge in audio/visual form. Supports: topic descriptions, YouTube links, article URLs, plain text, and image prompts.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars: 2,903.

Hot score: 99.

Updated: March 20, 2026.

Overall rating: C 4.5.

Composite score: 4.5.

Best-practice grade: B 71.9.

Install command

npx @skill-hub/cli install openclaw-skills-listenhub-2

Repository

openclaw/skills

Skill path: skills/0xfango/listenhub-2


Best for

Primary workflow: Design Product.

Technical facets: Full Stack, Designer.

Target audience: everyone.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: openclaw.

This is still a mirrored public skill entry. Review the repository before installing into production workflows.

What it helps with

  • Install listenhub into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/openclaw/skills before adding listenhub to shared team environments
  • Use listenhub for development workflows

Works across

Claude Code, Codex CLI, Gemini CLI, OpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: listenhub
description: |
  Explain anything — turn ideas into podcasts, explainer videos, or voice narration.
  Use when the user wants to "make a podcast", "create an explainer video",
  "read this aloud", "generate an image", or share knowledge in audio/visual form.
  Supports: topic descriptions, YouTube links, article URLs, plain text, and image prompts.
---

<purpose>
**The Hook**: Paste content, get audio/video/image. That simple.

Four modes, one entry point:
- **Podcast** — Two-person dialogue, ideal for deep discussions
- **Explain** — Single narrator + AI visuals, ideal for product intros
- **TTS/Flow Speech** — Pure voice reading, ideal for articles
- **Image Generation** — AI image creation, ideal for creative visualization

Users don't need to remember APIs, modes, or parameters. Just say what you want.
</purpose>

<instructions>

## ⛔ Hard Constraints (Inviolable)

**The scripts are the ONLY interface. Period.**

```
┌─────────────────────────────────────────────────────────┐
│  AI Agent  ──▶  ./scripts/*.sh  ──▶  ListenHub API     │
│                      ▲                                  │
│                      │                                  │
│            This is the ONLY path.                       │
│            Direct API calls are FORBIDDEN.              │
└─────────────────────────────────────────────────────────┘
```

**MUST**:
- Execute functionality ONLY through provided scripts in `**/skills/listenhub/scripts/`
- Pass user intent as script arguments exactly as documented
- Trust script outputs; do not second-guess internal logic

**MUST NOT**:
- Write curl commands to ListenHub/Marswave API directly
- Construct JSON bodies for API calls manually
- Guess or fabricate speakerIds, endpoints, or API parameters
- Assume API structure based on patterns or web searches
- Hallucinate features not exposed by existing scripts

**Why**: The API is proprietary. Endpoints, parameters, and speakerIds are NOT publicly documented. Web searches will NOT find this information. Any attempt to bypass scripts will produce incorrect, non-functional code.

## Script Location

Scripts are located at `**/skills/listenhub/scripts/` relative to your working context.

Different AI clients use different dot-directories:
- Claude Code: `.claude/skills/listenhub/scripts/`
- Other clients: may vary (`.cursor/`, `.windsurf/`, etc.)

**Resolution**: Use glob pattern `**/skills/listenhub/scripts/*.sh` to locate scripts reliably, or resolve from the SKILL.md file's own path.
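
As a rough illustration of the glob-based resolution, the sketch below searches a root directory for the scripts folder with `find`. The helper name `resolve_listenhub_scripts` is hypothetical (it is not shipped with the skill), and picking the first match when several exist is an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: locate the ListenHub scripts directory under a search root.
# resolve_listenhub_scripts is a hypothetical helper, not part of the skill.
resolve_listenhub_scripts() {
  local root="${1:-.}"
  # -path matches against the whole path, so any client dot-directory
  # (.claude/, .cursor/, ...) in front of skills/ is accepted.
  find "$root" -type d -path '*/skills/listenhub/scripts' -print 2>/dev/null | head -n 1
}
```

A caller could then set `SCRIPTS="$(resolve_listenhub_scripts "$PWD")"` and treat an empty result as "skill not installed here".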

## Private Data (Cannot Be Searched)

The following are **internal implementation details** that AI cannot reliably know:

| Category | Examples | How to Obtain |
|----------|----------|---------------|
| API Base URL | `api.marswave.ai/...` | ✗ Cannot — internal to scripts |
| Endpoints | `podcast/episodes`, etc. | ✗ Cannot — internal to scripts |
| Speaker IDs | `cozy-man-english`, etc. | ✓ Call `get-speakers.sh` |
| Request schemas | JSON body structure | ✗ Cannot — internal to scripts |
| Response formats | Episode ID, status codes | ✓ Documented per script |

**Rule**: If information is not in this SKILL.md or retrievable via a script (like `get-speakers.sh`), assume you don't know it.

## Design Philosophy

**Hide complexity, reveal magic.**

Users don't need to know: Episode IDs, API structure, polling mechanisms, credits, endpoint differences.
Users only need to: describe an idea → wait a moment → get the link.

## Security

- User-provided content (text, URLs) is transmitted to the ListenHub API (`api.marswave.ai`) for processing. Do not pass sensitive or confidential information as input.
- The `--source-url` parameter accepts external URLs whose content is fetched and processed by the backend. Only use trusted URLs.
- API keys are stored locally in environment variables and transmitted via HTTPS. Never log or display full API keys.
- Version checks connect to `raw.githubusercontent.com` (read-only, no code execution). Set `LISTENHUB_SKIP_VERSION_CHECK=1` to disable.

## Environment

### ListenHub API Key

API key stored in `$LISTENHUB_API_KEY`. Check on first use:

```bash
source ~/.zshrc 2>/dev/null; [ -n "$LISTENHUB_API_KEY" ] && echo "ready" || echo "need_setup"
```

If setup needed, guide user:
1. Visit https://listenhub.ai/settings/api-keys
2. Paste key (only the `lh_sk_...` part)
3. Auto-save to ~/.zshrc

### Image Generation API Key

Image generation uses the same ListenHub API key stored in `$LISTENHUB_API_KEY`.
Image generation output path defaults to the user downloads directory, stored in `$LISTENHUB_OUTPUT_DIR`.

On first image generation, the script auto-guides configuration:
1. Visit https://listenhub.ai/settings/api-keys (requires subscription)
2. Paste API key
3. Configure output path (default: ~/Downloads)
4. Auto-save to shell rc file

**Security**: Never expose full API keys in output.
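
One way to honor this rule when a key has to be mentioned at all (for example, to confirm which key is configured) is to print a masked form. The following is a minimal sketch; `mask_api_key` is a hypothetical helper, and showing the first six and last four characters is an assumption, not a documented ListenHub convention.

```bash
#!/usr/bin/env bash
# Sketch only: print a masked version of an API key.
# mask_api_key is a hypothetical helper, not part of the skill.
mask_api_key() {
  local key="$1" prefix suffix
  # Very short values are fully masked so nothing meaningful leaks.
  if [ "${#key}" -le 10 ]; then
    printf '****\n'
    return 0
  fi
  prefix=$(printf '%s' "$key" | cut -c1-6)   # e.g. the lh_sk_ prefix
  suffix=$(printf '%s' "$key" | tail -c 4)   # last four characters
  printf '%s****%s\n' "$prefix" "$suffix"
}
```

For example, `mask_api_key "$LISTENHUB_API_KEY"` prints something of the form `lh_sk_****3456` instead of the full secret.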

## Mode Detection

Auto-detect mode from user input:

**→ Podcast (1-2 speakers)**
Supports single-speaker or dual-speaker podcasts. Debate mode requires 2 speakers.
Default mode: `quick` unless another mode is explicitly requested.
If speakers are not specified, call `get-speakers.sh` and select the first `speakerId`
matching the chosen `language`.
If reference materials are provided, pass them as `--source-url` or `--source-text`.
When the user only provides a topic (e.g., "I want a podcast about X"), proceed as follows:
1. Detect `language` from user input.
2. Set `mode=quick`.
3. Choose one speaker via `get-speakers.sh` matching the language.
4. Create a single-speaker podcast without further clarification.

- Keywords: "podcast", "chat about", "discuss", "debate", "dialogue"
- Use case: Topic exploration, opinion exchange, deep analysis
- Feature: One or two voices, interactive feel

**→ Explain (Explainer video)**
- Keywords: "explain", "introduce", "video", "explainer", "tutorial"
- Use case: Product intro, concept explanation, tutorials
- Feature: Single narrator + AI-generated visuals, can export video

**→ TTS (Text-to-speech)**
TTS defaults to FlowSpeech `direct` for single-pass text or URL narration.
Script arrays and multi-speaker dialogue belong to Speech as an advanced path, not the default TTS entry.
Text-to-speech input is limited to 10,000 characters; split or use a URL when longer.
- Keywords: "read aloud", "convert to speech", "tts", "voice"
- Use case: Article to audio, note review, document narration
- Feature: Fastest (1-2 min), pure audio
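
The 10,000-character limit can be checked before calling the TTS script. This is a minimal sketch under stated assumptions: `tts_input_ok` and `TTS_MAX_CHARS` are hypothetical names, and `${#text}` counts characters according to the shell and locale in use, which may differ from how the backend counts.

```bash
#!/usr/bin/env bash
# Sketch only: pre-check text length against the documented TTS limit.
# TTS_MAX_CHARS and tts_input_ok are hypothetical names.
TTS_MAX_CHARS=10000

tts_input_ok() {
  local text="$1"
  # Succeeds (exit 0) when the text fits within the limit.
  [ "${#text}" -le "$TTS_MAX_CHARS" ]
}
```

When the check fails, the guidance above applies: split the text into several requests or pass a URL instead.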

### Ambiguous "Convert to speech" Guidance

When the request is ambiguous (e.g., "convert to speech", "read aloud"), apply:

1. Default to FlowSpeech and prioritize `direct` to avoid altering content.
2. Input type: URL uses `type=url`, plain text uses `type=text`.
3. Speaker: if not specified, call `get-speakers.sh` and pick the first `speakerId` matching `language`.
4. Switch to Speech only when multi-line scripts or multi-speaker dialogue is explicitly requested, and require `scripts`.

Example guidance:

“This request can use FlowSpeech with the default direct mode; switch to smart for grammar and punctuation fixes. For per-line speaker assignment, provide scripts and switch to Speech.”

**→ Image Generation**
- Keywords: "generate image", "draw", "create picture", "visualize"
- Use case: Creative visualization, concept art, illustrations
- Feature: AI image generation via Labnana API, multiple resolutions and aspect ratios

**Reference Images via Image Hosts**
When reference images are local files, upload to a known image host and use the direct image URL in `--reference-images`.
Recommended hosts: `imgbb.com`, `sm.ms`, `postimages.org`, `imgur.com`.
Direct image URLs should end with `.jpg`, `.png`, `.webp`, or `.gif`.
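
A quick pre-check for the "direct image URL" rule might look like the sketch below. `is_direct_image_url` is a hypothetical helper; requiring an `http://` or `https://` scheme and accepting upper-case extensions are assumptions beyond what the skill states.

```bash
#!/usr/bin/env bash
# Sketch only: check that a URL looks like a direct image link.
# is_direct_image_url is a hypothetical helper, not part of the skill.
is_direct_image_url() {
  local url
  url=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$url" in
    http://*|https://*) ;;        # assumption: only web URLs are usable
    *) return 1 ;;
  esac
  case "$url" in
    *.jpg|*.png|*.webp|*.gif) return 0 ;;
    *) return 1 ;;
  esac
}
```

A gallery or share page URL (one that does not end in an image extension) fails the check and should be replaced with the direct link from the image host.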

**Default**: If unclear, ask user which format they prefer.

**Explicit override**: User can say "make it a podcast" / "I want explainer video" / "just voice" / "generate image" to override auto-detection.
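
The keyword lists above can be approximated with plain substring matching. The sketch below is illustrative only: `detect_mode` is a hypothetical helper rather than one of the skill's scripts, it also accepts the trigger phrases quoted in the skill description ("generate an image", "read this aloud"), and the order in which modes are tested is an assumption. An agent would normally apply more judgment than this.

```bash
#!/usr/bin/env bash
# Sketch only: naive keyword-based mode detection.
# detect_mode is a hypothetical helper; it prints one of
# podcast | image | explain | tts | ask.
detect_mode() {
  local text
  text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$text" in
    *podcast*|*"chat about"*|*discuss*|*debate*|*dialogue*)
      echo podcast ;;
    *"generate image"*|*"generate an image"*|*draw*|*"create picture"*|*visualize*)
      echo image ;;
    *explain*|*introduce*|*video*|*tutorial*)
      echo explain ;;
    *"read aloud"*|*"read this aloud"*|*convert*"to speech"*|*tts*|*voice*)
      echo tts ;;
    *)
      # No keyword matched: fall back to asking the user.
      echo ask ;;
  esac
}
```

Because the match is a simple substring test, unrelated words that happen to contain a keyword (such as "withdraw") would also match; the `ask` fallback corresponds to the default of asking the user which format they prefer.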

## Interaction Flow

### Step 1: Receive input + detect mode

```
→ Got it! Preparing...
  Mode: Two-person podcast
  Topic: Latest developments in Manus AI
```

For URLs, identify type:
- `youtu.be/XXX` → convert to `https://www.youtube.com/watch?v=XXX`
- Other URLs → use directly
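
The short-link conversion can be sketched with shell parameter expansion. `normalize_url` is a hypothetical helper, and dropping any query string (such as a `?t=` timestamp) from the short link is an assumption; the skill only specifies the basic `youtu.be/XXX` mapping.

```bash
#!/usr/bin/env bash
# Sketch only: expand youtu.be short links, pass other URLs through.
# normalize_url is a hypothetical helper, not part of the skill.
normalize_url() {
  local url="$1" rest id
  case "$url" in
    http://youtu.be/*|https://youtu.be/*|youtu.be/*)
      rest="${url#*youtu.be/}"   # drop scheme and host
      id="${rest%%\?*}"          # assumption: discard any query string
      printf 'https://www.youtube.com/watch?v=%s\n' "$id"
      ;;
    *)
      printf '%s\n' "$url"       # other URLs are used directly
      ;;
  esac
}
```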

### Step 2: Submit generation

```
→ Generation submitted

  Estimated time:
  • Podcast: 2-3 minutes
  • Explain: 3-5 minutes
  • TTS: 1-2 minutes

  You can:
  • Wait and ask "done yet?"
  • Use check-status via scripts
  • View outputs in product pages:
    - Podcast: https://listenhub.ai/app/podcast
    - Explain: https://listenhub.ai/app/explainer
    - Text-to-Speech: https://listenhub.ai/app/text-to-speech
  • Do other things, ask later
```

Internally remember Episode ID for status queries.

### Step 3: Query status

When user says "done yet?" / "ready?" / "check status":

- **Success**: Show result + next options
- **Processing**: "Still generating, wait another minute?"
- **Failed**: "Generation failed, content might be unparseable. Try another?"

### Step 4: Show results

**Podcast result**:
```
✓ Podcast generated!

  "{title}"

  Episode: https://listenhub.ai/app/episode/{episodeId}

  Duration: ~{duration} minutes

  Download audio: provide audioUrl or audioStreamUrl on request
```
One-stage podcast creation generates an online task. When status is success,
the episode detail already includes scripts and audio URLs. Download uses the
returned audioUrl or audioStreamUrl without a second create call. Two-stage
creation is only for script review or manual edits before audio generation.

**Explain result**:
```
✓ Explainer video generated!

  "{title}"

  Watch: https://listenhub.ai/app/explainer

  Duration: ~{duration} minutes

  Need to download audio? Just say so.
```

**Image result**:
```
✓ Image generated!

  ~/Downloads/labnana-{timestamp}.jpg
```

Image results are file-only and not shown in the web UI.

**Important**: Prioritize web experience. Only provide download URLs when user explicitly requests.

## Script Reference

Scripts are shell-based. Locate via `**/skills/listenhub/scripts/`.
Dependency: `jq` is required for request construction.
The AI must ensure `curl` and `jq` are installed before invoking scripts.

**⚠️ Long-running Tasks**: Generation may take 1-5 minutes. Use your CLI client's native background execution feature:

- **Claude Code**: set `run_in_background: true` in Bash tool
- **Other CLIs**: use built-in async/background job management if available

**Invocation pattern**:

```bash
$SCRIPTS/script-name.sh [args]
```

Where `$SCRIPTS` = resolved path to `**/skills/listenhub/scripts/`

### Podcast (One-Stage)
Default path. Use unless script review or manual editing is required.
```bash
$SCRIPTS/create-podcast.sh --query "The future of AI development" --language en --mode deep --speakers cozy-man-english
$SCRIPTS/create-podcast.sh --query "Analyze this article" --language en --mode deep --speakers cozy-man-english --source-url "https://example.com/article"
```

Multiple `--source-url` and `--source-text` arguments are supported to combine several references in one request.

### Podcast (Two-Stage: Text → Review → Audio)
Advanced path. Use only when script review or edits are explicitly requested.

**The entire value of two-stage generation is human review between stages.
Skipping review reduces it to one-stage with extra latency — never do this.**

**Stage 1**: Generate text content.
```bash
$SCRIPTS/create-podcast-text.sh --query "AI history" --language en --mode deep --speakers cozy-man-english,travel-girl-english
```

**Review Gate (mandatory)**: After text generation completes, the agent MUST:

1. Run `check-status.sh --wait` to poll until completion. On exit code 2 (timeout or rate-limited), wait briefly and retry.
2. Save **two files** from the response:
   - `~/Downloads/podcast-draft-<episode-id>.md` — human-readable version assembled from the response fields (`title`, `outline`, `sourceProcessResult.content`, and the `scripts` array formatted as readable dialogue). This is for the user to review.
   - `~/Downloads/podcast-scripts-<episode-id>.json` — the raw `{"scripts": [...]}` object extracted from the response, exactly in the format that `create-podcast-audio.sh --scripts` expects. This is the machine-readable source of truth for Stage 2.
3. Inform the user that both files have been saved, and offer to open the markdown draft for review (use the `open` command on macOS).
4. **STOP and wait for explicit user approval** before proceeding to Stage 2.
5. On user approval:
   - **No changes**: run `create-podcast-audio.sh --episode <id>` without `--scripts` (server uses original).
   - **With edits**: the user may edit the JSON file directly, or describe changes for the agent to apply. Pass the modified file via `--scripts`.

The agent MUST NOT proceed to Stage 2 automatically. This is a hard constraint, not a suggestion.
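
The two review files could be produced with `jq` roughly as follows. This is a sketch under loud assumptions: `save_review_files` is a hypothetical helper, and the response fields are assumed to sit under `.data` (as `processStatus` does in `check-status.sh`); the actual response layout should be checked against real script output before relying on it.

```bash
#!/usr/bin/env bash
# Sketch only: write the review draft and the Stage 2 scripts file.
# save_review_files is hypothetical; the .data.* field paths are assumptions.
save_review_files() {
  local response_file="$1" episode_id="$2" out_dir="${3:-$HOME/Downloads}"
  local draft="$out_dir/podcast-draft-$episode_id.md"
  local scripts="$out_dir/podcast-scripts-$episode_id.json"

  # Machine-readable source of truth; fails if .data.scripts is not an array.
  jq -e '{scripts: (.data.scripts | arrays)}' "$response_file" > "$scripts" || return 1

  # Human-readable draft assembled from the response fields.
  jq -r '
    "# " + (.data.title // "Untitled"),
    "",
    "## Outline",
    (.data.outline // "" | tostring),
    "",
    "## Source summary",
    (.data.sourceProcessResult.content // "" | tostring),
    "",
    "## Script",
    (.data.scripts[] | "**" + .speakerId + "**: " + .content)
  ' "$response_file" > "$draft" || return 1

  # Report both paths so the agent can tell the user where to look.
  printf '%s\n%s\n' "$draft" "$scripts"
}
```

After this step the agent still stops and waits for approval; the JSON file is only passed to `create-podcast-audio.sh --scripts` if the user asked for edits.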

**Stage 2**: Generate audio from reviewed/approved text.
```bash
# User approved without changes:
$SCRIPTS/create-podcast-audio.sh --episode "<episode-id>"

# User provided edits:
$SCRIPTS/create-podcast-audio.sh --episode "<episode-id>" --scripts modified-scripts.json
```

### Speech (Multi-Speaker)
```bash
$SCRIPTS/create-speech.sh --scripts scripts.json
echo '{"scripts":[{"content":"Hello","speakerId":"cozy-man-english"}]}' | $SCRIPTS/create-speech.sh --scripts -

# scripts.json format:
# {
#   "scripts": [
#     {"content": "Script content here", "speakerId": "speaker-id"},
#     ...
#   ]
# }
```

### Get Available Speakers
```bash
$SCRIPTS/get-speakers.sh --language zh
$SCRIPTS/get-speakers.sh --language en
```

**Guidance**:
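1. If the user has not specified a voice, you must first call `get-speakers.sh` to get the list of available voices.
2. Fallback default: use the first `speakerId` in the list that matches `language` as the default voice.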

**Response structure** (for AI parsing):
```json
{
  "code": 0,
  "data": {
    "items": [
      {
        "name": "Yuanye",
        "speakerId": "cozy-man-english",
        "gender": "male",
        "language": "zh"
      }
    ]
  }
}
```

**Usage**: When user requests specific voice characteristics (gender, style), call this script first to discover available `speakerId` values. NEVER hardcode or assume speakerIds.
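
Given the response structure shown above, the "first matching speaker" default can be extracted with a short `jq` filter. `first_speaker_id` is a hypothetical helper that reads the `get-speakers.sh` output on stdin; it relies only on the `data.items[].language` and `data.items[].speakerId` fields from the example response.

```bash
#!/usr/bin/env bash
# Sketch only: pick the first speakerId for a language from stdin JSON.
# first_speaker_id is a hypothetical helper, not part of the skill.
first_speaker_id() {
  local lang="$1"
  # Prints nothing when no speaker matches the language.
  jq -r --arg lang "$lang" \
    '[.data.items[] | select(.language == $lang) | .speakerId][0] // empty'
}
```

A possible call is `"$SCRIPTS/get-speakers.sh" --language en | first_speaker_id en`; an empty result means no default could be chosen and the user should be asked.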

### Explain
```bash
$SCRIPTS/create-explainer.sh --content "Introduce ListenHub" --language en --mode info --speakers cozy-man-english
$SCRIPTS/generate-video.sh --episode "<episode-id>"
```

### TTS
```bash
$SCRIPTS/create-tts.sh --type text --content "Welcome to ListenHub" --language en --mode smart --speakers cozy-man-english
```

### Image Generation
```bash
$SCRIPTS/generate-image.sh --prompt "sunset over mountains" --size 2K --ratio 16:9
$SCRIPTS/generate-image.sh --prompt "style reference" --reference-images "https://example.com/ref1.jpg,https://example.com/ref2.png"
```

Supported sizes: `1K | 2K | 4K` (default: `2K`).
Supported aspect ratios: `16:9 | 1:1 | 9:16 | 2:3 | 3:2 | 3:4 | 4:3 | 21:9` (default: `16:9`).
Reference images: comma-separated URLs, maximum 14.
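
These limits can be checked before invoking `generate-image.sh`. The sketch below is a hypothetical pre-flight helper (`validate_image_args` is not part of the skill); whether the script performs the same validation itself is not stated here, so this only mirrors the documented values.

```bash
#!/usr/bin/env bash
# Sketch only: validate size, ratio, and reference-image count.
# validate_image_args is a hypothetical helper mirroring the documented limits.
validate_image_args() {
  local size="${1:-2K}" ratio="${2:-16:9}" refs="${3:-}" count
  case "$size" in
    1K|2K|4K) ;;
    *) echo "Error: size must be 1K, 2K, or 4K" >&2; return 1 ;;
  esac
  case "$ratio" in
    16:9|1:1|9:16|2:3|3:2|3:4|4:3|21:9) ;;
    *) echo "Error: unsupported aspect ratio '$ratio'" >&2; return 1 ;;
  esac
  if [ -n "$refs" ]; then
    # Count non-empty entries in the comma-separated list.
    count=$(printf '%s\n' "$refs" | tr ',' '\n' | grep -c . || true)
    if [ "$count" -gt 14 ]; then
      echo "Error: at most 14 reference images are allowed" >&2
      return 1
    fi
  fi
  return 0
}
```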

### Check Status
```bash
# Single-shot query
$SCRIPTS/check-status.sh --episode "<episode-id>" --type podcast

# Wait mode (recommended for automated polling)
$SCRIPTS/check-status.sh --episode "<episode-id>" --type podcast --wait
$SCRIPTS/check-status.sh --episode "<episode-id>" --type flow-speech --wait --timeout 60
$SCRIPTS/check-status.sh --episode "<episode-id>" --type explainer --wait --timeout 600
```

`tts` is accepted as an alias for `flow-speech`.

**`--wait` mode** handles polling internally with configurable limits.
Agents SHOULD use `--wait` instead of manual polling loops. On exit code 2, wait briefly and retry the command.

| Option | Default | Description |
|---|---|---|
| `--wait` | off | Enable polling mode |
| `--max-polls` | 30 | Maximum poll attempts |
| `--timeout` | 300 | Maximum total wait (seconds) |
| `--interval` | 10 | Base poll interval (seconds) |

Exit codes: `0` = completed, `1` = failed, `2` = timeout or rate-limited (still pending, safe to retry after a short wait).
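
The "retry on exit code 2" behavior can be wrapped in a small helper. This is a sketch, not part of the skill: `retry_while_pending` is a hypothetical name, and the retry count and delay a caller passes in are their own choice, since the skill only says to wait briefly.

```bash
#!/usr/bin/env bash
# Sketch only: re-run a command while it reports "still pending" (exit 2).
# Usage: retry_while_pending MAX_RETRIES DELAY_SECONDS COMMAND [ARGS...]
retry_while_pending() {
  local max="$1" delay="$2" attempt=0 rc
  shift 2
  while :; do
    rc=0
    "$@" || rc=$?
    case "$rc" in
      0) return 0 ;;          # completed
      2) ;;                   # timeout or rate-limited: retry below
      *) return "$rc" ;;      # hard failure: do not retry
    esac
    attempt=$((attempt + 1))
    if [ "$attempt" -gt "$max" ]; then
      return 2                # still pending after all retries
    fi
    sleep "$delay"
  done
}
```

For instance, `retry_while_pending 3 15 "$SCRIPTS/check-status.sh" --episode "$episode_id" --type podcast --wait` would re-run the wait command up to three more times, 15 seconds apart, before reporting that the episode is still pending.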

## Language Adaptation

**Automatic Language Detection**: Adapt output language based on user input and context.

**Detection Rules**:
1. **User Input Language**: If user writes in Chinese, respond in Chinese. If user writes in English, respond in English.
2. **Context Consistency**: Maintain the same language throughout the interaction unless user explicitly switches.
3. **CLAUDE.md Override**: If project-level CLAUDE.md specifies a default language, respect it unless user input indicates otherwise.
4. **Mixed Input**: If user mixes languages, prioritize the dominant language (>50% of content).

**Application**:
- Status messages: "→ Got it! Preparing..." (English) vs "→ 收到!准备中..." (Chinese)
- Error messages: Match user's language
- Result summaries: Match user's language
- Script outputs: Pass through as-is (scripts handle their own language)

**Example**:
```
User (Chinese): "生成一个关于 AI 的播客"
AI (Chinese): "→ 收到!准备双人播客..."

User (English): "Make a podcast about AI"
AI (English): "→ Got it! Preparing two-person podcast..."
```

**Principle**: Language is interface, not barrier. Adapt seamlessly to user's natural expression.

## AI Responsibilities

### Black Box Principle

**You are a dispatcher, not an implementer.**

Your job is to:
1. Understand user intent (what do they want to create?)
2. Select the correct script (which tool fits?)
3. Format arguments correctly (what parameters?)
4. Execute and relay results (what happened?)

Your job is NOT to:
- Understand or modify script internals
- Construct API calls directly
- Guess parameters not documented here
- Invent features that scripts don't expose

### Mode-Specific Behavior

**ListenHub modes (passthrough)**:
- Podcast/Explain/TTS/Speech → pass user input directly
- Server has full AI capability to process content
- If user needs specific speakers → call `get-speakers.sh` first to list options

**Labnana mode (passthrough by default)**:
- Image Generation → pass the user's prompt through as-is by default
- The generation model handles prompt interpretation; client-side rewriting is not required

## Prompt Handling (Image Generation)

**Default behavior: transparent forwarding.** Pass the user's prompt directly to the script without modification.

**When to offer optimization**:
- The user provides only a short topic or phrase (e.g., "a cat"), AND
- The user has not explicitly stated they want verbatim generation

In this case, ask whether the user would like help enriching the prompt. Do not optimize without confirmation.

**When to never modify**:
- The user pastes a long, structured, or detailed prompt — treat them as experienced
- The user explicitly says "use this prompt exactly" or similar

**If the user agrees to optimization**, the following techniques are available as reference:

**Style**: "cyberpunk" → add "neon lights, futuristic, dystopian"; "ink painting" → add "Chinese ink painting, traditional art style"

**Scene**: time of day, lighting conditions, weather

**Quality**: "highly detailed", "8K quality", "cinematic composition"

**Rules when optimizing**:
- Use English keywords (models trained on English)
- Show the optimized prompt transparently before submitting
- Keep the user's core intent unchanged
- Do not over-stack terminology or add unwanted elements

</instructions>

<examples>

<example name="podcast-request">
<user>Make a podcast about the latest AI developments</user>
<response>
→ Got it! Preparing two-person podcast...
  Topic: Latest AI developments

→ Generation submitted, about 2-3 minutes

  You can:
  • Wait and ask "done yet?"
  • Check listenhub.ai/app/library
</response>
</example>

<example name="explainer-request">
<user>Create an explainer video introducing Claude Code</user>
<response>
→ Got it! Preparing explainer video...
  Topic: Claude Code introduction

→ Generation submitted, explainer videos take 3-5 minutes

  Includes: Script + narration + AI visuals
</response>
</example>

<example name="tts-request">
<user>Convert this article to speech https://blog.example.com/article</user>
<response>
→ Got it! Parsing article...

→ TTS submitted, about 1-2 minutes

  Wait a moment, or ask "done yet?" to check
</response>
</example>

<example name="image-generation-short-prompt">
<user>Generate an image: cyberpunk city at night</user>
<response>
→ Short prompt detected. Would you like help enriching it with style/lighting/composition details, or use it as-is?
</response>
</example>

<example name="image-generation-detailed-prompt">
<user>Generate an image: "Cyberpunk city at night, neon lights reflecting on wet streets, towering skyscrapers with holographic ads, flying vehicles, cinematic composition, highly detailed, 8K quality"</user>
<response>
→ Generating image...

  Prompt: Cyberpunk city at night, neon lights reflecting on wet streets,
   towering skyscrapers with holographic ads, flying vehicles,
   cinematic composition, highly detailed, 8K quality

  Resolution: 2K (16:9)

✓ Image generated!
  ~/Downloads/labnana-20260121-143145.jpg
</response>
</example>

<example name="image-with-reference">
<user>Generate an image in this style: https://example.com/style-ref.jpg, prompt: "a futuristic car"</user>
<response>
→ Generating image with reference...

  Prompt: a futuristic car
  Reference images: 1
  Reference image URL: https://example.com/style-ref.jpg
  Resolution: 2K (16:9)

✓ Image generated!
  ~/Downloads/labnana-20260122-154230.jpg
</response>
</example>

<example name="status-check">
<user>Done yet?</user>
<response>
✓ Podcast generated!

  "AI Revolution: From GPT to AGI"

  Listen: https://listenhub.ai/app/podcast

  Duration: ~8 minutes

  Need to download? Just say so.
</response>
</example>

</examples>


---

## Skill Companion Files

> Additional files collected from the skill directory layout.

### _meta.json

```json
{
  "owner": "0xfango",
  "slug": "listenhub-2",
  "displayName": "Listenhub",
  "latest": {
    "version": "0.1.0",
    "publishedAt": 1772333169933,
    "commit": "https://github.com/openclaw/skills/commit/0d4139eb413f681301633349353c4ec0e979da6d"
  },
  "history": []
}

```

### scripts/check-status.sh

```bash
#!/usr/bin/env bash
# Check episode status via ListenHub API
# Usage: ./check-status.sh --episode <episode-id> --type podcast|flow-speech|tts|explainer [--wait] [--max-polls N] [--timeout S] [--interval S]
#
# Without --wait: single-shot query (backward compatible)
# With --wait: polls until terminal status, respecting limits
#
# Exit codes:
#   0 = success (or single-shot completed)
#   1 = generation failed / error
#   2 = timeout or rate-limited (still pending, safe to retry after a short wait)

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/lib.sh"

EPISODE_ID=""
TYPE="podcast"
WAIT_MODE=false
MAX_POLLS=30
TIMEOUT=300
INTERVAL=10

usage() {
  cat >&2 <<'EOF'
Usage: ./check-status.sh --episode <episode-id> --type podcast|flow-speech|tts|explainer [--wait] [--max-polls N] [--timeout S] [--interval S]

Options:
  --wait          Enable polling mode (wait for completion)
  --max-polls N   Maximum poll attempts (default: 30)
  --timeout S     Maximum total wait in seconds (default: 300)
  --interval S    Base poll interval in seconds (default: 10)

Exit codes:
  0 = completed successfully
  1 = generation failed or error
  2 = timeout or rate-limited (still pending, safe to retry after a short wait)
EOF
}

while [ $# -gt 0 ]; do
  case "$1" in
    --episode)
      EPISODE_ID="${2:-}"
      shift 2
      ;;
    --type)
      TYPE="${2:-podcast}"
      shift 2
      ;;
    --wait)
      WAIT_MODE=true
      shift
      ;;
    --max-polls)
      MAX_POLLS="${2:-30}"
      shift 2
      ;;
    --timeout)
      TIMEOUT="${2:-300}"
      shift 2
      ;;
    --interval)
      INTERVAL="${2:-10}"
      shift 2
      ;;
    --help)
      usage
      exit 0
      ;;
    *)
      echo "Error: Unknown argument $1" >&2
      usage
      exit 1
      ;;
  esac
done

if [ -z "$EPISODE_ID" ]; then
  echo "Error: --episode is required" >&2
  usage
  exit 1
fi

validate_id "$EPISODE_ID" "episode-id"

case "$TYPE" in
  podcast)
    ENDPOINT="podcast/episodes/${EPISODE_ID}"
    ;;
  explainer)
    ENDPOINT="storybook/episodes/${EPISODE_ID}"
    ;;
  flow-speech|tts)
    ENDPOINT="flow-speech/episodes/${EPISODE_ID}"
    ;;
  *)
    echo "Error: Invalid type '$TYPE'. Must be: podcast | flow-speech | tts | explainer" >&2
    exit 1
    ;;
esac

# Single-shot mode (default, backward compatible)
if [ "$WAIT_MODE" = false ]; then
  api_get "$ENDPOINT"
  exit $?
fi

# Polling mode
check_jq

START_TIME=$(date +%s)
POLL_COUNT=0

while true; do
  POLL_COUNT=$((POLL_COUNT + 1))

  # Check poll limit
  if [ $POLL_COUNT -gt $MAX_POLLS ]; then
    echo "Error: Max polls ($MAX_POLLS) reached. Episode still processing." >&2
    exit 2
  fi

  # Check timeout
  ELAPSED=$(( $(date +%s) - START_TIME ))
  if [ $ELAPSED -ge $TIMEOUT ]; then
    echo "Error: Timeout (${TIMEOUT}s) reached. Episode still processing." >&2
    exit 2
  fi

  # Fetch status; catch transient network errors (curl failures) and retry
  if ! RESPONSE=$(api_get "$ENDPOINT"); then
    echo "Poll $POLL_COUNT: network error, retrying in ${INTERVAL}s" >&2
    sleep "$INTERVAL"
    continue
  fi

  # Rate-limited — exit and let the calling agent decide when to retry
  RESP_CODE=$(echo "$RESPONSE" | jq -r '.code // 0' 2>/dev/null || echo "0")
  if [ "$RESP_CODE" = "429" ] || [ "$RESP_CODE" = "25429" ]; then
    echo "Error: Rate limited (429). Retry after a short wait." >&2
    exit 2
  fi

  # Check process status (default to "unknown" if response is not valid JSON)
  STATUS=$(echo "$RESPONSE" | jq -r '.data.processStatus // "unknown"' 2>/dev/null || echo "unknown")

  case "$STATUS" in
    success|completed)
      echo "$RESPONSE"
      exit 0
      ;;
    failed|error)
      echo "$RESPONSE" >&2
      exit 1
      ;;
    *)
      REMAINING=$((TIMEOUT - ELAPSED))
      echo "Poll $POLL_COUNT: status=$STATUS, elapsed=${ELAPSED}s, remaining=${REMAINING}s" >&2
      sleep "$INTERVAL"
      ;;
  esac
done

```

### scripts/create-explainer.sh

```bash
#!/usr/bin/env bash
# Create explainer video via ListenHub API
# Usage: ./create-explainer.sh --content <text> --language zh|en --mode info|story --speakers <id>

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/lib.sh"

CONTENT=""
LANGUAGE=""
MODE="info"
SPEAKERS=""

usage() {
  cat >&2 <<'EOF'
Usage: ./create-explainer.sh --content <text> --language zh|en --mode info|story --speakers <id>

Examples:
  ./create-explainer.sh --content "介绍产品功能" --language zh --mode info --speakers cozy-man-english
EOF
}

while [ $# -gt 0 ]; do
  case "$1" in
    --content)
      CONTENT="${2:-}"
      shift 2
      ;;
    --language|--lang)
      LANGUAGE="${2:-}"
      shift 2
      ;;
    --mode)
      MODE="${2:-info}"
      shift 2
      ;;
    --speakers)
      SPEAKERS="${2:-}"
      shift 2
      ;;
    --help)
      usage
      exit 0
      ;;
    *)
      echo "Error: Unknown argument $1" >&2
      usage
      exit 1
      ;;
  esac
done

if [ -z "$CONTENT" ] || [ -z "$LANGUAGE" ] || [ -z "$SPEAKERS" ]; then
  echo "Error: --content, --language, and --speakers are required" >&2
  usage
  exit 1
fi

if [[ ! "$LANGUAGE" =~ ^(zh|en)$ ]]; then
  echo "Error: language must be zh or en" >&2
  exit 1
fi

if [[ ! "$MODE" =~ ^(info|story)$ ]]; then
  echo "Error: mode must be info or story" >&2
  exit 1
fi

check_jq

SPEAKER_IDS=()
IFS=',' read -r -a SPEAKER_ITEMS <<< "$SPEAKERS"
for speaker_item in "${SPEAKER_ITEMS[@]}"; do
  speaker_item=$(trim_ws "$speaker_item")
  if [ -n "$speaker_item" ]; then
    SPEAKER_IDS+=("$speaker_item")
  fi
done
if [ ${#SPEAKER_IDS[@]} -ne 1 ]; then
  echo "Error: speakers must contain 1 item" >&2
  exit 1
fi

CONTENT_JSON=$(jq -n --arg c "$CONTENT" '$c')
SPEAKERS_JSON=$(printf '%s\n' "${SPEAKER_IDS[@]}" | jq -R '{speakerId: .}' | jq -s '.')
BODY=$(jq -n \
  --argjson content "$CONTENT_JSON" \
  --argjson speakers "$SPEAKERS_JSON" \
  --arg lang "$LANGUAGE" \
  --arg mode "$MODE" \
  '{sources: [{type: "text", content: $content}], speakers: $speakers, language: $lang, mode: $mode}')

api_post "storybook/episodes" "$BODY"

```

### scripts/create-podcast-audio.sh

```bash
#!/usr/bin/env bash
# Generate audio from podcast text content (Stage 2 of two-stage generation)
# Usage: ./create-podcast-audio.sh --episode <episode-id> [--scripts <scripts_json_file>]
#
# Examples:
#   # Use original scripts
#   ./create-podcast-audio.sh --episode "68d699ebc4b373bd1ae50dde"
#
#   # Use modified scripts
#   ./create-podcast-audio.sh --episode "68d699ebc4b373bd1ae50dde" --scripts modified-scripts.json
#
# modified-scripts.json format:
# {
#   "scripts": [
#     {"content": "Hello everyone", "speakerId": "CN-Man-Beijing-V2"},
#     {"content": "Welcome", "speakerId": "chat-girl-105-cn"}
#   ]
# }

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/lib.sh"

EPISODE_ID=""
SCRIPTS_FILE=""

usage() {
  cat >&2 <<'EOF'
Usage: ./create-podcast-audio.sh --episode <episode-id> [--scripts <scripts_json_file>]

Examples:
  # Use original scripts
  ./create-podcast-audio.sh --episode "68d699ebc4b373bd1ae50dde"

  # Use modified scripts
  ./create-podcast-audio.sh --episode "68d699ebc4b373bd1ae50dde" --scripts modified-scripts.json

  # Use stdin for scripts
  echo '{"scripts":[{"content":"Hello","speakerId":"cozy-man-english"}]}' | ./create-podcast-audio.sh --episode "68d699ebc4b373bd1ae50dde" --scripts -

modified-scripts.json format:
{
  "scripts": [
    {"content": "Hello everyone", "speakerId": "CN-Man-Beijing-V2"},
    {"content": "Welcome", "speakerId": "chat-girl-105-cn"}
  ]
}
EOF
}

while [ $# -gt 0 ]; do
  case "$1" in
    --episode)
      EPISODE_ID="${2:-}"
      shift 2
      ;;
    --scripts)
      SCRIPTS_FILE="${2:-}"
      shift 2
      ;;
    --help)
      usage
      exit 0
      ;;
    *)
      echo "Error: Unknown argument $1" >&2
      usage
      exit 1
      ;;
  esac
done

if [ -z "$EPISODE_ID" ]; then
  echo "Error: --episode is required" >&2
  usage
  exit 1
fi

validate_id "$EPISODE_ID" "episode-id"

# Build request body
if [ -n "$SCRIPTS_FILE" ]; then
  if [ "$SCRIPTS_FILE" = "-" ]; then
    BODY=$(cat)
  else
    if [ ! -f "$SCRIPTS_FILE" ]; then
      echo "Error: File not found: $SCRIPTS_FILE" >&2
      exit 1
    fi
    BODY=$(cat "$SCRIPTS_FILE")
  fi

  # Validate JSON format
  if command -v jq &>/dev/null; then
    if ! echo "$BODY" | jq empty 2>/dev/null; then
      echo "Error: Invalid JSON format" >&2
      exit 1
    fi
  fi
else
  BODY="{}"
fi

api_post "podcast/episodes/${EPISODE_ID}/audio" "$BODY"

```
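
The body selection at the end of the script reduces to three cases: no `--scripts` (send `{}`), `--scripts -` (read stdin), and `--scripts <file>`. A minimal sketch of that logic as a standalone function follows. The name `read_scripts_body` is hypothetical and not part of the skill, and the optional JSON validation step is left out.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring how the script picks its request body:
#   ""    -> "{}"  (no modified scripts supplied)
#   "-"   -> JSON read from stdin
#   path  -> JSON read from that file
read_scripts_body() {
  local src="${1:-}"
  if [ -z "$src" ]; then
    printf '%s\n' '{}'
  elif [ "$src" = "-" ]; then
    cat
  elif [ -f "$src" ]; then
    cat "$src"
  else
    echo "Error: File not found: $src" >&2
    return 1
  fi
}
```

Called as `BODY=$(read_scripts_body "$SCRIPTS_FILE") || exit 1`, this would keep the three cases in one place.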

### scripts/create-podcast-text.sh

```bash
#!/usr/bin/env bash
# Create podcast text content (Stage 1 of two-stage generation)
# Usage: ./create-podcast-text.sh --query <text> --language zh|en --mode quick|deep|debate --speakers <id1,id2> [--source-url <url>] [--source-text <text>]
#
# Examples:
#   ./create-podcast-text.sh --query "AI 的未来发展" --language zh --mode deep --speakers cozy-man-english
#   ./create-podcast-text.sh --query "Analyze this article" --language en --mode deep --speakers cozy-man-english --source-url "https://blog.example.com/article"

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/lib.sh"

QUERY=""
LANGUAGE=""
MODE="quick"
SPEAKERS=""
SOURCE_URLS=()
SOURCE_TEXTS=()

usage() {
  cat >&2 <<'EOF'
Usage: ./create-podcast-text.sh --query <text> --language zh|en --mode quick|deep|debate --speakers <id1,id2> [--source-url <url>] [--source-text <text>]

Examples:
  ./create-podcast-text.sh --query "AI 的未来发展" --language zh --mode deep --speakers cozy-man-english
  ./create-podcast-text.sh --query "Analyze this article" --language en --mode deep --speakers cozy-man-english --source-url "https://blog.example.com/article"

Note: This only generates text content. Use create-podcast-audio.sh to generate audio.
EOF
}

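while [ $# -gt 0 ]; do
  # A value-taking flag in last position makes "shift 2" fail without shifting,
  # so the loop would spin forever (or abort silently under set -e).
  case "$1" in
    --query|--language|--lang|--mode|--speakers|--source-url|--source-text)
      if [ $# -lt 2 ]; then
        echo "Error: $1 requires a value" >&2
        usage
        exit 1
      fi
      ;;
  esac
  case "$1" in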
    --query)
      QUERY="${2:-}"
      shift 2
      ;;
    --language|--lang)
      LANGUAGE="${2:-}"
      shift 2
      ;;
    --mode)
      MODE="${2:-quick}"
      shift 2
      ;;
    --speakers)
      SPEAKERS="${2:-}"
      shift 2
      ;;
    --source-url)
      SOURCE_URLS+=("${2:-}")
      shift 2
      ;;
    --source-text)
      SOURCE_TEXTS+=("${2:-}")
      shift 2
      ;;
    --help)
      usage
      exit 0
      ;;
    *)
      echo "Error: Unknown argument $1" >&2
      usage
      exit 1
      ;;
  esac
done

if [ -z "$QUERY" ] || [ -z "$LANGUAGE" ] || [ -z "$SPEAKERS" ]; then
  echo "Error: --query, --language, and --speakers are required" >&2
  usage
  exit 1
fi

check_jq

if [[ ! "$LANGUAGE" =~ ^(zh|en)$ ]]; then
  echo "Error: language must be zh or en" >&2
  exit 1
fi

if [[ ! "$MODE" =~ ^(quick|deep|debate)$ ]]; then
  echo "Error: mode must be quick, deep, or debate" >&2
  exit 1
fi

SPEAKER_IDS=()
IFS=',' read -r -a SPEAKER_ITEMS <<< "$SPEAKERS"
for speaker_item in "${SPEAKER_ITEMS[@]}"; do
  speaker_item=$(trim_ws "$speaker_item")
  if [ -n "$speaker_item" ]; then
    SPEAKER_IDS+=("$speaker_item")
  fi
done
if [ ${#SPEAKER_IDS[@]} -lt 1 ] || [ ${#SPEAKER_IDS[@]} -gt 2 ]; then
  echo "Error: speakers must contain 1-2 items" >&2
  exit 1
fi

if [ "$MODE" = "debate" ] && [ ${#SPEAKER_IDS[@]} -ne 2 ]; then
  echo "Error: debate mode requires 2 speakers" >&2
  exit 1
fi

SOURCE_URLS_CLEAN=()
if [ ${#SOURCE_URLS[@]} -gt 0 ]; then
  for url in "${SOURCE_URLS[@]}"; do
    url=$(trim_ws "$url")
    if [ -n "$url" ]; then
      SOURCE_URLS_CLEAN+=("$url")
    fi
  done
fi
SOURCE_TEXTS_CLEAN=()
if [ ${#SOURCE_TEXTS[@]} -gt 0 ]; then
  for text in "${SOURCE_TEXTS[@]}"; do
    text=$(trim_ws "$text")
    if [ -n "$text" ]; then
      SOURCE_TEXTS_CLEAN+=("$text")
    fi
  done
fi

QUERY_JSON=$(jq -n --arg q "$QUERY" '$q')
SPEAKERS_JSON=$(printf '%s\n' "${SPEAKER_IDS[@]}" | jq -R '{speakerId: .}' | jq -s '.')
SOURCES_JSON="[]"
if [ ${#SOURCE_URLS_CLEAN[@]} -gt 0 ] || [ ${#SOURCE_TEXTS_CLEAN[@]} -gt 0 ]; then
  URL_JSON="[]"
  if [ ${#SOURCE_URLS_CLEAN[@]} -gt 0 ]; then
    URL_JSON=$(printf '%s\0' "${SOURCE_URLS_CLEAN[@]}" | jq -Rs 'split("\u0000")[:-1] | map({type: "url", content: .})')
  fi
  TEXT_JSON="[]"
  if [ ${#SOURCE_TEXTS_CLEAN[@]} -gt 0 ]; then
    TEXT_JSON=$(printf '%s\0' "${SOURCE_TEXTS_CLEAN[@]}" | jq -Rs 'split("\u0000")[:-1] | map({type: "text", content: .})')
  fi
  SOURCES_JSON=$(jq -n --argjson urls "$URL_JSON" --argjson texts "$TEXT_JSON" '$urls + $texts')
fi

if [ "$(echo "$SOURCES_JSON" | jq 'length')" -gt 0 ]; then
  BODY=$(jq -n \
    --argjson query "$QUERY_JSON" \
    --argjson speakers "$SPEAKERS_JSON" \
    --arg lang "$LANGUAGE" \
    --arg mode "$MODE" \
    --argjson sources "$SOURCES_JSON" \
    '{query: $query, speakers: $speakers, language: $lang, mode: $mode, sources: $sources}')
else
  BODY=$(jq -n \
    --argjson query "$QUERY_JSON" \
    --argjson speakers "$SPEAKERS_JSON" \
    --arg lang "$LANGUAGE" \
    --arg mode "$MODE" \
    '{query: $query, speakers: $speakers, language: $lang, mode: $mode}')
fi

api_post "podcast/episodes/text-content" "$BODY"

```
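
The speaker handling above splits `--speakers` on commas, trims each item, drops empty entries, and then enforces 1-2 speakers (exactly 2 in `debate` mode). The sketch below isolates those rules. `parse_speakers` is a hypothetical name, and the `trim_ws` shown here is only a stand-in, since the real helper lives in `lib.sh` and is not reproduced on this page.

```bash
#!/usr/bin/env bash
# Stand-in for lib.sh's trim_ws (the real implementation is not shown here).
trim_ws() {
  local s="$1"
  s="${s#"${s%%[![:space:]]*}"}"   # strip leading whitespace
  s="${s%"${s##*[![:space:]]}"}"   # strip trailing whitespace
  printf '%s' "$s"
}

# Hypothetical wrapper: prints one speaker ID per line, or returns 1.
parse_speakers() {
  local raw="$1" mode="$2" item
  local -a items=() ids=()
  IFS=',' read -r -a items <<< "$raw"
  for item in "${items[@]}"; do
    item=$(trim_ws "$item")
    if [ -n "$item" ]; then
      ids+=("$item")
    fi
  done
  if [ ${#ids[@]} -lt 1 ] || [ ${#ids[@]} -gt 2 ]; then
    echo "Error: speakers must contain 1-2 items" >&2
    return 1
  fi
  if [ "$mode" = "debate" ] && [ ${#ids[@]} -ne 2 ]; then
    echo "Error: debate mode requires 2 speakers" >&2
    return 1
  fi
  printf '%s\n' "${ids[@]}"
}
```

For example, `parse_speakers " cozy-man-english , travel-girl-english " debate` succeeds and prints both IDs, while the same call with a single ID fails.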

### scripts/create-podcast.sh

```bash
#!/usr/bin/env bash
# Create podcast episode via ListenHub API
# Usage: ./create-podcast.sh --query <text> --language zh|en --mode quick|deep|debate --speakers <id1,id2> [--source-url <url>] [--source-text <text>]
#
# Examples:
#   ./create-podcast.sh --query "AI 的未来发展" --language zh --mode deep --speakers cozy-man-english
#   ./create-podcast.sh --query "Discuss the pros and cons of remote work" --language en --mode debate --speakers cozy-man-english,travel-girl-english
#   ./create-podcast.sh --query "Analyze this article" --language en --mode deep --speakers cozy-man-english --source-url "https://blog.example.com/article"

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/lib.sh"

QUERY=""
LANGUAGE=""
MODE="quick"
SPEAKERS=""
SOURCE_URLS=()
SOURCE_TEXTS=()

usage() {
  cat >&2 <<'EOF'
Usage: ./create-podcast.sh --query <text> --language zh|en --mode quick|deep|debate --speakers <id1,id2> [--source-url <url>] [--source-text <text>]

Examples:
  ./create-podcast.sh --query "AI 的未来发展" --language zh --mode deep --speakers cozy-man-english
  ./create-podcast.sh --query "Discuss the pros and cons of remote work" --language en --mode debate --speakers cozy-man-english,travel-girl-english
  ./create-podcast.sh --query "Analyze this article" --language en --mode deep --speakers cozy-man-english --source-url "https://blog.example.com/article"
EOF
}

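while [ $# -gt 0 ]; do
  # A value-taking flag in last position makes "shift 2" fail without shifting,
  # so the loop would spin forever (or abort silently under set -e).
  case "$1" in
    --query|--language|--lang|--mode|--speakers|--source-url|--source-text)
      if [ $# -lt 2 ]; then
        echo "Error: $1 requires a value" >&2
        usage
        exit 1
      fi
      ;;
  esac
  case "$1" in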
    --query)
      QUERY="${2:-}"
      shift 2
      ;;
    --language|--lang)
      LANGUAGE="${2:-}"
      shift 2
      ;;
    --mode)
      MODE="${2:-quick}"
      shift 2
      ;;
    --speakers)
      SPEAKERS="${2:-}"
      shift 2
      ;;
    --source-url)
      SOURCE_URLS+=("${2:-}")
      shift 2
      ;;
    --source-text)
      SOURCE_TEXTS+=("${2:-}")
      shift 2
      ;;
    --help)
      usage
      exit 0
      ;;
    *)
      echo "Error: Unknown argument $1" >&2
      usage
      exit 1
      ;;
  esac
done

if [ -z "$QUERY" ] || [ -z "$LANGUAGE" ] || [ -z "$SPEAKERS" ]; then
  echo "Error: --query, --language, and --speakers are required" >&2
  usage
  exit 1
fi

check_jq

if [[ ! "$LANGUAGE" =~ ^(zh|en)$ ]]; then
  echo "Error: language must be zh or en" >&2
  exit 1
fi

if [[ ! "$MODE" =~ ^(quick|deep|debate)$ ]]; then
  echo "Error: mode must be quick, deep, or debate" >&2
  exit 1
fi

SPEAKER_IDS=()
IFS=',' read -r -a SPEAKER_ITEMS <<< "$SPEAKERS"
for speaker_item in "${SPEAKER_ITEMS[@]}"; do
  speaker_item=$(trim_ws "$speaker_item")
  if [ -n "$speaker_item" ]; then
    SPEAKER_IDS+=("$speaker_item")
  fi
done
if [ ${#SPEAKER_IDS[@]} -lt 1 ] || [ ${#SPEAKER_IDS[@]} -gt 2 ]; then
  echo "Error: speakers must contain 1-2 items" >&2
  exit 1
fi

if [ "$MODE" = "debate" ] && [ ${#SPEAKER_IDS[@]} -ne 2 ]; then
  echo "Error: debate mode requires 2 speakers" >&2
  exit 1
fi

SOURCE_URLS_CLEAN=()
if [ ${#SOURCE_URLS[@]} -gt 0 ]; then
  for url in "${SOURCE_URLS[@]}"; do
    url=$(trim_ws "$url")
    if [ -n "$url" ]; then
      SOURCE_URLS_CLEAN+=("$url")
    fi
  done
fi
SOURCE_TEXTS_CLEAN=()
if [ ${#SOURCE_TEXTS[@]} -gt 0 ]; then
  for text in "${SOURCE_TEXTS[@]}"; do
    text=$(trim_ws "$text")
    if [ -n "$text" ]; then
      SOURCE_TEXTS_CLEAN+=("$text")
    fi
  done
fi

QUERY_JSON=$(jq -n --arg q "$QUERY" '$q')
SPEAKERS_JSON=$(printf '%s\n' "${SPEAKER_IDS[@]}" | jq -R '{speakerId: .}' | jq -s '.')
SOURCES_JSON="[]"
if [ ${#SOURCE_URLS_CLEAN[@]} -gt 0 ] || [ ${#SOURCE_TEXTS_CLEAN[@]} -gt 0 ]; then
  URL_JSON="[]"
  if [ ${#SOURCE_URLS_CLEAN[@]} -gt 0 ]; then
    URL_JSON=$(printf '%s\0' "${SOURCE_URLS_CLEAN[@]}" | jq -Rs 'split("\u0000")[:-1] | map({type: "url", content: .})')
  fi
  TEXT_JSON="[]"
  if [ ${#SOURCE_TEXTS_CLEAN[@]} -gt 0 ]; then
    TEXT_JSON=$(printf '%s\0' "${SOURCE_TEXTS_CLEAN[@]}" | jq -Rs 'split("\u0000")[:-1] | map({type: "text", content: .})')
  fi
  SOURCES_JSON=$(jq -n --argjson urls "$URL_JSON" --argjson texts "$TEXT_JSON" '$urls + $texts')
fi

if [ "$(echo "$SOURCES_JSON" | jq 'length')" -gt 0 ]; then
  BODY=$(jq -n \
    --argjson query "$QUERY_JSON" \
    --argjson speakers "$SPEAKERS_JSON" \
    --arg lang "$LANGUAGE" \
    --arg mode "$MODE" \
    --argjson sources "$SOURCES_JSON" \
    '{query: $query, speakers: $speakers, language: $lang, mode: $mode, sources: $sources}')
else
  BODY=$(jq -n \
    --argjson query "$QUERY_JSON" \
    --argjson speakers "$SPEAKERS_JSON" \
    --arg lang "$LANGUAGE" \
    --arg mode "$MODE" \
    '{query: $query, speakers: $speakers, language: $lang, mode: $mode}')
fi

api_post "podcast/episodes" "$BODY"

```
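
Because `--source-url` and `--source-text` append to arrays, each flag can be repeated on one command line. The sketch below reduces the parser to two flags to show that accumulation, and it also rejects a flag given without a value. `parse_args` is a hypothetical name used only for this illustration.

```bash
#!/usr/bin/env bash
# Hypothetical, reduced parser: handles only --mode and the repeatable --source-url.
parse_args() {
  MODE="quick"
  SOURCE_URLS=()
  while [ $# -gt 0 ]; do
    case "$1" in
      --mode|--source-url)
        # Refuse a flag that has no value after it.
        if [ $# -lt 2 ]; then
          echo "Error: $1 requires a value" >&2
          return 1
        fi
        if [ "$1" = "--mode" ]; then
          MODE="$2"
        else
          SOURCE_URLS+=("$2")   # repeated flags accumulate in order
        fi
        shift 2
        ;;
      *)
        echo "Error: Unknown argument $1" >&2
        return 1
        ;;
    esac
  done
}
```

After `parse_args --source-url https://a.example/1 --mode deep --source-url https://a.example/2`, `SOURCE_URLS` holds both URLs in the order given and `MODE` is `deep`.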

### scripts/create-speech.sh

```bash
#!/usr/bin/env bash
# Create multi-speaker audio from scripts via ListenHub API
# Usage: ./create-speech.sh --scripts <scripts_json_file|->
#
# Example:
#   ./create-speech.sh --scripts scripts.json
#
# scripts.json format:
# {
#   "scripts": [
#     {"content": "Hello everyone", "speakerId": "cozy-man-english"},
#     {"content": "Welcome to the show", "speakerId": "travel-girl-english"}
#   ]
# }

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/lib.sh"

SCRIPTS_FILE=""

usage() {
  cat >&2 <<'EOF'
Usage: ./create-speech.sh --scripts <scripts_json_file|->

Example:
  ./create-speech.sh --scripts scripts.json

scripts.json format:
{
  "scripts": [
    {"content": "Hello everyone", "speakerId": "cozy-man-english"},
    {"content": "Welcome to the show", "speakerId": "travel-girl-english"}
  ]
}

Or pipe JSON via stdin:
  echo '{"scripts":[{"content":"Hello","speakerId":"cozy-man-english"}]}' | ./create-speech.sh --scripts -
EOF
}

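while [ $# -gt 0 ]; do
  # A value-taking flag in last position makes "shift 2" fail without shifting,
  # so the loop would spin forever (or abort silently under set -e).
  case "$1" in
    --scripts)
      if [ $# -lt 2 ]; then
        echo "Error: $1 requires a value" >&2
        usage
        exit 1
      fi
      ;;
  esac
  case "$1" in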
    --scripts)
      SCRIPTS_FILE="${2:-}"
      shift 2
      ;;
    --help)
      usage
      exit 0
      ;;
    *)
      echo "Error: Unknown argument $1" >&2
      usage
      exit 1
      ;;
  esac
done

if [ -z "$SCRIPTS_FILE" ]; then
  echo "Error: --scripts is required" >&2
  usage
  exit 1
fi

check_jq

# Read scripts from file or stdin
if [ "$SCRIPTS_FILE" = "-" ]; then
  BODY=$(cat)
else
  if [ ! -f "$SCRIPTS_FILE" ]; then
    echo "Error: File not found: $SCRIPTS_FILE" >&2
    exit 1
  fi
  BODY=$(cat "$SCRIPTS_FILE")
fi

# Validate JSON format
if ! echo "$BODY" | jq empty 2>/dev/null; then
  echo "Error: Invalid JSON format" >&2
  exit 1
fi
if ! echo "$BODY" | jq -e '
  (.scripts | type == "array") and
  ((.scripts | length) > 0) and
  (all(.scripts[]; (.content | type == "string" and length > 0) and (.speakerId | type == "string" and length > 0)))
' >/dev/null 2>&1; then
  echo "Error: Invalid scripts structure (require scripts[].content and scripts[].speakerId)" >&2
  exit 1
fi

api_post "speech" "$BODY"

```
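
The `scripts.json` document expected above can be assembled from a plain transcript. The following is a minimal sketch in pure bash, assuming a hypothetical input format of one `speakerId|content` pair per line. Both function names are illustrative, and the escaping covers only backslashes and double quotes, so `jq` remains the more robust choice for arbitrary text.

```bash
#!/usr/bin/env bash
# Escape backslashes and double quotes for use inside a JSON string.
# Sketch only: control characters are not handled.
json_escape() {
  local s="$1"
  s="${s//\\/\\\\}"   # backslash first
  s="${s//\"/\\\"}"   # then double quote
  printf '%s' "$s"
}

# Hypothetical converter: reads "speakerId|content" lines on stdin and
# prints a {"scripts":[...]} document on stdout.
transcript_to_scripts() {
  local speaker content sep=""
  printf '{"scripts":['
  while IFS='|' read -r speaker content; do
    if [ -z "$speaker" ]; then
      continue   # skip blank lines
    fi
    printf '%s{"content":"%s","speakerId":"%s"}' \
      "$sep" "$(json_escape "$content")" "$(json_escape "$speaker")"
    sep=","
  done
  printf ']}\n'
}
```

The result can be piped straight into the script, for example `transcript_to_scripts < transcript.txt | ./create-speech.sh --scripts -`, where `transcript.txt` is a placeholder file name.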

### scripts/create-tts.sh

```bash
#!/usr/bin/env bash
# Create FlowSpeech audio via ListenHub API
# Usage: ./create-tts.sh --type text|url --content <text|url> --language zh|en --mode smart|direct --speakers <id>
#
# Examples:
#   ./create-tts.sh --type text --content "欢迎使用 ListenHub 音频生成服务" --language zh --mode smart --speakers cozy-man-english
#   ./create-tts.sh --type url --content "https://example.com/article.html" --language en --mode smart --speakers cozy-man-english

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/lib.sh"

TYPE=""
CONTENT=""
LANGUAGE=""
MODE="direct"
SPEAKERS=""

usage() {
  cat >&2 <<'EOF'
Usage: ./create-tts.sh --type text|url --content <text|url> --language zh|en --mode smart|direct --speakers <id>

Examples:
  ./create-tts.sh --type text --content "欢迎使用 ListenHub 音频生成服务" --language zh --mode smart --speakers cozy-man-english
  ./create-tts.sh --type url --content "https://example.com/article.html" --language en --mode smart --speakers cozy-man-english

Notes:
  - Text input length limit: 10000 characters
EOF
}

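while [ $# -gt 0 ]; do
  # A value-taking flag in last position makes "shift 2" fail without shifting,
  # so the loop would spin forever (or abort silently under set -e).
  case "$1" in
    --type|--content|--language|--lang|--mode|--speakers)
      if [ $# -lt 2 ]; then
        echo "Error: $1 requires a value" >&2
        usage
        exit 1
      fi
      ;;
  esac
  case "$1" in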
    --type)
      TYPE="${2:-}"
      shift 2
      ;;
    --content)
      CONTENT="${2:-}"
      shift 2
      ;;
    --language|--lang)
      LANGUAGE="${2:-}"
      shift 2
      ;;
    --mode)
      MODE="${2:-direct}"
      shift 2
      ;;
    --speakers)
      SPEAKERS="${2:-}"
      shift 2
      ;;
    --help)
      usage
      exit 0
      ;;
    *)
      echo "Error: Unknown argument $1" >&2
      usage
      exit 1
      ;;
  esac
done

if [ -z "$TYPE" ] || [ -z "$CONTENT" ] || [ -z "$LANGUAGE" ] || [ -z "$SPEAKERS" ]; then
  echo "Error: --type, --content, --language, and --speakers are required" >&2
  usage
  exit 1
fi

if [[ ! "$TYPE" =~ ^(text|url)$ ]]; then
  echo "Error: type must be text or url" >&2
  exit 1
fi

if [[ ! "$LANGUAGE" =~ ^(zh|en)$ ]]; then
  echo "Error: language must be zh or en" >&2
  exit 1
fi

if [[ ! "$MODE" =~ ^(smart|direct)$ ]]; then
  echo "Error: mode must be smart or direct" >&2
  exit 1
fi

check_jq

if [ "$TYPE" = "text" ]; then
  CONTENT_LEN=$(printf '%s' "$CONTENT" | wc -m | tr -d ' ')
  if [ "$CONTENT_LEN" -gt 10000 ]; then
    echo "Error: text content exceeds 10000 characters" >&2
    exit 1
  fi
fi

SPEAKER_IDS=()
IFS=',' read -r -a SPEAKER_ITEMS <<< "$SPEAKERS"
for speaker_item in "${SPEAKER_ITEMS[@]}"; do
  speaker_item=$(trim_ws "$speaker_item")
  if [ -n "$speaker_item" ]; then
    SPEAKER_IDS+=("$speaker_item")
  fi
done
if [ ${#SPEAKER_IDS[@]} -ne 1 ]; then
  echo "Error: speakers must contain 1 item" >&2
  exit 1
fi

CONTENT_JSON=$(jq -n --arg c "$CONTENT" '$c')
SPEAKERS_JSON=$(printf '%s\n' "${SPEAKER_IDS[@]}" | jq -R '{speakerId: .}' | jq -s '.')
BODY=$(jq -n \
  --argjson content "$CONTENT_JSON" \
  --arg type "$TYPE" \
  --argjson speakers "$SPEAKERS_JSON" \
  --arg lang "$LANGUAGE" \
  --arg mode "$MODE" \
  '{sources: [{type: $type, content: $content}], speakers: $speakers, language: $lang, mode: $mode}')

api_post "flow-speech/episodes" "$BODY"

```
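
The 10000-character limit for `--type text` is enforced by counting characters with `wc -m`. A minimal sketch of that check as a reusable predicate is shown below; `text_within_limit` and `MAX_TEXT_CHARS` are hypothetical names introduced for illustration.

```bash
#!/usr/bin/env bash
MAX_TEXT_CHARS=10000   # limit stated in the script's usage notes

# Hypothetical helper: succeeds when the text fits within the limit.
text_within_limit() {
  local len
  # wc -m counts characters; tr strips the padding some platforms add.
  len=$(printf '%s' "$1" | wc -m | tr -d ' ')
  [ "$len" -le "$MAX_TEXT_CHARS" ]
}
```

Note that `wc -m` counts characters according to the current locale, so multibyte text such as Chinese may be counted per byte in a non-UTF-8 locale and hit the limit earlier than expected.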

### scripts/generate-image.sh

```bash
#!/usr/bin/env bash
set -euo pipefail

# ============================================
# Labnana Image Generation Script
# API: https://docs.marswave.ai/openapi-labnana.html
# Platform: macOS, Linux, Windows (Git Bash/WSL)
# ============================================

PROMPT=""
SIZE="2K"
RATIO="16:9"
REFERENCE_IMAGES=""

if [ $# -gt 0 ] && [[ "$1" != --* ]]; then
  PROMPT="${1:-}"
  SIZE="${2:-2K}"
  RATIO="${3:-16:9}"
  REFERENCE_IMAGES="${4:-}"
else
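  while [ $# -gt 0 ]; do
    # Under set -e, "shift 2" with a single argument left aborts the script
    # silently; report the missing value instead.
    case "$1" in
      --prompt|--size|--ratio|--reference-images)
        if [ $# -lt 2 ]; then
          echo "Error: $1 requires a value" >&2
          exit 1
        fi
        ;;
    esac
    case "$1" in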
      --prompt)
        PROMPT="${2:-}"
        shift 2
        ;;
      --size)
        SIZE="${2:-2K}"
        shift 2
        ;;
      --ratio)
        RATIO="${2:-16:9}"
        shift 2
        ;;
      --reference-images)
        REFERENCE_IMAGES="${2:-}"
        shift 2
        ;;
      --help)
        echo "Usage: $0 --prompt \"<prompt>\" [--size 1K|2K|4K] [--ratio <ratio>] [--reference-images \"url1,url2\"]" >&2
        echo "       $0 \"<prompt>\" [size] [ratio] [reference_images]  (legacy positional form)" >&2
        echo "  size: 1K | 2K | 4K (default: 2K)" >&2
        echo "  ratio: 16:9 | 1:1 | 9:16 | 2:3 | 3:2 | 3:4 | 4:3 | 21:9 (default: 16:9)" >&2
        echo "  reference-images: comma-separated URLs (max 14), e.g. \"url1,url2\"" >&2
        exit 0
        ;;
      *)
        echo "Error: Unknown argument $1" >&2
        exit 1
        ;;
    esac
  done
fi

# Configuration
API_ENDPOINT="https://api.labnana.com/openapi/v1/images/generation"
AGENT_SKILLS_CLIENT_ID="PJBkELS1o_q9nJ~NzF2_Fmr21TNX&~eoJR49FFdFhD3U"
MAX_RETRIES=3
INITIAL_TIMEOUT=600
RETRY_DELAY=5

# Temp file tracking for interrupt cleanup
TEMP_OUTPUT_FILE=""

# ============================================
# Platform detection
# ============================================

detect_platform() {
  case "$(uname -s)" in
    Darwin*)  echo "macos" ;;
    Linux*)   echo "linux" ;;
    CYGWIN*|MINGW*|MSYS*) echo "windows" ;;
    *)        echo "unknown" ;;
  esac
}

PLATFORM=$(detect_platform)

# ============================================
# Interrupt cleanup
# ============================================

cleanup() {
  if [ -n "$TEMP_OUTPUT_FILE" ] && [ -f "$TEMP_OUTPUT_FILE" ]; then
    rm -f "$TEMP_OUTPUT_FILE"
  fi
}

trap cleanup EXIT INT TERM

# ============================================
# Cross-platform utility functions
# ============================================

# Cross-platform sed -i (the syntax differs between macOS and Linux)
sed_inplace() {
  local file="$1"
  local pattern="$2"

  if [ "$PLATFORM" = "macos" ]; then
    sed -i '' "$pattern" "$file"
  else
    sed -i "$pattern" "$file"
  fi
}

# Cross-platform base64 decode
base64_decode() {
  local input="$1"
  local output="$2"

  # Try different base64 decode methods
  if echo "$input" | base64 -d > "$output" 2>/dev/null; then
    return 0
  elif echo "$input" | base64 -D > "$output" 2>/dev/null; then
    # Older macOS versions use -D
    return 0
  elif echo "$input" | base64 --decode > "$output" 2>/dev/null; then
    # Some systems use --decode
    return 0
  elif command -v openssl &>/dev/null; then
    # Fallback: use openssl
    echo "$input" | openssl base64 -d > "$output" 2>/dev/null
    return $?
  else
    return 1
  fi
}

# Cross-platform curl finder
find_curl() {
  # Try different paths by priority
  local curl_paths=(
    "/usr/bin/curl"           # macOS, Linux standard path
    "/bin/curl"               # Some Linux distributions
    "/usr/local/bin/curl"     # Homebrew (macOS)
    "/mingw64/bin/curl"       # Git Bash (Windows)
    "/c/Windows/System32/curl.exe"  # Windows built-in curl
  )

  for path in "${curl_paths[@]}"; do
    if [ -x "$path" ]; then
      echo "$path"
      return 0
    fi
  done

  # Finally try curl in PATH
  if command -v curl &>/dev/null; then
    command -v curl
    return 0
  fi

  return 1
}

# Cross-platform shell config file getter
get_shell_rc() {
  local rc_files=()

  case "$PLATFORM" in
    macos)
      # macOS defaults to zsh, but may use bash
      rc_files=(~/.zshrc ~/.bash_profile ~/.bashrc ~/.profile)
      ;;
    linux)
      # Linux typically uses bashrc
      rc_files=(~/.bashrc ~/.zshrc ~/.profile)
      ;;
    windows)
      # Git Bash uses bashrc
      rc_files=(~/.bashrc ~/.bash_profile ~/.profile)
      ;;
    *)
      rc_files=(~/.bashrc ~/.zshrc ~/.profile)
      ;;
  esac

  # Return first existing file, or first as default
  for rc in "${rc_files[@]}"; do
    if [ -f "$rc" ]; then
      echo "$rc"
      return 0
    fi
  done

  # If none exist, return platform default
  case "$PLATFORM" in
    macos)  echo ~/.zshrc ;;
    *)      echo ~/.bashrc ;;
  esac
}

# Generate random number (compatible with environments without $RANDOM)
get_random() {
  if [ -n "${RANDOM:-}" ]; then
    echo $((RANDOM % 10000))
  elif [ -r /dev/urandom ]; then
    # /dev/urandom is a character device, so -f (regular file) is never true; test readability
    od -An -tu2 -N2 /dev/urandom | tr -d ' '
  else
    # Last fallback: use part of timestamp
    date +%N 2>/dev/null | cut -c1-4 || echo "0000"
  fi
}

# ============================================
# Environment variable loading (supports multiple formats)
# ============================================

load_env_var() {
  local var_name="$1"
  local rc_file
  local line
  local value=""

  for rc_file in ~/.zshrc ~/.bashrc ~/.bash_profile ~/.profile; do
    if [ -f "$rc_file" ]; then
      # Find export VAR=... lines (supports =, =", =' formats)
      line=$(grep -E "^export ${var_name}=" "$rc_file" 2>/dev/null | tail -1) || true
      if [ -n "$line" ]; then
        # Extract part after equals sign
        value="${line#*=}"
        # Remove leading/trailing quotes (double or single)
        value="${value#\"}"
        value="${value%\"}"
        value="${value#\'}"
        value="${value%\'}"
        # Expand $HOME and ~
        value="${value/\$HOME/$HOME}"
        value="${value/#\~/$HOME}"
        if [ -n "$value" ]; then
          export "$var_name"="$value"
          return 0
        fi
      fi
    fi
  done
  return 1
}

[ -z "${LISTENHUB_API_KEY:-}" ] && load_env_var "LISTENHUB_API_KEY" || true
[ -z "${LISTENHUB_OUTPUT_DIR:-}" ] && load_env_var "LISTENHUB_OUTPUT_DIR" || true

# ============================================
# Dependency check and installation guide
# ============================================

check_dependencies() {
  local missing_deps=()
  local install_cmd=""

  # Check jq
  if ! command -v jq &>/dev/null; then
    missing_deps+=("jq")
  fi

  # Check curl
  if ! find_curl &>/dev/null; then
    missing_deps+=("curl")
  fi

  # If missing dependencies, auto-install
  if [ ${#missing_deps[@]} -gt 0 ]; then
    echo "→ Missing required tools: ${missing_deps[*]}" >&2
    echo "  Auto-installing..." >&2
    echo "" >&2

    case "$PLATFORM" in
      macos)
        install_cmd="brew install ${missing_deps[*]}"
        if ! command -v brew &>/dev/null; then
          echo "Error: Homebrew not detected" >&2
          echo "  Please install Homebrew first: https://brew.sh" >&2
          echo "  Or install manually: ${missing_deps[*]}" >&2
          exit 1
        fi
        ;;
      linux)
        # Detect Linux distribution
        if command -v apt-get &>/dev/null; then
          install_cmd="sudo apt-get update && sudo apt-get install -y ${missing_deps[*]}"
        elif command -v yum &>/dev/null; then
          install_cmd="sudo yum install -y ${missing_deps[*]}"
        elif command -v dnf &>/dev/null; then
          install_cmd="sudo dnf install -y ${missing_deps[*]}"
        elif command -v pacman &>/dev/null; then
          install_cmd="sudo pacman -S --noconfirm ${missing_deps[*]}"
        else
          echo "Error: No supported package manager detected" >&2
          echo "  Please install manually: ${missing_deps[*]}" >&2
          exit 1
        fi
        ;;
      windows)
        if command -v choco &>/dev/null; then
          install_cmd="choco install -y ${missing_deps[*]}"
        elif command -v scoop &>/dev/null; then
          install_cmd="scoop install ${missing_deps[*]}"
        else
          echo "Error: Chocolatey or Scoop not detected" >&2
          echo "  Please install a package manager first:" >&2
          echo "  - Chocolatey: https://chocolatey.org/install" >&2
          echo "  - Scoop: https://scoop.sh" >&2
          echo "  Or download manually: https://stedolan.github.io/jq/download/" >&2
          exit 1
        fi
        ;;
      *)
        echo "Error: Unsupported platform" >&2
        echo "  Please install manually: ${missing_deps[*]}" >&2
        exit 1
        ;;
    esac

    # Execute installation
    if eval "$install_cmd"; then
      echo "" >&2
      echo "✓ Dependencies installed successfully" >&2
      echo "" >&2
    else
      echo "" >&2
      echo "Error: Auto-installation failed" >&2
      echo "  Please run manually: $install_cmd" >&2
      exit 1
    fi
  fi
}

# ============================================
# First-time configuration check
# ============================================

setup_config() {
  local shell_rc
  shell_rc=$(get_shell_rc)

  echo "→ Welcome to Labnana image generation! Configuration needed." >&2
  echo "  Detected platform: $PLATFORM" >&2
  echo "" >&2

  # Check dependencies first
  check_dependencies

  # Configure API Key
  if [ -z "${LISTENHUB_API_KEY:-}" ]; then
    echo "1. API Key" >&2
    echo "   Visit https://listenhub.ai/settings/api-keys" >&2
    echo "   (Requires subscription)" >&2
    echo "" >&2
    echo -n "   Please paste your API key: " >&2
    read -r api_key

    if [ -z "$api_key" ]; then
      echo "Error: API key cannot be empty" >&2
      exit 1
    fi

    # Check if config already exists (avoid duplicate append)
    if ! grep -q "^export LISTENHUB_API_KEY=" "$shell_rc" 2>/dev/null; then
      echo "export LISTENHUB_API_KEY=\"$api_key\"" >> "$shell_rc"
    else
      # If exists, replace
      sed_inplace "$shell_rc" "s|^export LISTENHUB_API_KEY=.*|export LISTENHUB_API_KEY=\"$api_key\"|"
    fi
    export LISTENHUB_API_KEY="$api_key"
    echo "" >&2
  fi

  # Configure output path
  if [ -z "${LISTENHUB_OUTPUT_DIR:-}" ]; then
    echo "2. Output path" >&2
    echo -n "   Image save location (default: ~/Downloads): " >&2
    read -r output_dir

    # Default to ~/Downloads
    if [ -z "$output_dir" ]; then
      output_dir="$HOME/Downloads"
    fi

    # Expand ~ symbol
    output_dir="${output_dir/#\~/$HOME}"

    # Create directory if not exists
    mkdir -p "$output_dir"

    # Check if config already exists (avoid duplicate append)
    if ! grep -q "^export LISTENHUB_OUTPUT_DIR=" "$shell_rc" 2>/dev/null; then
      echo "export LISTENHUB_OUTPUT_DIR=\"$output_dir\"" >> "$shell_rc"
    else
      sed_inplace "$shell_rc" "s|^export LISTENHUB_OUTPUT_DIR=.*|export LISTENHUB_OUTPUT_DIR=\"$output_dir\"|"
    fi
    export LISTENHUB_OUTPUT_DIR="$output_dir"
    echo "" >&2
  fi

  echo "✓ Configuration saved to $shell_rc" >&2
  echo "" >&2
}

# Check and execute first-time configuration
if [ -z "${LISTENHUB_API_KEY:-}" ] || [ -z "${LISTENHUB_OUTPUT_DIR:-}" ]; then
  setup_config
fi

# ============================================
# Parameter validation
# ============================================

if [ -z "$PROMPT" ]; then
  echo "Usage: $0 --prompt \"<prompt>\" [--size 1K|2K|4K] [--ratio 16:9|1:1|9:16|2:3|3:2|3:4|4:3|21:9] [--reference-images \"url1,url2\"]" >&2
  echo "  size: 1K | 2K | 4K (default: 2K)" >&2
  echo "  ratio: 16:9 | 1:1 | 9:16 | 2:3 | 3:2 | 3:4 | 4:3 | 21:9 (default: 16:9)" >&2
  echo "  reference-images: comma-separated URLs (max 14), e.g. \"url1,url2\"" >&2
  echo "" >&2
  echo "Examples:" >&2
  echo "  $0 --prompt \"a cute cat\" --size 2K --ratio 1:1" >&2
  echo "  $0 --prompt \"cyberpunk city at night\" --size 4K --ratio 16:9" >&2
  echo "  $0 --prompt \"similar style\" --size 2K --ratio 16:9 --reference-images \"https://example.com/ref1.jpg,https://example.com/ref2.png\"" >&2
  echo "" >&2
  echo "Legacy positional form is still supported:" >&2
  echo "  $0 \"a cute cat\" 2K 1:1" >&2
  exit 1
fi

# Validate size parameter
case "$SIZE" in
  1K|2K|4K) ;;
  *) echo "Error: size must be 1K, 2K or 4K" >&2; exit 1 ;;
esac

# Validate ratio parameter
case "$RATIO" in
  16:9|1:1|9:16|2:3|3:2|3:4|4:3|21:9) ;;
  *) echo "Error: ratio $RATIO not supported" >&2; exit 1 ;;
esac

# ============================================
# JSON construction (compatible with environments without jq)
# ============================================

# Infer MIME type from URL
detect_mime_type() {
  local url="$1"
  local ext="${url##*.}"
  ext=$(echo "$ext" | tr '[:upper:]' '[:lower:]')  # Convert to lowercase (bash 3.2 compatible)

  case "$ext" in
    jpg|jpeg) echo "image/jpeg" ;;
    png) echo "image/png" ;;
    gif) echo "image/gif" ;;
    webp) echo "image/webp" ;;
    bmp) echo "image/bmp" ;;
    *) echo "image/jpeg" ;;  # Default
  esac
}

# Build referenceImages JSON array
build_reference_images_json() {
  local urls="$1"
  local json_array="[]"

  if [ -z "$urls" ]; then
    echo "$json_array"
    return
  fi

  if command -v jq &> /dev/null; then
    # Use jq to build array
    local count=0
    json_array=$(echo "$urls" | tr ',' '\n' | while IFS= read -r url; do
      url=$(echo "$url" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')  # Trim whitespace (preserve special chars)
      if [ -n "$url" ]; then
        count=$((count + 1))
        if [ $count -gt 14 ]; then
          echo "Warning: Max 14 reference images supported, extras ignored" >&2
          break
        fi
        mime_type=$(detect_mime_type "$url")
        jq -n --arg uri "$url" --arg mime "$mime_type" \
          '{fileData: {fileUri: $uri, mimeType: $mime}}'
      fi
    done | jq -s '.')
  else
    # Manually build JSON array
    local items=()
    local count=0
    while IFS= read -r url; do
      url=$(echo "$url" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')  # Trim whitespace (preserve special chars)
      if [ -n "$url" ]; then
        count=$((count + 1))
        if [ $count -gt 14 ]; then
          echo "Warning: Max 14 reference images supported, extras ignored" >&2
          break
        fi
        mime_type=$(detect_mime_type "$url")
        # Escape special characters in URL
        local escaped_url="$url"
        escaped_url="${escaped_url//\\/\\\\}"
        escaped_url="${escaped_url//\"/\\\"}"
        items+=("{\"fileData\":{\"fileUri\":\"$escaped_url\",\"mimeType\":\"$mime_type\"}}")
      fi
    done < <(echo "$urls" | tr ',' '\n')

    if [ ${#items[@]} -gt 0 ]; then
      json_array="[$(IFS=,; echo "${items[*]}")]"
    fi
  fi

  echo "$json_array"
}

build_json_payload() {
  local prompt="$1"
  local size="$2"
  local ratio="$3"
  local ref_images="$4"

  if command -v jq &> /dev/null; then
    # When jq available, directly build complete JSON object (auto-handles all escaping)
    local ref_json
    ref_json=$(build_reference_images_json "$ref_images")

    jq -n \
      --arg prompt "$prompt" \
      --arg size "$size" \
      --arg ratio "$ratio" \
      --argjson refs "$ref_json" \
      'if ($refs | length) > 0 then
        {provider: "google", prompt: $prompt, imageConfig: {imageSize: $size, aspectRatio: $ratio}, referenceImages: $refs}
      else
        {provider: "google", prompt: $prompt, imageConfig: {imageSize: $size, aspectRatio: $ratio}}
      end'
  else
    # Without jq, manually escape JSON special characters
    local escaped_prompt="$prompt"
    # Escape in order: backslash must be first
    escaped_prompt="${escaped_prompt//\\/\\\\}"     # \ -> \\
    escaped_prompt="${escaped_prompt//\"/\\\"}"     # " -> \"
    escaped_prompt="${escaped_prompt//$'\n'/\\n}"   # newline -> \n
    escaped_prompt="${escaped_prompt//$'\r'/\\r}"   # carriage return -> \r
    escaped_prompt="${escaped_prompt//$'\t'/\\t}"   # Tab -> \t
    # Replace the remaining control characters (0x00-0x1F, excluding the already-escaped \n, \r, \t) with spaces
    escaped_prompt=$(printf '%s' "$escaped_prompt" | tr '\000-\010\013\014\016-\037' ' ')

    local base_json="{\"provider\":\"google\",\"prompt\":\"$escaped_prompt\",\"imageConfig\":{\"imageSize\":\"$size\",\"aspectRatio\":\"$ratio\"}"

    if [ -n "$ref_images" ]; then
      local ref_json
      ref_json=$(build_reference_images_json "$ref_images")
      echo "${base_json},\"referenceImages\":${ref_json}}"
    else
      echo "${base_json}}"
    fi
  fi
}

# ============================================
# API call (with retry and fallback)
# ============================================

call_api_with_retry() {
  local payload="$1"
  local attempt=1
  local timeout=$INITIAL_TIMEOUT
  local response=""
  local http_code=""
  local body=""
  local sleep_time=0

  # Find curl
  local curl_cmd
  curl_cmd=$(find_curl) || {
    echo "Error: curl command not found" >&2
    echo "  Please install curl: https://curl.se/download.html" >&2
    return 1
  }

  while [ $attempt -le $MAX_RETRIES ]; do
    echo "→ Generating... (attempt $attempt/$MAX_RETRIES, timeout ${timeout}s)" >&2

    response=$("$curl_cmd" -s -w "\n%{http_code}" -X POST \
      "$API_ENDPOINT" \
      -H "Authorization: Bearer $LISTENHUB_API_KEY" \
      -H "Content-Type: application/json" \
      -H "x-marswave-client-id: $AGENT_SKILLS_CLIENT_ID" \
      -d "$payload" \
      --max-time "$timeout" 2>&1) || true

    # Separate response body and status code
    http_code=$(echo "$response" | tail -n1)
    body=$(echo "$response" | sed '$d')

    # Reset wait time
    sleep_time=$RETRY_DELAY

    # Handle various cases
    case "$http_code" in
      200)
        echo "$body"
        return 0
        ;;
      000)
        # Connection timeout or network error
        echo "  ⚠ Network timeout, retrying in ${sleep_time}s..." >&2
        ;;
      504|502|503)
        # Server timeout/gateway error
        echo "  ⚠ Service busy (HTTP $http_code), retrying in ${sleep_time}s..." >&2
        ;;
      429)
        # Rate limiting - wait longer
        sleep_time=$((RETRY_DELAY * 3))
        echo "  ⚠ Too many requests, retrying in ${sleep_time}s..." >&2
        ;;
      401)
        echo "Error: API Key invalid or expired" >&2
        echo "  Please check LISTENHUB_API_KEY or get a new one: https://listenhub.ai/settings/api-keys" >&2
        return 1
        ;;
      402)
        echo "Error: Insufficient credits" >&2
        echo "  Please recharge: https://labnana.com/pricing" >&2
        return 1
        ;;
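      400)
        local error_msg
        # Without pipefail the pipeline reports cut's exit status, so a failed
        # grep never triggers an `||` fallback; apply the default on empty output instead
        error_msg=$(echo "$body" | grep -o '"message":"[^"]*"' | cut -d'"' -f4 2>/dev/null) || true
        [ -n "$error_msg" ] || error_msg="Request parameter error"
        echo "Error: $error_msg" >&2
        return 1
        ;;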
      *)
        echo "  ⚠ Unknown error (HTTP $http_code), retrying in ${sleep_time}s..." >&2
        ;;
    esac

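    # Wait before the next attempt (skip the wait once retries are exhausted)
    if [ "$attempt" -lt "$MAX_RETRIES" ]; then
      sleep "$sleep_time"
    fi
    attempt=$((attempt + 1))
    # Increase the timeout for the next attempt (fallback strategy)
    timeout=$((timeout + 30))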
  done

  echo "Error: Failed after multiple retries" >&2
  echo "  Last status: HTTP $http_code" >&2
  echo "  Suggestion: Try again later, or check network connection" >&2
  return 1
}

# ============================================
# Extract base64 image data
# ============================================

extract_image_data() {
  local body="$1"
  local base64_data=""

  if command -v jq &> /dev/null; then
    # When jq is available, try several possible response paths
    base64_data=$(echo "$body" | jq -r '
      .candidates[0].content.parts[0].inlineData.data //
      .candidates[0].content.parts[0].inline_data.data //
      .data //
      empty
    ' 2>/dev/null) || true
  else
    # Without jq, use grep to extract (base64 charset doesn't contain quotes, so this method is safe)
    base64_data=$(echo "$body" | grep -o '"data":"[^"]*"' | tail -1 | cut -d'"' -f4 2>/dev/null) || true
  fi

  if [ -z "$base64_data" ] || [ "$base64_data" = "null" ]; then
    return 1
  fi

  echo "$base64_data"
}

# ============================================
# Generate unique filename
# ============================================

generate_unique_filename() {
  local base_dir="$1"
  local prefix="$2"
  local ext="$3"
  local timestamp
  local random_suffix
  local filename

  # Timestamp accurate to seconds
  timestamp=$(date +%Y%m%d-%H%M%S)

  # Add a 4-digit random suffix to avoid same-second collisions
  random_suffix=$(printf '%04d' "$(get_random)")

  filename="${base_dir}/${prefix}-${timestamp}-${random_suffix}.${ext}"

  # Extreme case: if the file still exists, draw a new random suffix and try again
  while [ -f "$filename" ]; do
    random_suffix=$(printf '%04d' "$(get_random)")
    filename="${base_dir}/${prefix}-${timestamp}-${random_suffix}.${ext}"
  done

  echo "$filename"
}

# ============================================
# Main flow
# ============================================

# Ensure output directory exists
mkdir -p "$LISTENHUB_OUTPUT_DIR"

# Build request
PAYLOAD=$(build_json_payload "$PROMPT" "$SIZE" "$RATIO" "$REFERENCE_IMAGES")

# Call API (with retry)
BODY=$(call_api_with_retry "$PAYLOAD") || exit 1

# Extract image data (tolerate a non-zero return so the error below is still reported under `set -e`)
BASE64_DATA=$(extract_image_data "$BODY") || BASE64_DATA=""
if [ -z "$BASE64_DATA" ]; then
  echo "Error: Cannot extract image data" >&2
  echo "  Response preview: $(echo "$BODY" | head -c 200)" >&2
  exit 1
fi

# Generate unique filename
OUTPUT_FILE=$(generate_unique_filename "$LISTENHUB_OUTPUT_DIR" "listenhub" "jpg")
TEMP_OUTPUT_FILE="$OUTPUT_FILE"  # Mark as temp file for trap cleanup

# Decode and save (cross-platform)
if ! base64_decode "$BASE64_DATA" "$OUTPUT_FILE"; then
  echo "Error: base64 decode failed" >&2
  echo "  Try installing openssl or check base64 command" >&2
  exit 1
fi

# Verify file
if [ ! -s "$OUTPUT_FILE" ]; then
  echo "Error: Generated file is empty" >&2
  rm -f "$OUTPUT_FILE"
  exit 1
fi

# Success: clear the temp-file marker so cleanup does not remove the output
TEMP_OUTPUT_FILE=""

# Output result
FILE_SIZE=$(ls -lh "$OUTPUT_FILE" | awk '{print $5}')
echo "✓ $OUTPUT_FILE ($FILE_SIZE)" >&2
echo "$OUTPUT_FILE"

```
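
The retry loop in `call_api_with_retry` above can be reduced to a generic pattern. The following is a minimal sketch, not the script's own code: `retry_command` and `flaky` are hypothetical names introduced for illustration, and the sketch uses a fixed delay rather than the per-status delays and growing timeout shown above.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: retry_command and flaky are hypothetical names,
# not functions from the ListenHub scripts.

# Run a command up to max_attempts times, sleeping `delay` seconds between
# failed attempts. Returns 0 on the first success, 1 if every attempt fails.
retry_command() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Skip the sleep after the final attempt; nothing is left to wait for.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Demo command: fails twice, then succeeds on the third call.
calls=0
flaky() {
  calls=$((calls + 1))
  [ "$calls" -ge 3 ]
}

retry_command 5 0 flaky && echo "succeeded after $calls calls"
```

Passing the command as trailing arguments keeps the loop independent of what is being retried, so the same helper could wrap a `curl` call or any other function that signals failure through its exit status.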

### scripts/generate-video.sh

```bash
#!/usr/bin/env bash
# Generate video file from explainer episode
# Usage: ./generate-video.sh --episode <episode-id>

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/lib.sh"

EPISODE_ID=""

usage() {
  echo "Usage: $0 --episode <episode-id>" >&2
}

while [ $# -gt 0 ]; do
  case "$1" in
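    --episode)
      # `shift 2` fails when the value is missing, which aborts silently under `set -e`
      if [ $# -lt 2 ]; then
        echo "Error: --episode requires a value" >&2
        usage
        exit 1
      fi
      EPISODE_ID="$2"
      shift 2
      ;;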
    --help)
      usage
      exit 0
      ;;
    *)
      echo "Error: Unknown argument $1" >&2
      usage
      exit 1
      ;;
  esac
done

if [ -z "$EPISODE_ID" ]; then
  echo "Error: --episode is required" >&2
  usage
  exit 1
fi

validate_id "$EPISODE_ID" "episode-id"

api_post "storybook/episodes/${EPISODE_ID}/video" "{}"

```

### scripts/get-speakers.sh

```bash
#!/usr/bin/env bash
# Get available speakers list via ListenHub API
# Usage: ./get-speakers.sh [--language zh|en]

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/lib.sh"

LANGUAGE="zh"

while [ $# -gt 0 ]; do
  case "$1" in
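    --language|--lang)
      # `shift 2` fails when the value is missing, which aborts silently under `set -e`
      if [ $# -lt 2 ]; then
        echo "Error: $1 requires a value (zh | en)" >&2
        exit 1
      fi
      LANGUAGE="$2"
      shift 2
      ;;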
    --help)
      echo "Usage: ./get-speakers.sh [--language zh|en]" >&2
      exit 0
      ;;
    *)
      echo "Error: Unknown argument $1" >&2
      echo "Usage: ./get-speakers.sh [--language zh|en]" >&2
      exit 1
      ;;
  esac
done

if [[ ! "$LANGUAGE" =~ ^(zh|en)$ ]]; then
  echo "Error: Invalid language '$LANGUAGE'. Must be: zh | en" >&2
  exit 1
fi

api_get "speakers/list?language=${LANGUAGE}"

```
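
Both `generate-video.sh` and `get-speakers.sh` parse a flag that takes a value. The sketch below shows one way to guard that pattern against a missing value. It is an illustration under stated assumptions: `parse_language` is a hypothetical function, not part of the scripts, and it returns a status instead of exiting so it can be exercised in isolation.

```bash
#!/usr/bin/env bash
# Illustrative sketch: parse_language is a hypothetical function, not part
# of the ListenHub scripts. It parses "--language <value>" and reports a
# missing value instead of letting `shift 2` fail.
parse_language() {
  LANGUAGE="zh"   # default, matching get-speakers.sh
  while [ $# -gt 0 ]; do
    case "$1" in
      --language|--lang)
        # `shift 2` returns non-zero when only one argument is left, which
        # aborts silently under `set -e`; check the count first.
        if [ $# -lt 2 ]; then
          echo "Error: $1 requires a value" >&2
          return 2
        fi
        LANGUAGE="$2"
        shift 2
        ;;
      *)
        echo "Error: Unknown argument $1" >&2
        return 2
        ;;
    esac
  done
}

parse_language --language en && echo "language=$LANGUAGE"
parse_language --language || echo "rejected with status $?"
```

The first call prints `language=en`; the second prints an error to stderr and then `rejected with status 2`.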

### scripts/lib.sh

```bash
#!/usr/bin/env bash
# Shared environment checks for ListenHub scripts
# Source this at the beginning of each script

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SKILL_DIR="$(dirname "$SCRIPT_DIR")"
VERSION_FILE="${SKILL_DIR}/VERSION"
REMOTE_VERSION_URL="https://raw.githubusercontent.com/marswaveai/skills/main/skills/listenhub/VERSION"

# === Version Check (non-blocking) ===

check_version() {
  # Skip if no local VERSION file
  [ -f "$VERSION_FILE" ] || return 0

  local local_ver remote_ver http_code response
  # Treat an unreadable VERSION file as "skip the check" so the check stays non-blocking
  local_ver=$(cat "$VERSION_FILE" 2>/dev/null | tr -d '[:space:]') || return 0

  # Validate local version before integer comparisons
  [[ "$local_ver" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]] || return 0

  # Fetch remote version with 5s timeout, check HTTP status
  response=$(curl -sS --max-time 5 -w "\n%{http_code}" "$REMOTE_VERSION_URL" 2>/dev/null) || return 0
  http_code=$(echo "$response" | tail -1)
  remote_ver=$(echo "$response" | head -1 | tr -d '[:space:]')

  # Only compare if HTTP 200 and valid semver-like format
  [[ "$http_code" == "200" && "$remote_ver" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]] || return 0

  # Same version, skip
  [ "$local_ver" != "$remote_ver" ] || return 0

  # Parse semver: major.minor.patch
  local local_major local_minor local_patch
  local remote_major remote_minor remote_patch

  IFS='.' read -r local_major local_minor local_patch <<< "$local_ver"
  IFS='.' read -r remote_major remote_minor remote_patch <<< "$remote_ver"

  # Notify if remote version is newer (major/minor bump or patch bump)
  if [ "$remote_major" -gt "$local_major" ] || \
     { [ "$remote_major" -eq "$local_major" ] && [ "$remote_minor" -gt "$local_minor" ]; } || \
     { [ "$remote_major" -eq "$local_major" ] && [ "$remote_minor" -eq "$local_minor" ] && [ "$remote_patch" -gt "$local_patch" ]; }; then
    echo "┌─────────────────────────────────────────────────────┐" >&2
    echo "│  Update available: $local_ver → $remote_ver" >&2
    echo "│  Run: npx skills add marswaveai/skills             │" >&2
    echo "└─────────────────────────────────────────────────────┘" >&2
  fi
}

# Run version check (notify-only, no auto-update)
# Set LISTENHUB_SKIP_VERSION_CHECK=1 to disable
if [ "${LISTENHUB_SKIP_VERSION_CHECK:-}" != "1" ]; then
  check_version
fi

# Load API key from shell config (try multiple sources)
# Extract value safely without eval to prevent code injection
if [ -n "${LISTENHUB_API_KEY:-}" ]; then
  : # Already set, skip loading
else
  _extract_api_key() {
    local file="$1"
    [ -f "$file" ] || return 1
    # Match: export LISTENHUB_API_KEY="value" or export LISTENHUB_API_KEY='value' or unquoted
    local line
    line=$(grep -m1 '^[[:space:]]*export[[:space:]]\{1,\}LISTENHUB_API_KEY=' "$file" 2>/dev/null) || return 1
    # Strip everything up to and including the first =
    local value="${line#*=}"
    # Extract based on quoting style
    case "$value" in
      \"*)
        # Double-quoted: extract between first and last double quote
        value="${value#\"}"
        value="${value%\"*}"
        ;;
      \'*)
        # Single-quoted: extract between first and last single quote
        value="${value#\'}"
        value="${value%\'*}"
        ;;
      *)
        # Unquoted: strip trailing comments and whitespace
        value="${value%%#*}"
        value="${value%"${value##*[![:space:]]}"}"
        ;;
    esac
    [ -n "$value" ] && printf '%s' "$value"
  }
  _key=$(_extract_api_key ~/.zshrc) || _key=$(_extract_api_key ~/.bashrc) || _key=""
  if [ -n "$_key" ]; then
    export LISTENHUB_API_KEY="$_key"
  fi
  unset -f _extract_api_key
  unset _key
fi

# === Environment Checks ===

check_curl() {
  if ! command -v curl &>/dev/null; then
    echo "Error: curl not found (should be pre-installed on most systems)" >&2
    exit 127
  fi
}

check_jq() {
  if ! command -v jq &>/dev/null; then
    cat >&2 <<'EOF'
Error: jq not found

Install:
  macOS (Homebrew): brew install jq
  Ubuntu/Debian: apt-get install jq
  RHEL/CentOS: yum install jq
  Fedora: dnf install jq
  Arch: pacman -S jq
EOF
    exit 127
  fi
}

check_api_key() {
  if [ -z "${LISTENHUB_API_KEY:-}" ]; then
    cat >&2 <<'EOF'
Error: LISTENHUB_API_KEY not set

Setup:
  1. Get API key from https://listenhub.ai/settings/api-keys
  2. Add to ~/.zshrc or ~/.bashrc:
     export LISTENHUB_API_KEY="lh_sk_..."
  3. Run: source ~/.zshrc
EOF
    exit 1
  fi
}

# Run checks
check_curl
check_api_key

# === Input Validation ===

# Validate that an ID contains only safe characters (alphanumeric, hyphen, underscore)
# Usage: validate_id "value" "field_name"
validate_id() {
  local value="$1"
  local name="${2:-id}"
  if [[ ! "$value" =~ ^[a-zA-Z0-9_-]+$ ]]; then
    echo "Error: Invalid $name (only alphanumeric, hyphen, underscore allowed): $value" >&2
    exit 1
  fi
}

# === API Helpers ===

API_BASE="https://api.marswave.ai/openapi/v1"
AGENT_SKILLS_CLIENT_ID="PJBkELS1o_q9nJ~NzF2_Fmr21TNX&~eoJR49FFdFhD3U"

# Trim leading and trailing whitespace
trim_ws() {
  local input="$1"
  input="${input#"${input%%[![:space:]]*}"}"
  input="${input%"${input##*[![:space:]]}"}"
  printf '%s' "$input"
}

# Make authenticated POST request with JSON body
# Usage: api_post "endpoint" 'json_body'
api_post() {
  local endpoint="$1"
  local body="$2"

  curl -sS -X POST "${API_BASE}/${endpoint}" \
    -H "Authorization: Bearer ${LISTENHUB_API_KEY}" \
    -H "Content-Type: application/json" \
    -H "x-marswave-client-id: ${AGENT_SKILLS_CLIENT_ID}" \
    -d "$body"
}

# Make authenticated GET request
# Usage: api_get "endpoint"
api_get() {
  local endpoint="$1"

  curl -sS -X GET "${API_BASE}/${endpoint}" \
    -H "Authorization: Bearer ${LISTENHUB_API_KEY}" \
    -H "x-marswave-client-id: ${AGENT_SKILLS_CLIENT_ID}"
}

```
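
The version comparison inside `check_version` can also be read as a field-by-field test where the first differing field decides. A minimal sketch of that idea follows; `version_gt` is a hypothetical helper that does not exist in `lib.sh`, and it assumes both inputs are already validated as `MAJOR.MINOR.PATCH`.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of lib.sh: succeed when version $1 is
# strictly newer than version $2 (both in MAJOR.MINOR.PATCH form).
version_gt() {
  local a_major a_minor a_patch b_major b_minor b_patch
  IFS='.' read -r a_major a_minor a_patch <<< "$1"
  IFS='.' read -r b_major b_minor b_patch <<< "$2"

  # Compare field by field; the first differing field decides.
  if [ "$a_major" -ne "$b_major" ]; then
    [ "$a_major" -gt "$b_major" ]
    return
  fi
  if [ "$a_minor" -ne "$b_minor" ]; then
    [ "$a_minor" -gt "$b_minor" ]
    return
  fi
  [ "$a_patch" -gt "$b_patch" ]
}

if version_gt "1.4.0" "1.3.9"; then
  echo "update available"
fi
```

Comparing numerically matters here: a plain string comparison would rank `1.2.9` above `1.2.10`.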