Back to skills
SkillHub ClubShip Full StackFull StackBackend

acestep

Use ACE-Step API to generate music, edit songs, and remix music. Supports text-to-music, lyrics generation, audio continuation, and audio repainting. Use this skill when users mention generating music, creating songs, music production, remix, or audio continuation.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars
7,984
Hot score
99
Updated
March 20, 2026
Overall rating
C4.0
Composite score
4.0
Best-practice grade
C62.8

Install command

npx @skill-hub/cli install ace-step-ace-step-1-5-acestep

Repository

ace-step/ACE-Step-1.5

Skill path: .claude/skills/acestep

Use ACE-Step API to generate music, edit songs, and remix music. Supports text-to-music, lyrics generation, audio continuation, and audio repainting. Use this skill when users mention generating music, creating songs, music production, remix, or audio continuation.

Open repository

Best for

Primary workflow: Ship Full Stack.

Technical facets: Full Stack, Backend.

Target audience: everyone.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: ace-step.

This is still a mirrored public skill entry. Review the repository before installing into production workflows.

What it helps with

  • Install acestep into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/ace-step/ACE-Step-1.5 before adding acestep to shared team environments
  • Use acestep for development workflows

Works across

Claude CodeCodex CLIGemini CLIOpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: acestep
description: Use ACE-Step API to generate music, edit songs, and remix music. Supports text-to-music, lyrics generation, audio continuation, and audio repainting. Use this skill when users mention generating music, creating songs, music production, remix, or audio continuation.
allowed-tools: Read, Write, Bash, Skill
---

# ACE-Step Music Generation Skill

Use ACE-Step V1.5 API for music generation. **Always use `scripts/acestep.sh` script** — do NOT call API endpoints directly.

## Quick Start

```bash
# 1. cd to this skill's directory
cd {project_root}/{.claude or .codex}/skills/acestep/

# 2. Check API service health
./scripts/acestep.sh health

# 3. Generate with lyrics (recommended)
./scripts/acestep.sh generate -c "pop, female vocal, piano" -l "[Verse] Your lyrics here..." --duration 120 --language zh

# 4. Output saved to: {project_root}/acestep_output/
```

## Workflow

For user requests requiring vocals:
1. Use the **acestep-songwriting** skill for lyrics writing, caption creation, duration/BPM/key selection
2. Write complete, well-structured lyrics yourself based on the songwriting guide
3. Generate using Caption mode with `-c` and `-l` parameters

Only use Simple/Random mode (`-d` or `random`) for quick inspiration or instrumental exploration.

If the user needs a simple music video, use the **acestep-simplemv** skill to render one with waveform visualization and synced lyrics.

**MV Production Requirements**: Making a simple MV requires three additional skills to be installed:
- **acestep-songwriting** — for writing lyrics and planning song structure
- **acestep-lyrics-transcription** — for transcribing audio to timestamped lyrics (LRC)
- **acestep-simplemv** — for rendering the final music video
- **acestep-thumbnail** (optional) — for generating cover art / MV background images via Gemini API

**MV Background Image**: When the user requests MV production, ask whether they want a background image for the video:
1. **Generate via Gemini** — use the **acestep-thumbnail** skill (requires Gemini API key configuration)
2. **Provide an existing image** — user supplies a local image path
3. **Skip** — use the default animated gradient background (no image needed)

Use `AskUserQuestion` to let the user choose before proceeding with MV rendering.

**Parallel Processing**: Lyrics transcription and thumbnail generation are independent tasks. When the user chooses to generate a background image, run **acestep-lyrics-transcription** and **acestep-thumbnail** in parallel (e.g. via two concurrent Agent calls) to save time, then use both outputs for the final MV render.

## Script Commands

**CRITICAL - Complete Lyrics Input**: When providing lyrics via the `-l` parameter, you MUST pass ALL lyrics content WITHOUT any omission:
- If user provides lyrics, pass the ENTIRE text they give you
- If you generate lyrics yourself, pass the COMPLETE lyrics you created
- NEVER truncate, shorten, or pass only partial lyrics
- Missing lyrics will result in incomplete or incoherent songs

**Music Parameters**: Use the **acestep-songwriting** skill for guidance on duration, BPM, key scale, and time signature.

```bash
# need to cd to this skill's directory first
cd {project_root}/{.claude or .codex}/skills/acestep/

# Caption mode - RECOMMENDED: Write lyrics first, then generate
./scripts/acestep.sh generate -c "Electronic pop, energetic synths" -l "[Verse] Your complete lyrics
[Chorus] Full chorus here..." --duration 120 --bpm 128

# Instrumental only
./scripts/acestep.sh generate "Jazz with saxophone"

# Quick exploration (Simple/Random mode)
./scripts/acestep.sh generate -d "A cheerful song about spring"
./scripts/acestep.sh random

# Cover / Repainting from source audio
./scripts/acestep.sh cover song.mp3 -c "Rock cover style" -l "[Verse] Lyrics..." --duration 120 --bpm 128
./scripts/acestep.sh generate --src-audio song.mp3 --task-type repaint -c "Pop" --repaint-start 30 --repaint-end 60

# Music attribute options
./scripts/acestep.sh generate "Rock" --duration 60 --bpm 120 --key-scale "C major" --time-sig "4/4"
./scripts/acestep.sh generate "Rock" --duration 60 --batch 2
./scripts/acestep.sh generate "EDM" --no-thinking    # Faster

# Other commands
./scripts/acestep.sh status <job_id>
./scripts/acestep.sh health
./scripts/acestep.sh models
```

### Cover / Audio Repainting

The `cover` command generates music based on a source audio file. The audio is base64-encoded and sent to the API.

```bash
# Cover: regenerate with new style/lyrics, preserving melody structure
./scripts/acestep.sh cover input.mp3 -c "Jazz cover" -l "[Verse] New lyrics..." --duration 120

# Repainting: modify a specific region of the audio
./scripts/acestep.sh generate --src-audio input.mp3 --task-type repaint -c "Pop ballad" --repaint-start 30 --repaint-end 90

# Cover options
#   --src-audio         Source audio file path
#   --task-type         cover (default with --src-audio), repaint, text2music
#   --cover-strength    0.0-1.0 (default: 1.0, higher = closer to source)
#   --repaint-start     Repainting start position (seconds)
#   --repaint-end       Repainting end position (seconds)
#   --key-scale         Musical key (e.g. "E minor")
#   --time-signature    Time signature (e.g. "4/4")
```

**Note**: For cloud API usage, large audio files may be rejected by Cloudflare. Compress audio before uploading if needed (e.g. using ffmpeg: `ffmpeg -i input.mp3 -b:a 64k -ar 24000 -ac 1 compressed.mp3`).

## Output Files

After generation, the script automatically saves results to the `acestep_output` folder in the project root (same level as `.claude`):

```
project_root/
├── .claude/
│   └── skills/acestep/...
├── acestep_output/          # Output directory
│   ├── <job_id>.json         # Complete task result (JSON)
│   ├── <job_id>_1.mp3        # First audio file
│   ├── <job_id>_2.mp3        # Second audio file (if batch_size > 1)
│   └── ...
└── ...
```

### JSON Result Structure

**Important**: When LM enhancement is enabled (`use_format=true`), the final synthesized content may differ from your input. Check the JSON file for actual values:

| Field | Description |
|-------|-------------|
| `prompt` | **Actual caption** used for synthesis (may be LM-enhanced) |
| `lyrics` | **Actual lyrics** used for synthesis (may be LM-enhanced) |
| `metas.prompt` | Original input caption |
| `metas.lyrics` | Original input lyrics |
| `metas.bpm` | BPM used |
| `metas.keyscale` | Key scale used |
| `metas.duration` | Duration in seconds |
| `generation_info` | Detailed timing and model info |
| `seed_value` | Seeds used (for reproducibility) |
| `lm_model` | LM model name |
| `dit_model` | DiT model name |

To get the actual synthesized lyrics, parse the JSON and read the top-level `lyrics` field, not `metas.lyrics`.

## Configuration

**Important**: Configuration follows this priority (high to low):

1. **Command line arguments** > **config.json defaults**
2. User-specified parameters **temporarily override** defaults but **do not modify** config.json
3. Only `config --set` command **permanently modifies** config.json

### Default Config File (`scripts/config.json`)

```json
{
  "api_url": "http://127.0.0.1:8001",
  "api_key": "",
  "api_mode": "completion",
  "generation": {
    "thinking": true,
    "use_format": false,
    "use_cot_caption": true,
    "use_cot_language": false,
    "batch_size": 1,
    "audio_format": "mp3",
    "vocal_language": "en"
  }
}
```

| Option | Default | Description |
|--------|---------|-------------|
| `api_url` | `http://127.0.0.1:8001` | API server address |
| `api_key` | `""` | API authentication key (optional) |
| `api_mode` | `completion` | API mode: `completion` (OpenRouter, default) or `native` (polling) |
| `generation.thinking` | `true` | Enable 5Hz LM (higher quality, slower) |
| `generation.audio_format` | `mp3` | Output format (mp3/wav/flac) |
| `generation.vocal_language` | `en` | Vocal language |

## Prerequisites - ACE-Step API Service

**IMPORTANT**: This skill requires the ACE-Step API server to be running.

### Required Dependencies

The `scripts/acestep.sh` script requires: **curl** and **jq**.

```bash
# Check dependencies
curl --version
jq --version
```

If jq is not installed, the script will attempt to install it automatically. If automatic installation fails:
- **Windows**: `choco install jq` or download from https://jqlang.github.io/jq/download/
- **macOS**: `brew install jq`
- **Linux**: `sudo apt-get install jq` (Debian/Ubuntu) or `sudo dnf install jq` (Fedora)

### Before First Use

**You MUST check the API key and URL status before proceeding.** Run:

```bash
cd "{project_root}/{.claude or .codex}/skills/acestep/" && bash ./scripts/acestep.sh config --check-key
cd "{project_root}/{.claude or .codex}/skills/acestep/" && bash ./scripts/acestep.sh config --get api_url
```

#### Case 1: Using Official Cloud API (`https://api.acemusic.ai`) without API key

If `api_url` is `https://api.acemusic.ai` and `api_key` is `empty`, you MUST stop and guide the user to configure their key:

1. Tell the user: "You're using the ACE-Step official cloud API, but no API key is configured. An API key is required to use this service."
2. Explain how to get a key: API keys are currently available through [acemusic.ai](https://acemusic.ai/api-key) for free.
3. Use `AskUserQuestion` to ask the user to provide their API key.
4. Once provided, configure it:
   ```bash
   cd "{project_root}/{.claude or .codex}/skills/acestep/" && bash ./scripts/acestep.sh config --set api_key <KEY>
   ```
5. Additionally, inform the user: "If you also want to render music videos (MV), it's recommended to configure a lyrics transcription API key as well (OpenAI Whisper or ElevenLabs Scribe), so that lyrics can be automatically transcribed with accurate timestamps. You can configure it later via the `acestep-lyrics-transcription` skill."

#### Case 2: API key is configured

Verify the API endpoint: `./scripts/acestep.sh health` and proceed with music generation.

#### Case 3: Using local/custom API without key

Local services (`http://127.0.0.1:*`) typically don't require a key. Verify with `./scripts/acestep.sh health` and proceed.

If health check fails:
- Ask: "Do you have ACE-Step installed?"
- **If installed but not running**: Use the acestep-docs skill to help them start the service
- **If not installed**: Use acestep-docs skill to guide through installation

### Service Configuration

**Official Cloud API:** ACE-Step provides an official API endpoint at `https://api.acemusic.ai`. To use it:
```bash
./scripts/acestep.sh config --set api_url "https://api.acemusic.ai"
./scripts/acestep.sh config --set api_key "your-key"
./scripts/acestep.sh config --set api_mode completion
```
API keys are currently available through [acemusic.ai](https://acemusic.ai/api-key) for free. 

**Local Service (Default):** No configuration needed — connects to `http://127.0.0.1:8001`.

**Custom Remote Service:** Update `scripts/config.json` or use:
```bash
./scripts/acestep.sh config --set api_url "http://remote-server:8001"
./scripts/acestep.sh config --set api_key "your-key"
```

**API Key Handling**: When checking whether an API key is configured, use `config --check-key` which only reports `configured` or `empty` without printing the actual key. **NEVER use `config --get api_key`** or read `config.json` directly — these would expose the user's API key. The `config --list` command is safe — it automatically masks API keys as `***` in output.

### API Mode

The skill supports two API modes. Switch via `api_mode` in `scripts/config.json`:

| Mode | Endpoint | Description |
|------|----------|-------------|
| `completion` (default) | `/v1/chat/completions` | OpenRouter-compatible, sync request, audio returned as base64 |
| `native` | `/release_task` + `/query_result` | Async polling mode, supports all parameters |

**Switch mode:**
```bash
./scripts/acestep.sh config --set api_mode completion
./scripts/acestep.sh config --set api_mode native
```

**Completion mode notes:**
- No polling needed — single request returns result directly
- Audio is base64-encoded inline in the response (auto-decoded and saved)
- `inference_steps`, `infer_method`, `shift` are not configurable (server defaults)
- `--no-wait` and `status` commands are not applicable in completion mode
- Requires `model` field — auto-detected from `/v1/models` if not specified

### Using acestep-docs Skill for Setup Help

**IMPORTANT**: For installation and startup, always use the acestep-docs skill to get complete and accurate guidance.

**DO NOT provide simplified startup commands** - each user's environment may be different. Always guide them to use acestep-docs for proper setup.

---

For API debugging, see [API Reference](./api-reference.md).


---

## Referenced Files

> The following files are referenced in this skill and included for context.

### api-reference.md

```markdown
# ACE-Step API Reference

> For debugging and advanced usage only. Normal operations should use `scripts/acestep.sh`.

## Native Mode Endpoints

All responses wrapped: `{"data": <payload>, "code": 200, "error": null, "timestamp": ...}`

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check |
| `/release_task` | POST | Create generation task |
| `/query_result` | POST | Query task status, body: `{"task_id_list": ["id"]}` |
| `/v1/models` | GET | List available models |
| `/v1/audio?path={path}` | GET | Download audio file |

## Completion Mode Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/v1/chat/completions` | POST | Generate music (OpenRouter-compatible) |
| `/v1/models` | GET | List available models (OpenRouter format) |

## Query Result Response

```json
{
  "data": [{
    "task_id": "xxx",
    "status": 1,
    "result": "[{\"file\":\"/v1/audio?path=...\",\"metas\":{\"bpm\":120,\"duration\":60,\"keyscale\":\"C Major\"}}]"
  }]
}
```

Status codes: `0` = processing, `1` = success, `2` = failed

## Completion Mode Request (`/v1/chat/completions`)

**Caption mode** — prompt and lyrics wrapped in XML tags inside message content:
```json
{
  "model": "acestep/ACE-Step-v1.5",
  "messages": [{"role": "user", "content": "<prompt>Jazz with saxophone</prompt><lyrics>[Verse] Hello...</lyrics>"}],
  "stream": false,
  "thinking": true,
  "use_format": false,
  "audio_config": {"duration": 90, "bpm": 110, "format": "mp3", "vocal_language": "en"}
}
```

**Simple mode** — plain text message, set `sample_mode: true`:
```json
{
  "model": "acestep/ACE-Step-v1.5",
  "messages": [{"role": "user", "content": "A cheerful pop song about spring"}],
  "stream": false,
  "sample_mode": true,
  "thinking": true
}
```

## Completion Mode Response

```json
{
  "id": "chatcmpl-abc123",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "## Metadata\n**Caption:** ...\n**BPM:** 128\n\n## Lyrics\n...",
      "audio": [{"type": "audio_url", "audio_url": {"url": "data:audio/mpeg;base64,..."}}]
    },
    "finish_reason": "stop"
  }]
}
```

Audio is base64-encoded inline — the script auto-decodes and saves to `acestep_output/`.

## Request Parameters (`/release_task`)

Parameters can be placed in `param_obj` object.

### Generation Modes

| Mode | Usage | When to Use |
|------|-------|-------------|
| **Caption** (Recommended) | `generate -c "style" -l "lyrics"` | For vocal songs - write lyrics yourself first |
| **Simple** | `generate -d "description"` | Quick exploration, LM generates everything |
| **Random** | `random` | Random generation for inspiration |

### Core Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `prompt` | string | "" | Music style description (Caption mode) |
| `lyrics` | string | "" | **Full lyrics content** - Pass ALL lyrics without omission. Use `[inst]` for instrumental. Partial/truncated lyrics = incomplete songs |
| `sample_mode` | bool | false | Enable Simple/Random mode |
| `sample_query` | string | "" | Description for Simple mode |
| `thinking` | bool | false | Enable 5Hz LM for audio code generation |
| `use_format` | bool | false | Use LM to enhance caption/lyrics |
| `model` | string | - | DiT model name |
| `batch_size` | int | 1 | Number of audio files to generate |

### Music Attributes

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `audio_duration` | float | - | Duration in seconds |
| `bpm` | int | - | Tempo (beats per minute) |
| `key_scale` | string | "" | Key (e.g. "C Major") |
| `time_signature` | string | "" | Time signature (e.g. "4/4") |
| `vocal_language` | string | "en" | Language code (en, zh, ja, etc.) |
| `audio_format` | string | "mp3" | Output format (mp3/wav/flac) |

### Generation Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `inference_steps` | int | 8 | Diffusion steps |
| `guidance_scale` | float | 7.0 | CFG scale |
| `seed` | int | -1 | Random seed (-1 for random) |
| `infer_method` | string | "ode" | Diffusion method (ode/sde) |

### Audio Task Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `task_type` | string | "text2music" | text2music / continuation / repainting |
| `src_audio_path` | string | - | Source audio for continuation |
| `repainting_start` | float | 0.0 | Repainting start position (seconds) |
| `repainting_end` | float | - | Repainting end position (seconds) |

### Example Request (Simple Mode)

```json
{
  "sample_mode": true,
  "sample_query": "A cheerful pop song about spring",
  "thinking": true,
  "param_obj": {
    "duration": 60,
    "bpm": 120,
    "language": "en"
  },
  "batch_size": 2
}
```

```

### scripts/acestep.sh

```bash
#!/bin/bash
#
# ACE-Step Music Generation CLI (Bash + Curl + jq)
#
# Requirements: curl, jq
#
# Usage:
#   ./acestep.sh generate "Music description" [options]
#   ./acestep.sh random [--no-thinking]
#   ./acestep.sh status <job_id>
#   ./acestep.sh models
#   ./acestep.sh health
#   ./acestep.sh config [--get|--set|--reset]
#
# Output:
#   - Results saved to output/<job_id>.json
#   - Audio files downloaded to output/<job_id>_1.mp3, output/<job_id>_2.mp3, ...

set -e

# Ensure UTF-8 encoding for non-ASCII characters (Japanese, Chinese, etc.)
export LANG="${LANG:-en_US.UTF-8}"
export LC_ALL="${LC_ALL:-en_US.UTF-8}"

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
CONFIG_FILE="${SCRIPT_DIR}/config.json"
# Output dir at same level as .claude (go up 4 levels from scripts/)
OUTPUT_DIR="$(cd "${SCRIPT_DIR}/../../../.." && pwd)/acestep_output"
DEFAULT_API_URL="http://127.0.0.1:8001"
STAR_MARKER_FILE="${SCRIPT_DIR}/.first_gen_done"

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
CYAN='\033[0;36m'
NC='\033[0m'
BOLD='\033[1m'

# Show GitHub star prompt on first successful generation
show_star_prompt() {
    if [ ! -f "$STAR_MARKER_FILE" ]; then
        touch "$STAR_MARKER_FILE"
        echo ""
        echo -e "${YELLOW}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
        echo -e "${BOLD}  ACE-Step is free and open-source.${NC}"
        echo -e "  If you enjoyed this, a ${YELLOW}★ Star${NC} on GitHub means a lot to us!"
        echo -e "  ${CYAN}→ https://github.com/ace-step/ACE-Step-1.5${NC}"
        echo -e "${YELLOW}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
    fi
}

# Check dependencies
check_deps() {
    if ! command -v curl &> /dev/null; then
        echo -e "${RED}Error: curl is required but not installed.${NC}"
        exit 1
    fi
    if ! command -v jq &> /dev/null; then
        echo -e "${RED}Error: jq is required but not installed.${NC}"
        echo "Install: apt install jq / brew install jq / choco install jq"
        exit 1
    fi
}

# JSON value extractor using jq
# Usage: json_get "$json" ".key" or json_get "$json" ".nested.key"
json_get() {
    local json="$1"
    local path="$2"
    echo "$json" | jq -r "$path // empty" 2>/dev/null
}

# Extract array values using jq
json_get_array() {
    local json="$1"
    local path="$2"
    echo "$json" | jq -r "$path[]? // empty" 2>/dev/null
}

# Ensure output directory exists
ensure_output_dir() {
    mkdir -p "$OUTPUT_DIR"
}

# Default config
DEFAULT_CONFIG='{
  "api_url": "http://127.0.0.1:8001",
  "api_key": "",
  "api_mode": "native",
  "generation": {
    "thinking": true,
    "use_format": true,
    "use_cot_caption": true,
    "use_cot_language": true,
    "audio_format": "mp3",
    "vocal_language": "en"
  }
}'

# Ensure config file exists
ensure_config() {
    if [ ! -f "$CONFIG_FILE" ]; then
        local example="${SCRIPT_DIR}/config.example.json"
        if [ -f "$example" ]; then
            cp "$example" "$CONFIG_FILE"
            echo -e "${YELLOW}Config file created from config.example.json. Please configure your settings:${NC}"
            echo -e "  ${CYAN}./scripts/acestep.sh config --set api_url <url>${NC}"
            echo -e "  ${CYAN}./scripts/acestep.sh config --set api_key <key>${NC}"
        else
            echo "$DEFAULT_CONFIG" > "$CONFIG_FILE"
        fi
    fi
}

# Get config value using jq
get_config() {
    local key="$1"
    ensure_config
    # Convert dot notation to jq path: "generation.thinking" -> ".generation.thinking"
    local jq_path=".${key}"
    local value
    # Don't use // operator as it treats boolean false as falsy
    value=$(jq -r "$jq_path" "$CONFIG_FILE" 2>/dev/null)
    # Remove any trailing whitespace/newlines (Windows compatibility)
    # Return empty string if value is "null" (key doesn't exist)
    if [ "$value" = "null" ]; then
        echo ""
    else
        echo "$value" | tr -d '\r\n'
    fi
}

# Normalize boolean value for jq --argjson
normalize_bool() {
    local val="$1"
    local default="${2:-false}"
    case "$val" in
        true|True|TRUE|1) echo "true" ;;
        false|False|FALSE|0) echo "false" ;;
        *) echo "$default" ;;
    esac
}

# Set config value using jq
set_config() {
    local key="$1"
    local value="$2"
    ensure_config

    local tmp_file="${CONFIG_FILE}.tmp"
    local jq_path=".${key}"

    # Determine value type and set accordingly
    if [ "$value" = "true" ] || [ "$value" = "false" ]; then
        jq "$jq_path = $value" "$CONFIG_FILE" > "$tmp_file"
    elif [[ "$value" =~ ^-?[0-9]+$ ]] || [[ "$value" =~ ^-?[0-9]+\.[0-9]+$ ]]; then
        jq "$jq_path = $value" "$CONFIG_FILE" > "$tmp_file"
    else
        jq "$jq_path = \"$value\"" "$CONFIG_FILE" > "$tmp_file"
    fi

    mv "$tmp_file" "$CONFIG_FILE"
    echo "Set $key = $value"
}

# Load API URL
load_api_url() {
    local url=$(get_config "api_url")
    echo "${url:-$DEFAULT_API_URL}"
}

# Load API Key
load_api_key() {
    local key=$(get_config "api_key")
    echo "${key:-}"
}

# Check API health
check_health() {
    local url="$1"
    local status
    status=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 "${url}/health" 2>/dev/null) || true
    [ "$status" = "200" ]
}

# Build auth header
build_auth_header() {
    local api_key=$(load_api_key)
    if [ -n "$api_key" ]; then
        echo "-H \"Authorization: Bearer ${api_key}\""
    fi
}

# Prompt for URL
prompt_for_url() {
    echo ""
    echo -e "${YELLOW}API server is not responding.${NC}"
    echo "Please enter the API URL (or press Enter for default):"
    read -p "API URL [$DEFAULT_API_URL]: " user_input
    echo "${user_input:-$DEFAULT_API_URL}"
}

# Ensure API connection
ensure_connection() {
    ensure_config
    local api_url=$(load_api_url)

    if check_health "$api_url"; then
        echo "$api_url"
        return 0
    fi

    echo -e "${YELLOW}Cannot connect to: $api_url${NC}" >&2
    local new_url=$(prompt_for_url)

    if check_health "$new_url"; then
        set_config "api_url" "$new_url" > /dev/null
        echo -e "${GREEN}Saved API URL: $new_url${NC}" >&2
        echo "$new_url"
        return 0
    fi

    echo -e "${RED}Error: Cannot connect to $new_url${NC}" >&2
    exit 1
}

# Save result to JSON file
save_result() {
    local job_id="$1"
    local result_json="$2"

    ensure_output_dir
    local output_file="${OUTPUT_DIR}/${job_id}.json"
    echo "$result_json" > "$output_file"
    echo -e "${GREEN}Result saved: $output_file${NC}"
}

# Health command
cmd_health() {
    check_deps
    ensure_config
    local api_url=$(load_api_url)

    echo "Checking API at: $api_url"
    if check_health "$api_url"; then
        echo -e "${GREEN}Status: OK${NC}"
        curl -s "${api_url}/health"
        echo ""
    else
        echo -e "${RED}Status: FAILED${NC}"
        exit 1
    fi
}

# Config command
cmd_config() {
    check_deps
    ensure_config

    local action=""
    local key=""
    local value=""

    while [[ $# -gt 0 ]]; do
        case $1 in
            --get) action="get"; key="$2"; shift 2 ;;
            --set) action="set"; key="$2"; value="$3"; shift 3 ;;
            --reset) action="reset"; shift ;;
            --list) action="list"; shift ;;
            --check-key) action="check-key"; shift ;;
            *) shift ;;
        esac
    done

    case "$action" in
        "check-key")
            local api_key=$(get_config "api_key")
            if [ -n "$api_key" ]; then
                echo "api_key: configured"
            else
                echo "api_key: empty"
            fi
            ;;
        "get")
            [ -z "$key" ] && { echo -e "${RED}Error: --get requires KEY${NC}"; exit 1; }
            local result=$(get_config "$key")
            [ -n "$result" ] && echo "$key = $result" || echo "Key not found: $key"
            ;;
        "set")
            [ -z "$key" ] || [ -z "$value" ] && { echo -e "${RED}Error: --set requires KEY VALUE${NC}"; exit 1; }
            set_config "$key" "$value"
            ;;
        "reset")
            echo "$DEFAULT_CONFIG" > "$CONFIG_FILE"
            echo -e "${GREEN}Configuration reset to defaults.${NC}"
            jq 'walk(if type == "object" and has("api_key") and (.api_key | length) > 0 then .api_key = "***" else . end)' "$CONFIG_FILE"
            ;;
        "list")
            echo "Current configuration:"
            jq 'walk(if type == "object" and has("api_key") and (.api_key | length) > 0 then .api_key = "***" else . end)' "$CONFIG_FILE"
            ;;
        *)
            echo "Config file: $CONFIG_FILE"
            echo "Output dir: $OUTPUT_DIR"
            echo "----------------------------------------"
            cat "$CONFIG_FILE"
            echo "----------------------------------------"
            echo ""
            echo "Usage:"
            echo "  config --list              Show config"
            echo "  config --get <key>         Get value"
            echo "  config --set <key> <val>   Set value"
            echo "  config --reset             Reset to defaults"
            ;;
    esac
}

# Models command
cmd_models() {
    check_deps
    local api_url=$(ensure_connection)
    local api_key=$(load_api_key)

    echo "Available Models:"
    echo "----------------------------------------"
    if [ -n "$api_key" ]; then
        curl -s -H "Authorization: Bearer ${api_key}" "${api_url}/v1/models"
    else
        curl -s "${api_url}/v1/models"
    fi
    echo ""
}

# Query job result via /query_result endpoint
query_job_result() {
    local api_url="$1"
    local job_id="$2"
    local api_key=$(load_api_key)

    local payload=$(jq -n --arg id "$job_id" '{"task_id_list": [$id]}')

    if [ -n "$api_key" ]; then
        curl -s -X POST "${api_url}/query_result" \
            -H "Content-Type: application/json; charset=utf-8" \
            -H "Authorization: Bearer ${api_key}" \
            -d "$payload"
    else
        curl -s -X POST "${api_url}/query_result" \
            -H "Content-Type: application/json; charset=utf-8" \
            -d "$payload"
    fi
}

# Parse query_result response to extract status (0=processing, 1=success, 2=failed)
# Response is wrapped: {"data": [...], "code": 200, ...}
# Uses temp file to avoid jq pipe issues with special characters on Windows
parse_query_status() {
    local response="$1"
    local tmp_file=$(mktemp)
    printf '%s' "$response" > "$tmp_file"
    jq -r '.data[0].status // .[0].status // 0' "$tmp_file"
    rm -f "$tmp_file"
}

# Parse result JSON string from query_result response
# The result field is a JSON string that needs to be parsed
# Uses temp file to avoid jq pipe issues with special characters on Windows
parse_query_result() {
    local response="$1"
    local tmp_file=$(mktemp)
    printf '%s' "$response" > "$tmp_file"
    jq -r '.data[0].result // .[0].result // "[]"' "$tmp_file"
    rm -f "$tmp_file"
}

# Extract audio file paths from result (returns newline-separated paths)
# Uses temp file to avoid jq pipe issues with special characters on Windows
parse_audio_files() {
    local result="$1"
    local tmp_file=$(mktemp)
    printf '%s' "$result" > "$tmp_file"
    jq -r '.[].file // empty' "$tmp_file" 2>/dev/null
    rm -f "$tmp_file"
}

# Extract metas value from result
# Uses temp file to avoid jq pipe issues with special characters on Windows
parse_metas_value() {
    local result="$1"
    local key="$2"
    local tmp_file=$(mktemp)
    printf '%s' "$result" > "$tmp_file"
    jq -r ".[0].metas.$key // .[0].$key // empty" "$tmp_file" 2>/dev/null
    rm -f "$tmp_file"
}

# Status command
cmd_status() {
    check_deps
    local job_id="$1"

    [ -z "$job_id" ] && { echo -e "${RED}Error: job_id required${NC}"; echo "Usage: $0 status <job_id>"; exit 1; }

    local api_url=$(ensure_connection)
    local response=$(query_job_result "$api_url" "$job_id")

    local status=$(parse_query_status "$response")
    echo "Job ID: $job_id"

    case "$status" in
        0)
            echo "Status: processing"
            ;;
        1)
            echo "Status: succeeded"
            echo ""
            local result_file=$(mktemp)
            parse_query_result "$response" > "$result_file"

            local bpm=$(jq -r '.[0].metas.bpm // .[0].bpm // empty' "$result_file" 2>/dev/null)
            local keyscale=$(jq -r '.[0].metas.keyscale // .[0].keyscale // empty' "$result_file" 2>/dev/null)
            local duration=$(jq -r '.[0].metas.duration // .[0].duration // empty' "$result_file" 2>/dev/null)

            echo "Result:"
            [ -n "$bpm" ] && echo "  BPM: $bpm"
            [ -n "$keyscale" ] && echo "  Key: $keyscale"
            [ -n "$duration" ] && echo "  Duration: ${duration}s"

            # Save and download
            save_result "$job_id" "$response"
            download_audios "$api_url" "$job_id" "$result_file"
            rm -f "$result_file"
            ;;
        2)
            echo "Status: failed"
            echo ""
            echo -e "${RED}Task failed${NC}"
            ;;
        *)
            echo "Status: unknown ($status)"
            ;;
    esac
}

# Download audio files from result file
# Usage: download_audios <api_url> <job_id> <result_file>
download_audios() {
    local api_url="$1"
    local job_id="$2"
    local result_file="$3"
    local api_key=$(load_api_key)

    ensure_output_dir

    local audio_format=$(get_config "generation.audio_format")
    [ -z "$audio_format" ] && audio_format="mp3"

    # Read result file content and extract audio paths using pipe (avoid temp file path issues on Windows)
    local result_content
    result_content=$(cat "$result_file" 2>/dev/null)

    if [ -z "$result_content" ]; then
        echo -e "  ${RED}Error: Result file is empty or cannot be read${NC}"
        return 1
    fi

    # Extract audio paths using pipe instead of file (better Windows compatibility)
    local audio_paths
    audio_paths=$(echo "$result_content" | jq -r '.[].file // empty' 2>&1)
    local jq_exit_code=$?

    if [ $jq_exit_code -ne 0 ]; then
        echo -e "  ${RED}Error: Failed to parse result JSON${NC}"
        echo -e "  ${RED}jq error: $audio_paths${NC}"
        return 1
    fi

    if [ -z "$audio_paths" ]; then
        echo -e "  ${YELLOW}No audio files found in result${NC}"
        return 0
    fi

    local count=1
    while IFS= read -r audio_path; do
        # Skip empty lines and remove potential Windows carriage return
        audio_path=$(echo "$audio_path" | tr -d '\r')
        if [ -n "$audio_path" ]; then
            local output_file="${OUTPUT_DIR}/${job_id}_${count}.${audio_format}"
            local download_url="${api_url}${audio_path}"

            echo -e "  ${CYAN}Downloading audio $count...${NC}"
            local curl_output
            local curl_exit_code
            if [ -n "$api_key" ]; then
                curl_output=$(curl -s --connect-timeout 10 --max-time 300 \
                    -w "%{http_code}" \
                    -o "$output_file" \
                    -H "Authorization: Bearer ${api_key}" \
                    "$download_url" 2>&1)
                curl_exit_code=$?
            else
                curl_output=$(curl -s --connect-timeout 10 --max-time 300 \
                    -w "%{http_code}" \
                    -o "$output_file" \
                    "$download_url" 2>&1)
                curl_exit_code=$?
            fi

            if [ $curl_exit_code -ne 0 ]; then
                echo -e "  ${RED}Failed to download (curl error $curl_exit_code): $download_url${NC}"
                rm -f "$output_file" 2>/dev/null
            elif [ -f "$output_file" ] && [ -s "$output_file" ]; then
                echo -e "  ${GREEN}Saved: $output_file${NC}"
            else
                echo -e "  ${RED}Failed to download (HTTP $curl_output): $download_url${NC}"
                rm -f "$output_file" 2>/dev/null
            fi
            count=$((count + 1))
        fi
    done <<< "$audio_paths"
}

# =============================================================================
# Completion Mode (OpenRouter /v1/chat/completions)
# =============================================================================

# Load api_mode from config (default: native)
load_api_mode() {
    local mode=$(get_config "api_mode")
    echo "${mode:-native}"
}

# Get model ID from /v1/models endpoint for completion mode
get_completion_model() {
    local api_url="$1"
    local user_model="$2"
    local api_key=$(load_api_key)

    # If user specified a model, prefix with acemusic/ if needed
    if [ -n "$user_model" ]; then
        if [[ "$user_model" == */* ]]; then
            echo "$user_model"
        else
            echo "acemusic/${user_model}"
        fi
        return
    fi

    # Query /v1/models for the first available model
    local response
    if [ -n "$api_key" ]; then
        response=$(curl -s -H "Authorization: Bearer ${api_key}" "${api_url}/v1/models" 2>/dev/null)
    else
        response=$(curl -s "${api_url}/v1/models" 2>/dev/null)
    fi

    local model_id
    model_id=$(echo "$response" | jq -r '.data[0].id // empty' 2>/dev/null)
    echo "${model_id:-acemusic/acestep-v15-turbo}"
}

# Decode base64 audio data URL and save to file
# Handles cross-platform compatibility (Linux/macOS/Windows MSYS)
decode_base64_audio() {
    local data_url="$1"
    local output_file="$2"

    # Strip data URL prefix: data:audio/mpeg;base64,...
    local b64_data="${data_url#data:*;base64,}"

    local tmp_b64=$(mktemp)
    printf '%s' "$b64_data" > "$tmp_b64"

    if command -v base64 &> /dev/null; then
        # Linux / macOS / MSYS2
        base64 -d < "$tmp_b64" > "$output_file" 2>/dev/null || \
        base64 -D < "$tmp_b64" > "$output_file" 2>/dev/null || \
        python3 -c "import base64,sys; sys.stdout.buffer.write(base64.b64decode(sys.stdin.read()))" < "$tmp_b64" > "$output_file" 2>/dev/null || \
        python -c "import base64,sys; sys.stdout.buffer.write(base64.b64decode(sys.stdin.read()))" < "$tmp_b64" > "$output_file" 2>/dev/null
    else
        # Fallback to python
        python3 -c "import base64,sys; sys.stdout.buffer.write(base64.b64decode(sys.stdin.read()))" < "$tmp_b64" > "$output_file" 2>/dev/null || \
        python -c "import base64,sys; sys.stdout.buffer.write(base64.b64decode(sys.stdin.read()))" < "$tmp_b64" > "$output_file" 2>/dev/null
    fi

    local decode_ok=$?
    rm -f "$tmp_b64"
    return $decode_ok
}

# Parse completion response: extract metadata, save audio files
# Usage: parse_completion_response <response_file> <job_id>
parse_completion_response() {
    local resp_file="$1"
    local job_id="$2"

    ensure_output_dir

    local audio_format=$(get_config "generation.audio_format")
    [ -z "$audio_format" ] && audio_format="mp3"

    # Check for error
    local finish_reason
    finish_reason=$(jq -r '.choices[0].finish_reason // "stop"' "$resp_file" 2>/dev/null)
    if [ "$finish_reason" = "error" ]; then
        local err_content
        err_content=$(jq -r '.choices[0].message.content // "Unknown error"' "$resp_file" 2>/dev/null)
        echo -e "${RED}Generation failed: $err_content${NC}"
        return 1
    fi

    # Extract and display text content (metadata + lyrics)
    local content
    content=$(jq -r '.choices[0].message.content // empty' "$resp_file" 2>/dev/null)
    if [ -n "$content" ]; then
        echo "$content"
        echo ""
    fi

    # Extract and save audio files
    local audio_count
    audio_count=$(jq -r '.choices[0].message.audio | length // 0' "$resp_file" 2>/dev/null)

    if [ "$audio_count" -gt 0 ] 2>/dev/null; then
        local i=0
        while [ "$i" -lt "$audio_count" ]; do
            local audio_url
            audio_url=$(jq -r ".choices[0].message.audio[$i].audio_url.url // empty" "$resp_file" 2>/dev/null)

            if [ -n "$audio_url" ]; then
                local output_file="${OUTPUT_DIR}/${job_id}_$((i+1)).${audio_format}"
                echo -e "  ${CYAN}Decoding audio $((i+1))...${NC}"

                if decode_base64_audio "$audio_url" "$output_file"; then
                    if [ -f "$output_file" ] && [ -s "$output_file" ]; then
                        echo -e "  ${GREEN}Saved: $output_file${NC}"
                    else
                        echo -e "  ${RED}Failed to decode audio $((i+1))${NC}"
                        rm -f "$output_file" 2>/dev/null
                    fi
                else
                    echo -e "  ${RED}Failed to decode audio $((i+1))${NC}"
                    rm -f "$output_file" 2>/dev/null
                fi
            fi
            i=$((i+1))
        done
    else
        echo -e "  ${YELLOW}No audio files in response${NC}"
    fi

    # Save full response JSON (strip base64 audio to keep file small)
    local clean_resp
    clean_resp=$(jq 'del(.choices[].message.audio[].audio_url.url)' "$resp_file" 2>/dev/null)
    if [ -n "$clean_resp" ]; then
        save_result "$job_id" "$clean_resp"
    else
        save_result "$job_id" "$(cat "$resp_file")"
    fi
}

# Send request to /v1/chat/completions and handle response
# Usage: send_completion_request <api_url> <payload_file> <job_id_var>
send_completion_request() {
    local api_url="$1"
    local payload_file="$2"
    local api_key=$(load_api_key)

    local resp_file=$(mktemp)

    local http_code
    if [ -n "$api_key" ]; then
        http_code=$(curl -s -w "%{http_code}" --connect-timeout 10 --max-time 660 \
            -o "$resp_file" \
            -X POST "${api_url}/v1/chat/completions" \
            -H "Content-Type: application/json; charset=utf-8" \
            -H "Authorization: Bearer ${api_key}" \
            -A "curl/8.7.1" \
            --data-binary "@${payload_file}")
    else
        http_code=$(curl -s -w "%{http_code}" --connect-timeout 10 --max-time 660 \
            -o "$resp_file" \
            -X POST "${api_url}/v1/chat/completions" \
            -H "Content-Type: application/json; charset=utf-8" \
            -A "curl/8.7.1" \
            --data-binary "@${payload_file}")
    fi

    rm -f "$payload_file"

    if [ "$http_code" != "200" ]; then
        local err_detail
        err_detail=$(jq -r '.detail // .error.message // empty' "$resp_file" 2>/dev/null)
        echo -e "${RED}Error: HTTP $http_code${NC}"
        [ -n "$err_detail" ] && echo -e "${RED}$err_detail${NC}"
        rm -f "$resp_file"
        return 1
    fi

    # Generate a job_id from the completion id
    local job_id
    job_id=$(jq -r '.id // empty' "$resp_file" 2>/dev/null)
    [ -z "$job_id" ] && job_id="completion-$(date +%s)"

    echo ""
    echo -e "${GREEN}Generation completed!${NC}"
    echo ""

    parse_completion_response "$resp_file" "$job_id"
    rm -f "$resp_file"

    echo ""
    echo -e "${GREEN}Done! Files saved to: $OUTPUT_DIR${NC}"
    show_star_prompt
}

# Wait for job and download results
wait_for_job() {
    local api_url="$1"
    local job_id="$2"

    echo "Job created: $job_id"
    echo "Output: $OUTPUT_DIR"
    echo ""

    while true; do
        local response=$(query_job_result "$api_url" "$job_id")
        local status=$(parse_query_status "$response")

        case "$status" in
            1)
                echo ""
                echo -e "${GREEN}Generation completed!${NC}"
                echo ""

                local result_file=$(mktemp)
                parse_query_result "$response" > "$result_file"

                local bpm=$(jq -r '.[0].metas.bpm // .[0].bpm // empty' "$result_file" 2>/dev/null)
                local keyscale=$(jq -r '.[0].metas.keyscale // .[0].keyscale // empty' "$result_file" 2>/dev/null)
                local duration=$(jq -r '.[0].metas.duration // .[0].duration // empty' "$result_file" 2>/dev/null)

                echo "Metadata:"
                [ -n "$bpm" ] && echo "  BPM: $bpm"
                [ -n "$keyscale" ] && echo "  Key: $keyscale"
                [ -n "$duration" ] && echo "  Duration: ${duration}s"
                echo ""

                # Save result JSON
                save_result "$job_id" "$response"

                # Download audio files
                echo "Downloading audio files..."
                download_audios "$api_url" "$job_id" "$result_file"
                rm -f "$result_file"

                echo ""
                echo -e "${GREEN}Done! Files saved to: $OUTPUT_DIR${NC}"
                show_star_prompt
                return 0
                ;;
            2)
                echo ""
                echo -e "${RED}Generation failed!${NC}"

                # Save error result
                save_result "$job_id" "$response"
                return 1
                ;;
            0)
                printf "\rProcessing...              "
                ;;
            *)
                printf "\rWaiting...                 "
                ;;
        esac
        sleep 5
    done
}

# Generate command
cmd_generate() {
    check_deps
    ensure_config

    local caption="" lyrics="" description="" thinking="" use_format=""
    local no_thinking=false no_format=false no_wait=false
    local model="" language="" steps="" guidance="" seed="" duration="" bpm="" batch=""
    local task_type="" src_audio="" cover_strength="" repaint_start="" repaint_end=""
    local key_scale="" time_signature=""

    while [[ $# -gt 0 ]]; do
        case $1 in
            --caption|-c) caption="$2"; shift 2 ;;
            --lyrics|-l) lyrics="$2"; shift 2 ;;
            --description|-d) description="$2"; shift 2 ;;
            --thinking|-t) thinking="true"; shift ;;
            --no-thinking) no_thinking=true; shift ;;
            --use-format) use_format="true"; shift ;;
            --no-format) no_format=true; shift ;;
            --model|-m) model="$2"; shift 2 ;;
            --language|--vocal-language) language="$2"; shift 2 ;;
            --steps) steps="$2"; shift 2 ;;
            --guidance) guidance="$2"; shift 2 ;;
            --seed) seed="$2"; shift 2 ;;
            --duration) duration="$2"; shift 2 ;;
            --bpm) bpm="$2"; shift 2 ;;
            --batch) batch="$2"; shift 2 ;;
            --no-wait) no_wait=true; shift ;;
            --task-type) task_type="$2"; shift 2 ;;
            --src-audio) src_audio="$2"; shift 2 ;;
            --cover-strength) cover_strength="$2"; shift 2 ;;
            --repaint-start) repaint_start="$2"; shift 2 ;;
            --repaint-end) repaint_end="$2"; shift 2 ;;
            --key-scale|--key) key_scale="$2"; shift 2 ;;
            --time-signature|--time-sig) time_signature="$2"; shift 2 ;;
            *) [ -z "$caption" ] && caption="$1"; shift ;;
        esac
    done

    # If no caption but has description, use simple mode
    if [ -z "$caption" ] && [ -z "$description" ]; then
        echo -e "${RED}Error: caption or description required${NC}"
        echo "Usage: $0 generate \"Music description\" [options]"
        echo "       $0 generate -d \"Simple description\" [options]"
        exit 1
    fi

    local api_url=$(ensure_connection)

    # Get defaults
    local def_thinking=$(get_config "generation.thinking")
    local def_format=$(get_config "generation.use_format")
    local def_cot_caption=$(get_config "generation.use_cot_caption")
    local def_cot_language=$(get_config "generation.use_cot_language")
    local def_language=$(get_config "generation.vocal_language")
    local def_audio_format=$(get_config "generation.audio_format")

    [ -z "$thinking" ] && thinking="${def_thinking:-true}"
    [ -z "$use_format" ] && use_format="${def_format:-true}"
    [ -z "$language" ] && language="${def_language:-en}"

    [ "$no_thinking" = true ] && thinking="false"
    [ "$no_format" = true ] && use_format="false"

    # Normalize boolean values for jq --argjson
    thinking=$(normalize_bool "$thinking" "true")
    use_format=$(normalize_bool "$use_format" "true")
    local cot_caption=$(normalize_bool "$def_cot_caption" "true")
    local cot_language=$(normalize_bool "$def_cot_language" "true")

    # Build payload using jq for proper escaping
    local payload=$(jq -n \
        --arg prompt "$caption" \
        --arg lyrics "${lyrics:-}" \
        --arg sample_query "${description:-}" \
        --argjson thinking "$thinking" \
        --argjson use_format "$use_format" \
        --argjson use_cot_caption "$cot_caption" \
        --argjson use_cot_language "$cot_language" \
        --arg vocal_language "$language" \
        --arg audio_format "${def_audio_format:-mp3}" \
        '{
            prompt: $prompt,
            lyrics: $lyrics,
            sample_query: $sample_query,
            thinking: $thinking,
            use_format: $use_format,
            use_cot_caption: $use_cot_caption,
            use_cot_language: $use_cot_language,
            vocal_language: $vocal_language,
            audio_format: $audio_format,
            use_random_seed: true
        }')

    # Validate src_audio file exists if provided
    if [ -n "$src_audio" ]; then
        if [ ! -f "$src_audio" ]; then
            echo -e "${RED}Error: Source audio file not found: $src_audio${NC}"
            exit 1
        fi
        # Default task_type to "cover" when src_audio is provided
        [ -z "$task_type" ] && task_type="cover"
    fi

    # Add optional parameters
    [ -n "$model" ] && payload=$(echo "$payload" | jq --arg v "$model" '. + {model: $v}')
    [ -n "$steps" ] && payload=$(echo "$payload" | jq --argjson v "$steps" '. + {inference_steps: $v}')
    [ -n "$guidance" ] && payload=$(echo "$payload" | jq --argjson v "$guidance" '. + {guidance_scale: $v}')
    [ -n "$seed" ] && payload=$(echo "$payload" | jq --argjson v "$seed" '. + {seed: $v, use_random_seed: false}')
    [ -n "$duration" ] && payload=$(echo "$payload" | jq --argjson v "$duration" '. + {audio_duration: $v}')
    [ -n "$bpm" ] && payload=$(echo "$payload" | jq --argjson v "$bpm" '. + {bpm: $v}')
    [ -n "$batch" ] && payload=$(echo "$payload" | jq --argjson v "$batch" '. + {batch_size: $v}')
    [ -n "$task_type" ] && payload=$(echo "$payload" | jq --arg v "$task_type" '. + {task_type: $v}')
    [ -n "$src_audio" ] && payload=$(echo "$payload" | jq --arg v "$src_audio" '. + {src_audio_path: $v}')
    [ -n "$cover_strength" ] && payload=$(echo "$payload" | jq --argjson v "$cover_strength" '. + {audio_cover_strength: $v}')
    [ -n "$repaint_start" ] && payload=$(echo "$payload" | jq --argjson v "$repaint_start" '. + {repainting_start: $v}')
    [ -n "$repaint_end" ] && payload=$(echo "$payload" | jq --argjson v "$repaint_end" '. + {repainting_end: $v}')
    [ -n "$key_scale" ] && payload=$(echo "$payload" | jq --arg v "$key_scale" '. + {key_scale: $v}')
    [ -n "$time_signature" ] && payload=$(echo "$payload" | jq --arg v "$time_signature" '. + {time_signature: $v}')

    local api_mode=$(load_api_mode)

    echo "Generating music..."
    if [ -n "$task_type" ] && [ "$task_type" != "text2music" ]; then
        echo "  Mode: $(echo "$task_type" | awk '{print toupper(substr($0,1,1)) substr($0,2)}') (${task_type})"
        [ -n "$src_audio" ] && echo "  Source audio: $src_audio"
    elif [ -n "$description" ]; then
        echo "  Mode: Simple (description)"
        echo "  Description: ${description:0:50}..."
    else
        echo "  Mode: Caption"
        echo "  Caption: ${caption:0:50}..."
    fi
    echo "  Thinking: $thinking, Format: $use_format"
    echo "  API: $api_mode"
    echo "  Output: $OUTPUT_DIR"
    echo ""

    if [ "$api_mode" = "completion" ]; then
        # --- Completion mode: /v1/chat/completions ---
        local model_id=$(get_completion_model "$api_url" "$model")

        # Build message content parts
        local message_content=""
        local sample_mode=false
        if [ -n "$description" ]; then
            message_content="$description"
            sample_mode=true
        else
            message_content="<prompt>${caption}</prompt>"
            [ -n "$lyrics" ] && message_content="${message_content}<lyrics>${lyrics}</lyrics>"
        fi

        # Build completion payload
        local payload_c
        if [ -n "$src_audio" ]; then
            # Audio input mode: use multipart content array with text + input_audio
            # Encode audio to base64 using python to avoid shell argument limits
            local audio_b64_file=$(mktemp)
            python3 -c "
import base64, sys
with open(sys.argv[1], 'rb') as f:
    sys.stdout.write(base64.b64encode(f.read()).decode('ascii'))
" "$src_audio" > "$audio_b64_file"

            local audio_ext="${src_audio##*.}"
            [ -z "$audio_ext" ] && audio_ext="mp3"

            # Build payload with audio using jq --rawfile to read base64 from file
            payload_c=$(jq -n \
                --arg model "$model_id" \
                --arg text_content "$message_content" \
                --rawfile audio_b64 "$audio_b64_file" \
                --arg audio_format "$audio_ext" \
                --argjson thinking "$thinking" \
                --argjson use_format "$use_format" \
                --argjson sample_mode "$sample_mode" \
                --argjson use_cot_caption "$cot_caption" \
                --argjson use_cot_language "$cot_language" \
                --arg vocal_language "$language" \
                --arg format "${def_audio_format:-mp3}" \
                --arg task_type "${task_type:-text2music}" \
                '{
                    model: $model,
                    messages: [{
                        "role": "user",
                        "content": [
                            {"type": "text", "text": $text_content},
                            {"type": "input_audio", "input_audio": {"data": $audio_b64, "format": $audio_format}}
                        ]
                    }],
                    stream: false,
                    thinking: $thinking,
                    use_format: $use_format,
                    sample_mode: $sample_mode,
                    use_cot_caption: $use_cot_caption,
                    use_cot_language: $use_cot_language,
                    task_type: $task_type,
                    audio_config: {
                        format: $format,
                        vocal_language: $vocal_language
                    }
                }')

            rm -f "$audio_b64_file"

            # Add cover/repainting parameters
            [ -n "$cover_strength" ] && payload_c=$(echo "$payload_c" | jq --argjson v "$cover_strength" '. + {audio_cover_strength: $v}')
            [ -n "$repaint_start" ] && payload_c=$(echo "$payload_c" | jq --argjson v "$repaint_start" '. + {repainting_start: $v}')
            [ -n "$repaint_end" ] && payload_c=$(echo "$payload_c" | jq --argjson v "$repaint_end" '. + {repainting_end: $v}')
        else
            # Text-only mode: use string content
            payload_c=$(jq -n \
                --arg model "$model_id" \
                --arg content "$message_content" \
                --argjson thinking "$thinking" \
                --argjson use_format "$use_format" \
                --argjson sample_mode "$sample_mode" \
                --argjson use_cot_caption "$cot_caption" \
                --argjson use_cot_language "$cot_language" \
                --arg vocal_language "$language" \
                --arg format "${def_audio_format:-mp3}" \
                '{
                    model: $model,
                    messages: [{"role": "user", "content": $content}],
                    stream: false,
                    thinking: $thinking,
                    use_format: $use_format,
                    sample_mode: $sample_mode,
                    use_cot_caption: $use_cot_caption,
                    use_cot_language: $use_cot_language,
                    audio_config: {
                        format: $format,
                        vocal_language: $vocal_language
                    }
                }')
        fi

        # Add optional parameters to completion payload
        [ -n "$guidance" ] && payload_c=$(echo "$payload_c" | jq --argjson v "$guidance" '. + {guidance_scale: $v}')
        [ -n "$seed" ] && payload_c=$(echo "$payload_c" | jq --argjson v "$seed" '. + {seed: $v}')
        [ -n "$batch" ] && payload_c=$(echo "$payload_c" | jq --argjson v "$batch" '. + {batch_size: $v}')
        [ -n "$duration" ] && payload_c=$(echo "$payload_c" | jq --argjson v "$duration" '.audio_config.duration = $v')
        [ -n "$bpm" ] && payload_c=$(echo "$payload_c" | jq --argjson v "$bpm" '.audio_config.bpm = $v')
        [ -n "$key_scale" ] && payload_c=$(echo "$payload_c" | jq --arg v "$key_scale" '.audio_config.key_scale = $v')
        [ -n "$time_signature" ] && payload_c=$(echo "$payload_c" | jq --arg v "$time_signature" '.audio_config.time_signature = $v')

        local temp_payload=$(mktemp)
        printf '%s' "$payload_c" > "$temp_payload"

        send_completion_request "$api_url" "$temp_payload"
    else
        # --- Native mode: /release_task + polling ---
        local temp_payload=$(mktemp)
        printf '%s' "$payload" > "$temp_payload"

        local api_key=$(load_api_key)
        local response
        if [ -n "$api_key" ]; then
            response=$(curl -s -X POST "${api_url}/release_task" \
                -H "Content-Type: application/json; charset=utf-8" \
                -H "Authorization: Bearer ${api_key}" \
                --data-binary "@${temp_payload}")
        else
            response=$(curl -s -X POST "${api_url}/release_task" \
                -H "Content-Type: application/json; charset=utf-8" \
                --data-binary "@${temp_payload}")
        fi

        rm -f "$temp_payload"

        local job_id=$(echo "$response" | jq -r '.data.task_id // .task_id // empty')
        [ -z "$job_id" ] && { echo -e "${RED}Error: Failed to create job${NC}"; echo "$response"; exit 1; }

        if [ "$no_wait" = true ]; then
            echo "Job ID: $job_id"
            echo "Use '$0 status $job_id' to check progress and download"
        else
            wait_for_job "$api_url" "$job_id"
        fi
    fi
}

# Random command
cmd_random() {
    check_deps
    ensure_config

    local thinking="" no_thinking=false no_wait=false

    while [[ $# -gt 0 ]]; do
        case $1 in
            --thinking|-t) thinking="true"; shift ;;
            --no-thinking) no_thinking=true; shift ;;
            --no-wait) no_wait=true; shift ;;
            *) shift ;;
        esac
    done

    local api_url=$(ensure_connection)

    local def_thinking=$(get_config "generation.thinking")
    [ -z "$thinking" ] && thinking="${def_thinking:-true}"
    [ "$no_thinking" = true ] && thinking="false"

    # Normalize boolean for jq --argjson
    thinking=$(normalize_bool "$thinking" "true")

    local api_mode=$(load_api_mode)

    echo "Generating random music..."
    echo "  Thinking: $thinking"
    echo "  API: $api_mode"
    echo "  Output: $OUTPUT_DIR"
    echo ""

    if [ "$api_mode" = "completion" ]; then
        # --- Completion mode ---
        local model_id=$(get_completion_model "$api_url" "")
        local def_audio_format=$(get_config "generation.audio_format")

        local payload_c=$(jq -n \
            --arg model "$model_id" \
            --argjson thinking "$thinking" \
            --arg format "${def_audio_format:-mp3}" \
            '{
                model: $model,
                messages: [{"role": "user", "content": "Generate a random song"}],
                stream: false,
                sample_mode: true,
                thinking: $thinking,
                audio_config: { format: $format }
            }')

        local temp_payload=$(mktemp)
        printf '%s' "$payload_c" > "$temp_payload"

        send_completion_request "$api_url" "$temp_payload"
    else
        # --- Native mode ---
        local payload=$(jq -n --argjson thinking "$thinking" '{sample_mode: true, thinking: $thinking}')

        local temp_payload=$(mktemp)
        printf '%s' "$payload" > "$temp_payload"

        local api_key=$(load_api_key)
        local response
        if [ -n "$api_key" ]; then
            response=$(curl -s -X POST "${api_url}/release_task" \
                -H "Content-Type: application/json; charset=utf-8" \
                -H "Authorization: Bearer ${api_key}" \
                --data-binary "@${temp_payload}")
        else
            response=$(curl -s -X POST "${api_url}/release_task" \
                -H "Content-Type: application/json; charset=utf-8" \
                --data-binary "@${temp_payload}")
        fi

        rm -f "$temp_payload"

        local job_id=$(echo "$response" | jq -r '.data.task_id // .task_id // empty')
        [ -z "$job_id" ] && { echo -e "${RED}Error: Failed to create job${NC}"; echo "$response"; exit 1; }

        if [ "$no_wait" = true ]; then
            echo "Job ID: $job_id"
            echo "Use '$0 status $job_id' to check progress and download"
        else
            wait_for_job "$api_url" "$job_id"
        fi
    fi
}

# Cover command (shortcut for generate --task-type cover --src-audio)
cmd_cover() {
    check_deps
    ensure_config

    local src_audio=""
    local args=()

    # Extract src_audio as first positional arg, pass rest to generate
    while [[ $# -gt 0 ]]; do
        case $1 in
            --src-audio) src_audio="$2"; shift 2 ;;
            -*) args+=("$1"); shift ;;
            *)
                if [ -z "$src_audio" ]; then
                    src_audio="$1"; shift
                else
                    args+=("$1"); shift
                fi
                ;;
        esac
    done

    if [ -z "$src_audio" ]; then
        echo -e "${RED}Error: source audio file required${NC}"
        echo "Usage: $0 cover <audio_file> -c \"caption\" -l \"lyrics\" [options]"
        exit 1
    fi

    cmd_generate --src-audio "$src_audio" --task-type cover "${args[@]}"
}

# Help
show_help() {
    echo "ACE-Step Music Generation CLI"
    echo ""
    echo "Requirements: curl, jq"
    echo ""
    echo "Usage: $0 <command> [options]"
    echo ""
    echo "Commands:"
    echo "  generate    Generate music from text"
    echo "  cover       Cover/repainting from source audio"
    echo "  random      Generate random music"
    echo "  status      Check job status and download results"
    echo "  models      List available models"
    echo "  health      Check API health"
    echo "  config      Manage configuration"
    echo ""
    echo "Output:"
    echo "  Results saved to: $OUTPUT_DIR/<job_id>.json"
    echo "  Audio files: $OUTPUT_DIR/<job_id>_1.mp3, ..."
    echo ""
    echo "Generate Options:"
    echo "  -c, --caption       Music style/genre description (caption mode)"
    echo "  -d, --description   Simple description, LM auto-generates caption/lyrics"
    echo "  -l, --lyrics        Lyrics text"
    echo "  -t, --thinking      Enable thinking mode (default: true)"
    echo "  --no-thinking       Disable thinking mode"
    echo "  --no-format         Disable format enhancement"
    echo "  --duration          Duration in seconds"
    echo "  --bpm               Beats per minute"
    echo "  --key-scale         Musical key (e.g. \"E minor\")"
    echo "  --time-signature    Time signature (e.g. \"4/4\")"
    echo ""
    echo "Cover/Repainting Options:"
    echo "  --src-audio         Source audio file path"
    echo "  --task-type         Task type: cover, repaint, text2music (default: auto)"
    echo "  --cover-strength    Cover strength 0.0-1.0 (default: 1.0)"
    echo "  --repaint-start     Repainting start position in seconds"
    echo "  --repaint-end       Repainting end position in seconds"
    echo ""
    echo "Examples:"
    echo "  $0 generate \"Pop music with guitar\"           # Caption mode"
    echo "  $0 generate -d \"A February love song\"         # Simple mode (LM generates)"
    echo "  $0 generate -c \"Jazz\" -l \"[Verse] Hello\"      # With lyrics"
    echo "  $0 cover song.mp3 -c \"Rock cover\" -l \"[Verse] ...\" --duration 120"
    echo "  $0 generate --src-audio song.mp3 --task-type repaint -c \"Pop\" --repaint-start 30 --repaint-end 60"
    echo "  $0 random"
    echo "  $0 status <job_id>"
    echo "  $0 config --set generation.thinking false"
}

# Main
case "$1" in
    generate) shift; cmd_generate "$@" ;;
    cover) shift; cmd_cover "$@" ;;
    random) shift; cmd_random "$@" ;;
    status) shift; cmd_status "$@" ;;
    models) cmd_models ;;
    health) cmd_health ;;
    config) shift; cmd_config "$@" ;;
    help|--help|-h) show_help ;;
    *) show_help; exit 1 ;;
esac

```



---

## Skill Companion Files

> Additional files collected from the skill directory layout.

### scripts/config.example.json

```json
{
  "api_url": "https://api.acemusic.ai",
  "api_key": "",
  "api_mode": "completion",
  "generation": {
    "thinking": true,
    "use_format": false,
    "use_cot_caption": true,
    "use_cot_language": false,
    "audio_format": "mp3",
    "batch_size": 1,
    "vocal_language": "en"
  }
}

```

acestep | SkillHub