phoenix-cli
Debug LLM applications using the Phoenix CLI. Fetch traces, analyze errors, review experiments, and inspect datasets. Use when debugging AI/LLM applications, analyzing trace data, working with Phoenix observability, or investigating LLM performance issues.
Packaged view
This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.
Install command
npx @skill-hub/cli install arize-ai-phoenix-phoenix-cli
Repository
Skill path: skills/phoenix-cli
Best for
Primary workflow: Analyze Data & AI.
Technical facets: Full Stack, Data / AI, Testing.
Target audience: everyone.
License: Apache-2.0.
Original source
Catalog source: SkillHub Club.
Repository owner: Arize-ai.
This is still a mirrored public skill entry. Review the repository before installing into production workflows.
What it helps with
- Install phoenix-cli into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/Arize-ai/phoenix before adding phoenix-cli to shared team environments
- Use phoenix-cli for development workflows
Original source / Raw SKILL.md
---
name: phoenix-cli
description: Debug LLM applications using the Phoenix CLI. Fetch traces, analyze errors, review experiments, and inspect datasets. Use when debugging AI/LLM applications, analyzing trace data, working with Phoenix observability, or investigating LLM performance issues.
license: Apache-2.0
metadata:
  author: arize-ai
  version: "1.0"
---
# Phoenix CLI
Debug and analyze LLM applications using the Phoenix CLI (`px`).
## Quick Start
### Installation
```bash
npm install -g @arizeai/phoenix-cli
# Or run directly with npx
npx @arizeai/phoenix-cli
```
### Configuration
Set environment variables before running commands:
```bash
export PHOENIX_HOST=http://localhost:6006
export PHOENIX_PROJECT=my-project
export PHOENIX_API_KEY=your-api-key # if authentication is enabled
```
CLI flags override environment variables when specified.
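A small pre-flight check can catch an unset variable before any command runs. The sketch below is not part of the Phoenix CLI: `px_env_check` is a hypothetical helper name, and treating `PHOENIX_HOST` and `PHOENIX_PROJECT` as required (with `PHOENIX_API_KEY` left optional) is an assumption based on the comments in the block above.

```bash
# Hypothetical helper (not part of px): report unset Phoenix variables.
# Assumption: PHOENIX_HOST and PHOENIX_PROJECT are needed; the API key is optional.
px_env_check() {
  local missing=0
  [ -n "${PHOENIX_HOST:-}" ] || { echo "missing: PHOENIX_HOST" >&2; missing=1; }
  [ -n "${PHOENIX_PROJECT:-}" ] || { echo "missing: PHOENIX_PROJECT" >&2; missing=1; }
  # Non-zero exit status when at least one variable is unset or empty.
  return "$missing"
}
```

For example, `px_env_check && px traces --limit 10` runs the fetch only when both variables are set. If you pass the equivalent CLI flags instead, the check is unnecessary.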
## Debugging Workflows
### Debug a failing LLM application
1. Fetch recent traces to see what's happening:
```bash
px traces --limit 10
```
2. Find failed traces:
```bash
px traces --limit 50 --format raw --no-progress | jq '.[] | select(.status == "ERROR")'
```
3. Get details on a specific trace:
```bash
px trace <trace-id>
```
4. Look for errors in spans:
```bash
px trace <trace-id> --format raw | jq '.spans[] | select(.status_code != "OK")'
```
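Steps 2 and 3 can be chained by extracting only the IDs of failed traces. This is a minimal sketch: `px_failed_trace_ids` is a hypothetical function name, and it assumes the raw output is a JSON array whose elements carry the `traceId` and `status` fields shown under Trace Structure.

```bash
# Hypothetical helper: print the ID of each failed trace among the last N traces.
# Assumes raw output is a JSON array of traces with `traceId` and `status` fields.
px_failed_trace_ids() {
  px traces --limit "${1:-50}" --format raw --no-progress |
    jq -r '.[] | select(.status == "ERROR") | .traceId'
}
```

Each printed ID can then be passed to `px trace <trace-id>` for inspection.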
### Find performance issues
1. Get the slowest traces:
```bash
px traces --limit 20 --format raw --no-progress | jq 'sort_by(-.duration) | .[0:5]'
```
2. Analyze span durations within a trace:
```bash
px trace <trace-id> --format raw | jq '.spans | sort_by(-.duration_ms) | .[0:5] | .[] | {name, duration_ms, span_kind}'
```
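The two steps can also be merged into one pass that ranks spans across all fetched traces. The sketch below assumes that each trace in the raw output embeds its `spans` array and that spans expose `duration_ms`, as the commands above imply; `px_slowest_spans` is a hypothetical name.

```bash
# Hypothetical helper: list the five slowest spans across the last N traces
# as tab-separated "trace id, span name, duration_ms" rows.
px_slowest_spans() {
  px traces --limit "${1:-20}" --format raw --no-progress |
    jq -r '[.[] | .traceId as $t | .spans[] | {trace: $t, name, duration_ms}]
           | sort_by(.duration_ms) | reverse | .[0:5]
           | .[] | [.trace, .name, .duration_ms] | @tsv'
}
```

Sorting ascending and then reversing, instead of negating the key, keeps the filter from failing on spans whose `duration_ms` is missing; those spans simply sort last.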
### Analyze LLM usage
Extract models and token counts:
```bash
px traces --limit 50 --format raw --no-progress | \
  jq -c '.[].spans[] | select(.span_kind == "LLM") | {model: .attributes["llm.model_name"], prompt_tokens: .attributes["llm.token_count.prompt"], completion_tokens: .attributes["llm.token_count.completion"]}'
```
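To see totals rather than one record per span, the same fields can be summed per model. This is a sketch under the same assumptions as the command above (each trace embeds its `spans`, and attribute keys are flat dotted strings); `px_token_totals` is a hypothetical name.

```bash
# Hypothetical helper: sum prompt and completion tokens per model.
# Output rows are tab-separated: model, prompt tokens, completion tokens.
px_token_totals() {
  px traces --limit "${1:-50}" --format raw --no-progress |
    jq -r '[.[].spans[] | select(.span_kind == "LLM")]
           | group_by(.attributes["llm.model_name"])
           | .[]
           | [ .[0].attributes["llm.model_name"],
               (map(.attributes["llm.token_count.prompt"] // 0) | add),
               (map(.attributes["llm.token_count.completion"] // 0) | add) ]
           | @tsv'
}
```

Missing token counts are treated as 0, so a span without usage data lowers no total but is still counted under its model.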
### Review experiment results
1. List datasets:
```bash
px datasets
```
2. List experiments for a dataset:
```bash
px experiments --dataset my-dataset
```
3. Analyze experiment failures:
```bash
px experiment <experiment-id> --format raw --no-progress | \
  jq '.[] | select(.error != null) | {input, error}'
```
4. Calculate average latency:
```bash
px experiment <experiment-id> --format raw --no-progress | \
  jq '[.[].latency_ms] | add / length'
```
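The average in step 4 fails on an experiment with no runs (division by zero) and on runs whose `latency_ms` is null. A more defensive variant might look like the sketch below; `px_avg_latency` is a hypothetical name, and the array-of-runs shape with a `latency_ms` field is taken from the commands above.

```bash
# Hypothetical helper: average latency over runs that report a numeric latency.
# Prints `null` when no run has a numeric `latency_ms`.
px_avg_latency() {
  px experiment "$1" --format raw --no-progress |
    jq '[.[].latency_ms | numbers]
        | if length == 0 then null else add / length end'
}
```

Runs without a numeric latency are excluded from both the sum and the count, so failed runs do not pull the average toward zero.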
## Command Reference
### px traces
Fetch recent traces from a project.
```bash
px traces [directory] [options]
```
| Option | Description |
|--------|-------------|
| `[directory]` | Save traces as JSON files to directory |
| `-n, --limit <number>` | Number of traces (default: 10) |
| `--last-n-minutes <number>` | Filter by time window |
| `--since <timestamp>` | Fetch since ISO timestamp |
| `--format <format>` | `pretty`, `json`, or `raw` |
| `--include-annotations` | Include span annotations |
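
The options can be combined, for instance to export a recent time window to disk. The wrapper below is a sketch only: `px_export_recent` is a hypothetical name, the defaults are arbitrary, and it assumes that `--limit` and `--last-n-minutes` may be passed together.

```bash
# Hypothetical wrapper: save traces from the last N minutes as JSON files.
px_export_recent() {
  local dir="${1:-./traces}" minutes="${2:-60}"
  # Create the target directory up front; whether px creates it is not assumed.
  mkdir -p "$dir"
  px traces "$dir" --last-n-minutes "$minutes" --limit 100 --no-progress
}
```

Calling `px_export_recent ./traces 15` would request up to 100 traces from the last 15 minutes and write them under `./traces`.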
### px trace
Fetch a specific trace by ID.
```bash
px trace <trace-id> [options]
```
| Option | Description |
|--------|-------------|
| `--file <path>` | Save to file |
| `--format <format>` | `pretty`, `json`, or `raw` |
| `--include-annotations` | Include span annotations |
### px datasets
List all datasets.
```bash
px datasets [options]
```
### px dataset
Fetch examples from a dataset.
```bash
px dataset <dataset-name> [options]
```
| Option | Description |
|--------|-------------|
| `--split <name>` | Filter by split (repeatable) |
| `--version <id>` | Specific dataset version |
| `--file <path>` | Save to file |
### px experiments
List experiments for a dataset.
```bash
px experiments --dataset <name> [directory]
```
| Option | Description |
|--------|-------------|
| `--dataset <name>` | Dataset name or ID (required) |
| `[directory]` | Export experiment JSON to directory |
### px experiment
Fetch a single experiment with run data.
```bash
px experiment <experiment-id> [options]
```
### px prompts
List all prompts.
```bash
px prompts [options]
```
### px prompt
Fetch a specific prompt.
```bash
px prompt <prompt-name> [options]
```
## Output Formats
- **`pretty`** (default): Human-readable tree view
- **`json`**: Formatted JSON with indentation
- **`raw`**: Compact JSON for piping to `jq` or other tools
Use `--format raw --no-progress` when piping output to other commands.
## Trace Structure
Traces contain spans with OpenInference semantic attributes:
```json
{
  "traceId": "abc123",
  "spans": [{
    "name": "chat_completion",
    "span_kind": "LLM",
    "status_code": "OK",
    "attributes": {
      "llm.model_name": "gpt-4",
      "llm.token_count.prompt": 512,
      "llm.token_count.completion": 256,
      "input.value": "What is the weather?",
      "output.value": "The weather is sunny..."
    }
  }],
  "duration": 1250,
  "status": "OK"
}
```
Key span kinds: `LLM`, `CHAIN`, `TOOL`, `RETRIEVER`, `EMBEDDING`, `AGENT`.
Key attributes for LLM spans:
- `llm.model_name`: Model used
- `llm.provider`: Provider name (e.g., "openai")
- `llm.token_count.prompt` / `llm.token_count.completion`: Token counts
- `llm.input_messages.*`: Input messages (indexed, with role and content)
- `llm.output_messages.*`: Output messages (indexed, with role and content)
- `input.value` / `output.value`: Raw input/output as text
- `exception.message`: Error message if failed
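
Because the attribute keys contain dots, `jq` needs bracket syntax (`.attributes["exception.message"]`) rather than dotted paths to read them. The sketch below uses that to pair each non-OK span with its error message; `px_span_errors` is a hypothetical name, and the input shape is assumed to match the example trace above.

```bash
# Hypothetical helper: for one trace, print "span name<TAB>error message"
# for every span whose status_code is not "OK".
px_span_errors() {
  px trace "$1" --format raw |
    jq -r '.spans[] | select(.status_code != "OK")
           | [.name, (.attributes["exception.message"] // "(no exception.message)")]
           | @tsv'
}
```

Note that the filter matches any status other than `OK`, so spans with an unset status are listed as well.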