temporal
Manage Temporal workflows: server lifecycle, worker processes, workflow execution, monitoring, and troubleshooting for Python SDK with temporal server start-dev.
Packaged view
This page reorganizes the original catalog entry to put fit, installability, and workflow context first. The original raw source lives below.
Install command
npx @skill-hub/cli install steveandroulakis-temporal-conductor-migration-agent-temporal
Repository
Skill path: .claude/skills/temporal
Open repository
Best for
Primary workflow: Ship Full Stack.
Technical facets: Full Stack, Backend.
Target audience: everyone.
License: Unknown.
Original source
Catalog source: SkillHub Club.
Repository owner: steveandroulakis.
This is a mirrored public skill entry. Review the repository before installing it into production workflows.
What it helps with
- Install temporal into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/steveandroulakis/temporal-conductor-migration-agent before adding temporal to shared team environments
- Use temporal for development workflows
Works across
Favorites: 0.
Sub-skills: 0.
Aggregator: No.
Original source / Raw SKILL.md
---
name: temporal
description: "Manage Temporal workflows: server lifecycle, worker processes, workflow execution, monitoring, and troubleshooting for Python SDK with temporal server start-dev."
version: 1.0.1
allowed-tools: "Bash(.claude/skills/temporal/scripts/*:*), Read"
---
# Temporal Skill
Manage Temporal workflows using local development server. This skill focuses on the **execution, validation, and troubleshooting** lifecycle of workflows.
| Property | Value |
|----------|-------|
| Target SDK | Python only |
| Server Type | `temporal server start-dev` (local development) |
| gRPC Port | 7233 |
---
## Critical Concepts
Understanding how Temporal components interact is essential for troubleshooting:
### How Workers, Workflows, and Tasks Relate
```
┌─────────────────────────────────────────────────────────────────┐
│ TEMPORAL SERVER │
│ Stores workflow history, manages task queues, coordinates work │
└─────────────────────────────────────────────────────────────────┘
│
Task Queue (named queue)
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ WORKER │
│ Long-running process that polls task queue for work │
│ Contains: Workflow definitions + Activity implementations │
│ │
│ When work arrives: │
│ - Workflow Task → Execute workflow code decisions │
│ - Activity Task → Execute activity code (business logic) │
└─────────────────────────────────────────────────────────────────┘
```
**Key Insight**: The workflow code runs inside the worker. If worker code is outdated or buggy, workflow execution fails.
### Workflow Task vs Activity Task
| Task Type | What It Does | Where It Runs | On Failure |
|-----------|--------------|---------------|------------|
| **Workflow Task** | Makes workflow decisions (what to do next) | Worker | **Stalls the workflow** until fixed |
| **Activity Task** | Executes business logic | Worker | Retries per retry policy |
**CRITICAL**: Workflow Task errors are fundamentally different from Activity Task errors:
- **Workflow Task Failure** → Workflow **stops making progress entirely**
- **Activity Task Failure** → Workflow **retries the activity** (workflow still progressing)
---
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `CLAUDE_TEMPORAL_LOG_DIR` | `/tmp/claude-temporal-logs` | Directory for worker log files |
| `CLAUDE_TEMPORAL_PID_DIR` | `/tmp/claude-temporal-pids` | Directory for worker PID files |
| `CLAUDE_TEMPORAL_PROJECT_DIR` | `$(pwd)` | Project root directory |
| `CLAUDE_TEMPORAL_PROJECT_NAME` | `$(basename "$PWD")` | Project name (used for log/PID naming) |
| `CLAUDE_TEMPORAL_NAMESPACE` | `default` | Temporal namespace |
| `TEMPORAL_ADDRESS` | `localhost:7233` | Temporal server gRPC address |
| `TEMPORAL_CLI` | `temporal` | Path to Temporal CLI binary |
| `TEMPORAL_WORKER_CMD` | `uv run worker` | Command to start worker |
---
## Quick Start
```bash
# 1. Start server
./scripts/ensure-server.sh
# 2. Start worker (ensures no old workers, starts fresh one)
./scripts/ensure-worker.sh
# 3. Execute workflow
uv run starter # Capture workflow_id from output
# 4. Wait for completion
./scripts/wait-for-workflow-status.sh --workflow-id <id> --status COMPLETED
# 5. Get result (IMPORTANT: verify result is correct, not an error message)
./scripts/get-workflow-result.sh --workflow-id <id>
# 6. CLEANUP: Kill workers when done
./scripts/kill-worker.sh
```
---
## Worker Management
### The Golden Rule
**Ensure no old workers are running.** Stale workers with outdated code cause:
- Non-determinism errors (history mismatch)
- Executing old buggy code
- Confusing behavior
**Best practice**: Run only ONE worker instance with the latest code.
### Starting Workers
```bash
# PREFERRED: Smart restart (kills old, starts fresh)
./scripts/ensure-worker.sh
```
This command:
1. Finds ALL existing workers for the project
2. Kills them
3. Starts a new worker with fresh code
4. Waits for worker to be ready
### Verifying Workers
```bash
# List all running workers
./scripts/list-workers.sh
# Check specific worker health
./scripts/monitor-worker-health.sh
# View worker logs
tail -f $CLAUDE_TEMPORAL_LOG_DIR/worker-$(basename "$(pwd)").log
```
**What to look for in logs**:
- `Worker started, listening on task queue: ...` → Worker is ready
- `Worker process died during startup` → Startup failure, check logs for error
### Cleanup (REQUIRED)
**Always kill workers when done.** Don't leave workers running.
```bash
# Kill current project's worker
./scripts/kill-worker.sh
# Kill ALL workers (full cleanup)
./scripts/kill-all-workers.sh
# Kill all workers AND server
./scripts/kill-all-workers.sh --include-server
```
---
## Workflow Execution
### Starting Workflows
```bash
# Execute workflow via starter script
uv run starter
```
**CRITICAL**: Capture the Workflow ID from output. You need it for all monitoring/troubleshooting.
### Checking Status
```bash
# Get workflow status
temporal workflow describe --workflow-id <id>
# Wait for specific status
./scripts/wait-for-workflow-status.sh \
--workflow-id <id> \
--status COMPLETED \
--timeout 60
```
### Workflow Status Reference
| Status | Meaning | Action |
|--------|---------|--------|
| `RUNNING` | Workflow in progress | Wait, or check if stalled |
| `COMPLETED` | Successfully finished | Get result, verify correctness |
| `FAILED` | Error during execution | Analyze error |
| `CANCELED` | Explicitly canceled | Review reason |
| `TERMINATED` | Force-stopped | Review reason |
| `TIMED_OUT` | Exceeded timeout | Increase timeout |
### Getting Results
```bash
./scripts/get-workflow-result.sh --workflow-id <id>
```
**IMPORTANT - False Positive Detection**:
Workflows may reach `COMPLETED` status but still return undesired results (e.g., error messages in the result payload).
```json
// This workflow COMPLETED but the result is an ERROR!
{"status": "error", "message": "Failed to process request"}
```
**Always verify the result content is correct**, not just that the status is COMPLETED.
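In automation, this check is easy to script: after a `COMPLETED` status, inspect the payload before declaring success. A minimal sketch; the `status`/`message` keys follow the example above and are assumptions about your workflow's result schema:

```python
import json

def looks_like_error(result_json: str) -> bool:
    """Return True when a COMPLETED workflow's result payload still
    signals an application-level error (a false positive)."""
    try:
        result = json.loads(result_json)
    except json.JSONDecodeError:
        # Non-JSON payloads need a domain-specific check instead.
        return False
    if not isinstance(result, dict):
        return False
    # Heuristic keys; adapt to your workflow's actual result shape.
    return result.get("status") == "error" or "error" in result
```

A wrapper around `get-workflow-result.sh --raw` could feed its output through this function and fail the run on a false positive.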
---
## Troubleshooting
### Step 1: Identify the Problem
```bash
# Check workflow status
temporal workflow describe --workflow-id <id>
# Check for stalled workflows (workflows stuck in RUNNING)
./scripts/find-stalled-workflows.sh
# Analyze specific workflow errors
./scripts/analyze-workflow-error.sh --workflow-id <id>
```
### Step 2: Diagnose Using This Decision Tree
```
Workflow not behaving as expected?
│
├── Status: RUNNING but no progress (STALLED)
│ │
│ ├── Is it an interactive workflow waiting for signal/update?
│ │ └── YES → Send the required interaction
│ │
│ └── NO → Run: ./scripts/find-stalled-workflows.sh
│ │
│ ├── WorkflowTaskFailed detected
│ │ │
│ │ ├── Non-determinism error (history mismatch)?
│ │ │ └── See: "Fixing Non-Determinism Errors" below
│ │ │
│ │ └── Other workflow task error (code bug, missing registration)?
│ │ └── See: "Fixing Other Workflow Task Errors" below
│ │
│ └── ActivityTaskFailed (excessive retries)
│ └── Activity is retrying. Fix activity code, restart worker.
│ Workflow will auto-retry with new code.
│
├── Status: COMPLETED but wrong result
│ └── Check result: ./scripts/get-workflow-result.sh --workflow-id <id>
│ Is result an error message? → Fix workflow/activity logic
│
├── Status: FAILED
│ └── Run: ./scripts/analyze-workflow-error.sh --workflow-id <id>
│ Fix code → ./scripts/ensure-worker.sh → Start NEW workflow
│
├── Status: TIMED_OUT
│ └── Increase timeouts → ./scripts/ensure-worker.sh → Start NEW workflow
│
└── Workflow never starts
└── Check: Worker running? Task queue matches? Workflow registered?
```
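The same triage logic can be encoded as a plain function for scripting. This is a simplification of the tree above (statuses and event names as used throughout this skill), not an exhaustive diagnosis:

```python
def triage(status: str, last_failed_event: str = "",
           non_deterministic: bool = False) -> str:
    """Map a workflow's status plus its most recent failure event to the
    recovery action recommended by the decision tree."""
    status = status.upper()
    if status == "RUNNING":
        if last_failed_event == "WorkflowTaskFailed":
            if non_deterministic:
                # History mismatch: the workflow cannot recover.
                return "terminate workflow, fix code, restart worker, start NEW workflow"
            return "fix code, restart worker; workflow auto-resumes"
        if last_failed_event == "ActivityTaskFailed":
            return "fix activity code, restart worker; activity auto-retries"
        return "check for pending signal/update, then run find-stalled-workflows.sh"
    if status == "COMPLETED":
        return "verify the result payload is not an error message"
    if status == "FAILED":
        return "analyze error, fix code, restart worker, start NEW workflow"
    if status == "TIMED_OUT":
        return "increase timeouts, restart worker, start NEW workflow"
    return "inspect with: temporal workflow describe"
```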
---
## Fixing Workflow Task Errors
**Workflow task errors STALL the workflow**: it stops making progress entirely until the issue is fixed.
### Fixing Non-Determinism Errors
Non-determinism occurs when workflow code changes while a workflow is running, causing history mismatch.
**Symptoms**:
- `WorkflowTaskFailed` events in history
- "Non-deterministic error" or "history mismatch" in logs
**Fix procedure**:
```bash
# 1. TERMINATE affected workflows (they cannot recover)
temporal workflow terminate --workflow-id <id>
# 2. Kill existing workers
./scripts/kill-worker.sh
# 3. Fix the workflow code if needed
# 4. Restart worker with corrected code
./scripts/ensure-worker.sh
# 5. Verify workflow logic is correct
# 6. Start NEW workflow execution
uv run starter
```
**Key point**: Non-determinism corrupts the workflow. You MUST terminate and start fresh.
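The mismatch can be illustrated without a server: on replay, the current code must re-issue the commands recorded in history, in order. This is a toy model of that invariant, not the SDK's actual replayer:

```python
def replay_matches(recorded: list[str], new_code_commands: list[str]) -> bool:
    """Toy model of Temporal replay: the new code must re-issue every
    recorded command in the same order for history to match. Commands
    beyond the recorded history are fine (new code appended steps)."""
    return recorded == new_code_commands[: len(recorded)]

# History recorded by v1 of a hypothetical workflow:
history = ["ScheduleActivity:validate", "ScheduleActivity:charge"]

# v2 removed the validate step, so replay diverges at the first command.
replay_matches(history, ["ScheduleActivity:charge", "ScheduleActivity:notify"])
# v2 that only appends a new step after the recorded ones replays cleanly.
replay_matches(history, history + ["ScheduleActivity:notify"])
```

This is why only in-flight workflows are corrupted: a fresh execution has no recorded history to contradict, so the fix is terminate, redeploy, and start new.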
### Fixing Other Workflow Task Errors
For workflow task errors that are NOT non-determinism (code bugs, missing registration, etc.):
**Symptoms**:
- `WorkflowTaskFailed` events
- Error is NOT "history mismatch" or "non-deterministic"
**Fix procedure**:
```bash
# 1. Identify the error
./scripts/analyze-workflow-error.sh --workflow-id <id>
# 2. Fix the root cause (code bug, worker config, etc.)
# 3. Kill and restart worker with fixed code
./scripts/ensure-worker.sh
# 4. NO NEED TO TERMINATE - the workflow will automatically resume
# The new worker picks up where the workflow left off and continues execution
```
**Key point**: Unlike non-determinism, the workflow can recover once you fix the code.
---
## Fixing Activity Task Errors
**Activity task errors cause retries**, not immediate workflow failure.
### Workflow Stalling Due to Retries
Workflows can appear stalled because an activity keeps failing and retrying.
**Diagnosis**:
```bash
# Check for excessive activity retries
./scripts/find-stalled-workflows.sh
# Look for ActivityTaskFailed count
# Check worker logs for retry messages
tail -100 $CLAUDE_TEMPORAL_LOG_DIR/worker-$(basename "$(pwd)").log
```
**Fix procedure**:
```bash
# 1. Fix the activity code
# 2. Restart worker with fixed code
./scripts/ensure-worker.sh
# 3. Worker auto-retries with new code
# No need to terminate or restart workflow
```
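Between retries, pacing follows the activity's retry policy. Assuming Temporal's documented defaults (1 s initial interval, backoff coefficient 2.0, 100 s maximum interval), the wait before each retry can be sketched as:

```python
def retry_delay(attempt: int, initial: float = 1.0,
                coefficient: float = 2.0, maximum: float = 100.0) -> float:
    """Delay in seconds before retry number `attempt` (1 = first retry),
    using exponential backoff capped at `maximum`.
    Defaults assume Temporal's documented default retry policy."""
    return min(initial * coefficient ** (attempt - 1), maximum)

# With the defaults, delays grow 1, 2, 4, 8, ... seconds, capped at 100.
```

This is why a workflow with a failing activity can look stalled for a long time: later retries may be minutes apart even though the workflow is technically still making (retry) progress.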
### Activity Failure (Retries Exhausted)
When all retries are exhausted, the activity fails permanently.
**Fix procedure**:
```bash
# 1. Analyze the error
./scripts/analyze-workflow-error.sh --workflow-id <id>
# 2. Fix activity code
# 3. Restart worker
./scripts/ensure-worker.sh
# 4. Start NEW workflow (old one has failed)
uv run starter
```
---
## Common Error Types Reference
| Error Type | Where to Find | What Happened | Recovery |
|------------|---------------|---------------|----------|
| **Non-determinism** | `WorkflowTaskFailed` in history | Code changed during execution | Terminate workflow → Fix → Restart worker → NEW workflow |
| **Workflow code bug** | `WorkflowTaskFailed` in history | Bug in workflow logic | Fix code → Restart worker → Workflow auto-resumes |
| **Missing workflow** | Worker logs | Workflow not registered | Add to worker.py → Restart worker |
| **Missing activity** | Worker logs | Activity not registered | Add to worker.py → Restart worker |
| **Activity bug** | `ActivityTaskFailed` in history | Bug in activity code | Fix code → Restart worker → Auto-retries |
| **Activity retries** | `ActivityTaskFailed` (count >2) | Repeated failures | Fix code → Restart worker → Auto-retries |
| **Sandbox violation** | Worker logs | Bad imports in workflow | Fix workflow.py imports → Restart worker |
| **Task queue mismatch** | Workflow never starts | Different queues in starter/worker | Align task queue names |
| **Timeout** | Status = TIMED_OUT | Operation too slow | Increase timeout config |
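The task-queue mismatch in the table is worth guarding against explicitly: Temporal compares queue names as exact strings, so case or whitespace differences route work to different queues. A trivial check (the constant names here are hypothetical, standing in for whatever your starter and worker pass):

```python
STARTER_TASK_QUEUE = "my-task-queue"  # hypothetical: value passed by the starter
WORKER_TASK_QUEUE = "my-task-queue"   # hypothetical: value passed to the worker

def queues_match(starter: str, worker: str) -> bool:
    """Task queue names are matched as exact strings, so even a trailing
    space or a case difference means the worker never sees the work."""
    return starter == worker
```

Keeping the queue name in one shared constant (or one environment variable read by both processes) avoids the problem entirely.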
---
## Interactive Workflows
Interactive workflows pause and wait for external input (signals or updates).
### Signals
```bash
# Send signal to workflow
temporal workflow signal \
--workflow-id <id> \
--name "signal_name" \
--input '{"key": "value"}'
# Or via interact script (if available)
uv run interact --workflow-id <id> --signal-name "signal_name" --data '{"key": "value"}'
```
### Updates
```bash
# Send update to workflow
temporal workflow update \
--workflow-id <id> \
--name "update_name" \
--input '{"approved": true}'
```
### Queries
```bash
# Query workflow state (read-only)
temporal workflow query \
--workflow-id <id> \
--name "get_status"
```
---
## Common Recipes
### Recipe 1: Clean Start (Fresh Environment)
```bash
./scripts/kill-all-workers.sh
./scripts/ensure-server.sh
./scripts/ensure-worker.sh
uv run starter
```
### Recipe 2: Debug Stalled Workflow
```bash
# 1. Find what's wrong
./scripts/find-stalled-workflows.sh
./scripts/analyze-workflow-error.sh --workflow-id <id>
# 2. Check worker logs
tail -100 $CLAUDE_TEMPORAL_LOG_DIR/worker-$(basename "$(pwd)").log
# 3. Fix based on error type (see decision tree above)
```
### Recipe 3: Clear Stalled Environment
```bash
./scripts/find-stalled-workflows.sh
./scripts/bulk-cancel-workflows.sh
./scripts/kill-worker.sh
./scripts/ensure-worker.sh
```
### Recipe 4: Test Interactive Workflow
```bash
./scripts/ensure-worker.sh
uv run starter # Get workflow_id
./scripts/wait-for-workflow-status.sh --workflow-id $workflow_id --status RUNNING
uv run interact --workflow-id $workflow_id --signal-name "approval" --data '{"approved": true}'
./scripts/wait-for-workflow-status.sh --workflow-id $workflow_id --status COMPLETED
./scripts/get-workflow-result.sh --workflow-id $workflow_id
./scripts/kill-worker.sh # CLEANUP
```
### Recipe 5: Check Recent Workflow Results
```bash
# List recent workflows
./scripts/list-recent-workflows.sh --minutes 30
# Check results (verify they're correct, not error messages!)
./scripts/get-workflow-result.sh --workflow-id <id1>
./scripts/get-workflow-result.sh --workflow-id <id2>
```
---
## Tool Reference
### Lifecycle Scripts
| Tool | Description | Key Options |
|------|-------------|-------------|
| `ensure-server.sh` | Start dev server if not running | - |
| `ensure-worker.sh` | Kill old workers, start fresh one | Uses `$TEMPORAL_WORKER_CMD` |
| `kill-worker.sh` | Kill current project's worker | - |
| `kill-all-workers.sh` | Kill all workers | `--include-server` |
| `list-workers.sh` | List running workers | - |
### Monitoring Scripts
| Tool | Description | Key Options |
|------|-------------|-------------|
| `list-recent-workflows.sh` | Show recent executions | `--minutes N` (default: 5) |
| `find-stalled-workflows.sh` | Detect stalled workflows | `--query "..."` |
| `monitor-worker-health.sh` | Check worker status | - |
| `wait-for-workflow-status.sh` | Block until status | `--workflow-id`, `--status`, `--timeout` |
### Debugging Scripts
| Tool | Description | Key Options |
|------|-------------|-------------|
| `analyze-workflow-error.sh` | Extract errors from history | `--workflow-id`, `--run-id` |
| `get-workflow-result.sh` | Get workflow output | `--workflow-id`, `--raw` |
| `bulk-cancel-workflows.sh` | Mass cancellation | `--pattern "..."` |
---
## Log Files
| Log | Location | Content |
|-----|----------|---------|
| Worker logs | `$CLAUDE_TEMPORAL_LOG_DIR/worker-{project}.log` | Worker output, activity logs, errors |
**Useful searches**:
```bash
# Find errors
grep -i "error" $CLAUDE_TEMPORAL_LOG_DIR/worker-*.log
# Check worker startup
grep -i "started" $CLAUDE_TEMPORAL_LOG_DIR/worker-*.log
# Find activity issues
grep -i "activity" $CLAUDE_TEMPORAL_LOG_DIR/worker-*.log
```
---
## Referenced Files
> The following files are referenced in this skill and included for context.
### scripts/ensure-server.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
# Environment variables with defaults
CLAUDE_TEMPORAL_PID_DIR="${CLAUDE_TEMPORAL_PID_DIR:-${TMPDIR:-/tmp}/claude-temporal-pids}"
CLAUDE_TEMPORAL_LOG_DIR="${CLAUDE_TEMPORAL_LOG_DIR:-${TMPDIR:-/tmp}/claude-temporal-logs}"
TEMPORAL_CLI="${TEMPORAL_CLI:-temporal}"
TEMPORAL_ADDRESS="${TEMPORAL_ADDRESS:-localhost:7233}"
# Create directories if they don't exist
mkdir -p "$CLAUDE_TEMPORAL_PID_DIR"
mkdir -p "$CLAUDE_TEMPORAL_LOG_DIR"
PID_FILE="$CLAUDE_TEMPORAL_PID_DIR/server.pid"
LOG_FILE="$CLAUDE_TEMPORAL_LOG_DIR/server.log"
# Check if temporal CLI is installed
if ! command -v "$TEMPORAL_CLI" >/dev/null 2>&1; then
echo "❌ Temporal CLI not found: $TEMPORAL_CLI" >&2
echo "Install temporal CLI:" >&2
echo " macOS: brew install temporal" >&2
echo " Linux: https://github.com/temporalio/cli/releases" >&2
exit 1
fi
# Function to check if server is responding
check_server_connectivity() {
# Try to list namespaces as a connectivity test
if "$TEMPORAL_CLI" operator namespace list --address "$TEMPORAL_ADDRESS" >/dev/null 2>&1; then
return 0
fi
return 1
}
# Check if server is already running
if check_server_connectivity; then
echo "✓ Temporal server already running at $TEMPORAL_ADDRESS"
exit 0
fi
# Check if we have a PID file from previous run
if [[ -f "$PID_FILE" ]]; then
OLD_PID=$(cat "$PID_FILE")
if kill -0 "$OLD_PID" 2>/dev/null; then
# Process exists but not responding - might be starting up
echo "⏳ Server process exists (PID: $OLD_PID), checking connectivity..."
sleep 2
if check_server_connectivity; then
echo "✓ Temporal server ready at $TEMPORAL_ADDRESS"
exit 0
fi
echo "⚠️ Server process exists but not responding, killing and restarting..."
kill -9 "$OLD_PID" 2>/dev/null || true
fi
rm -f "$PID_FILE"
fi
# Start server in background
echo "🚀 Starting Temporal dev server..."
"$TEMPORAL_CLI" server start-dev > "$LOG_FILE" 2>&1 &
SERVER_PID=$!
echo "$SERVER_PID" > "$PID_FILE"
echo "⏳ Waiting for server to be ready..."
# Wait up to 30 seconds for server to become ready
TIMEOUT=30
ELAPSED=0
INTERVAL=1
while (( ELAPSED < TIMEOUT )); do
if check_server_connectivity; then
echo "✓ Temporal server ready at $TEMPORAL_ADDRESS (PID: $SERVER_PID)"
echo ""
echo "Web UI: http://localhost:8233"
echo "gRPC: $TEMPORAL_ADDRESS"
echo ""
echo "Server logs: $LOG_FILE"
echo "Server PID file: $PID_FILE"
exit 0
fi
# Check if process died
if ! kill -0 "$SERVER_PID" 2>/dev/null; then
echo "❌ Server process died during startup" >&2
echo "Check logs: $LOG_FILE" >&2
rm -f "$PID_FILE"
exit 2
fi
sleep "$INTERVAL"
ELAPSED=$((ELAPSED + INTERVAL))
done
echo "❌ Server startup timeout after ${TIMEOUT}s" >&2
echo "Server might still be starting. Check logs: $LOG_FILE" >&2
echo "Server PID: $SERVER_PID" >&2
exit 2
```
### scripts/ensure-worker.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
# Get the directory where this script is located
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# Environment variables with defaults
CLAUDE_TEMPORAL_PID_DIR="${CLAUDE_TEMPORAL_PID_DIR:-${TMPDIR:-/tmp}/claude-temporal-pids}"
CLAUDE_TEMPORAL_LOG_DIR="${CLAUDE_TEMPORAL_LOG_DIR:-${TMPDIR:-/tmp}/claude-temporal-logs}"
CLAUDE_TEMPORAL_PROJECT_DIR="${CLAUDE_TEMPORAL_PROJECT_DIR:-$(pwd)}"
CLAUDE_TEMPORAL_PROJECT_NAME="${CLAUDE_TEMPORAL_PROJECT_NAME:-$(basename "$CLAUDE_TEMPORAL_PROJECT_DIR")}"
TEMPORAL_WORKER_CMD="${TEMPORAL_WORKER_CMD:-uv run worker}"
# Create directories if they don't exist
mkdir -p "$CLAUDE_TEMPORAL_PID_DIR"
mkdir -p "$CLAUDE_TEMPORAL_LOG_DIR"
PID_FILE="$CLAUDE_TEMPORAL_PID_DIR/worker-$CLAUDE_TEMPORAL_PROJECT_NAME.pid"
LOG_FILE="$CLAUDE_TEMPORAL_LOG_DIR/worker-$CLAUDE_TEMPORAL_PROJECT_NAME.log"
# Always kill any existing workers (both tracked and orphaned)
# This ensures we don't accumulate orphaned processes
echo "🔍 Checking for existing workers..."
# Use the helper function to find all workers
source "$SCRIPT_DIR/find-project-workers.sh"
existing_workers=$(find_project_workers "$CLAUDE_TEMPORAL_PROJECT_DIR" 2>/dev/null || true)
if [[ -n "$existing_workers" ]]; then
worker_count=$(echo "$existing_workers" | wc -l | tr -d ' ')
echo "Found $worker_count existing worker(s), stopping them..."
if "$SCRIPT_DIR/kill-worker.sh" 2>&1; then
echo "✓ Existing workers stopped"
else
# kill-worker.sh will have printed error messages
echo "⚠️ Some workers may not have been stopped, continuing anyway..."
fi
elif [[ -f "$PID_FILE" ]]; then
# PID file exists but no workers found - clean up stale PID file
echo "Removing stale PID file..."
rm -f "$PID_FILE"
fi
# Clear old log file
> "$LOG_FILE"
# Start worker in background
echo "🚀 Starting worker for project: $CLAUDE_TEMPORAL_PROJECT_NAME"
echo "Command: $TEMPORAL_WORKER_CMD"
# Start worker, redirect output to log file
eval "$TEMPORAL_WORKER_CMD" > "$LOG_FILE" 2>&1 &
WORKER_PID=$!
# Save PID
echo "$WORKER_PID" > "$PID_FILE"
echo "Worker PID: $WORKER_PID"
echo "Log file: $LOG_FILE"
# Wait for worker to be ready (simple approach: wait and check if still running)
echo "⏳ Waiting for worker to be ready..."
# Wait 10 seconds for worker to initialize
sleep 10
# Check if process is still running
if ! kill -0 "$WORKER_PID" 2>/dev/null; then
echo "❌ Worker process died during startup" >&2
echo "Last 20 lines of log:" >&2
tail -n 20 "$LOG_FILE" >&2 || true
rm -f "$PID_FILE"
exit 1
fi
# Check if log file has content (worker is producing output)
if [[ -f "$LOG_FILE" ]] && [[ -s "$LOG_FILE" ]]; then
echo "✓ Worker ready (PID: $WORKER_PID)"
echo ""
echo "To monitor worker logs:"
echo " tail -f $LOG_FILE"
echo ""
echo "To check worker health:"
echo " $SCRIPT_DIR/monitor-worker-health.sh"
exit 0
else
echo "⚠️ Worker is running but no logs detected" >&2
echo "Check logs: $LOG_FILE" >&2
exit 2
fi
```
### scripts/wait-for-workflow-status.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
# Environment variables with defaults
TEMPORAL_CLI="${TEMPORAL_CLI:-temporal}"
TEMPORAL_ADDRESS="${TEMPORAL_ADDRESS:-localhost:7233}"
CLAUDE_TEMPORAL_NAMESPACE="${CLAUDE_TEMPORAL_NAMESPACE:-default}"
usage() {
cat <<'USAGE'
Usage: wait-for-workflow-status.sh --workflow-id id --status status [options]
Poll workflow for specific status.
Options:
--workflow-id workflow ID to monitor, required
--status status to wait for, required
(RUNNING, COMPLETED, FAILED, CANCELED, TERMINATED, TIMED_OUT)
--run-id specific workflow run ID (optional)
-T, --timeout seconds to wait (integer, default: 300)
-i, --interval poll interval in seconds (default: 2)
-h, --help show this help
USAGE
}
workflow_id=""
run_id=""
target_status=""
timeout=300
interval=2
while [[ $# -gt 0 ]]; do
case "$1" in
--workflow-id) workflow_id="${2-}"; shift 2 ;;
--run-id) run_id="${2-}"; shift 2 ;;
--status) target_status="${2-}"; shift 2 ;;
-T|--timeout) timeout="${2-}"; shift 2 ;;
-i|--interval) interval="${2-}"; shift 2 ;;
-h|--help) usage; exit 0 ;;
*) echo "Unknown option: $1" >&2; usage; exit 1 ;;
esac
done
if [[ -z "$workflow_id" || -z "$target_status" ]]; then
echo "workflow-id and status are required" >&2
usage
exit 1
fi
if ! [[ "$timeout" =~ ^[0-9]+$ ]]; then
echo "timeout must be an integer number of seconds" >&2
exit 1
fi
if ! command -v "$TEMPORAL_CLI" >/dev/null 2>&1; then
echo "Temporal CLI not found: $TEMPORAL_CLI" >&2
exit 1
fi
# Build temporal command
TEMPORAL_CMD=("$TEMPORAL_CLI" "workflow" "describe" "--workflow-id" "$workflow_id" "--address" "$TEMPORAL_ADDRESS" "--namespace" "$CLAUDE_TEMPORAL_NAMESPACE")
if [[ -n "$run_id" ]]; then
TEMPORAL_CMD+=("--run-id" "$run_id")
fi
# Normalize target status to uppercase
target_status=$(echo "$target_status" | tr '[:lower:]' '[:upper:]')
# End time in epoch seconds
start_epoch=$(date +%s)
deadline=$((start_epoch + timeout))
echo "Polling workflow: $workflow_id"
echo "Target status: $target_status"
echo "Timeout: ${timeout}s"
echo ""
while true; do
# Query workflow status
if output=$("${TEMPORAL_CMD[@]}" 2>&1); then
# Extract status from output
# The output includes a line like: " Status COMPLETED"
if current_status=$(echo "$output" | grep -E "^\s*Status\s" | awk '{print $2}' | tr -d ' '); then
echo "Current status: $current_status ($(date '+%H:%M:%S'))"
if [[ "$current_status" == "$target_status" ]]; then
echo ""
echo "✓ Workflow reached status: $target_status"
exit 0
fi
# Check if workflow reached a terminal state different from target
case "$current_status" in
COMPLETED|FAILED|CANCELED|TERMINATED|TIMED_OUT)
if [[ "$current_status" != "$target_status" ]]; then
echo ""
echo "⚠️ Workflow reached terminal status: $current_status (expected: $target_status)"
exit 1
fi
;;
esac
else
echo "⚠️ Could not parse workflow status from output" >&2
fi
else
echo "⚠️ Failed to query workflow (it may not exist yet)" >&2
fi
now=$(date +%s)
if (( now >= deadline )); then
echo ""
echo "❌ Timeout after ${timeout}s waiting for status: $target_status" >&2
exit 1
fi
sleep "$interval"
done
```
### scripts/get-workflow-result.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
# Environment variables with defaults
TEMPORAL_CLI="${TEMPORAL_CLI:-temporal}"
TEMPORAL_ADDRESS="${TEMPORAL_ADDRESS:-localhost:7233}"
CLAUDE_TEMPORAL_NAMESPACE="${CLAUDE_TEMPORAL_NAMESPACE:-default}"
usage() {
cat <<'USAGE'
Usage: get-workflow-result.sh --workflow-id <workflow-id> [options]
Get the result/output from a completed workflow execution.
Options:
--workflow-id <id> Workflow ID to query (required)
--run-id <id> Specific run ID (optional)
--raw Output raw JSON result only
-h, --help Show this help
Examples:
# Get workflow result with formatted output
./tools/get-workflow-result.sh --workflow-id my-workflow-123
# Get raw JSON result only
./tools/get-workflow-result.sh --workflow-id my-workflow-123 --raw
# Get result for specific run
./tools/get-workflow-result.sh --workflow-id my-workflow-123 --run-id abc-def-ghi
Output:
- Workflow status (COMPLETED, FAILED, etc.)
- Workflow result/output (if completed successfully)
- Failure message (if failed)
- Termination reason (if terminated)
USAGE
}
workflow_id=""
run_id=""
raw_mode=false
while [[ $# -gt 0 ]]; do
case "$1" in
--workflow-id) workflow_id="${2-}"; shift 2 ;;
--run-id) run_id="${2-}"; shift 2 ;;
--raw) raw_mode=true; shift ;;
-h|--help) usage; exit 0 ;;
*) echo "Unknown option: $1" >&2; usage; exit 1 ;;
esac
done
if [[ -z "$workflow_id" ]]; then
echo "Error: --workflow-id is required" >&2
usage
exit 1
fi
if ! command -v "$TEMPORAL_CLI" >/dev/null 2>&1; then
echo "Temporal CLI not found: $TEMPORAL_CLI" >&2
exit 1
fi
# Build describe command
DESCRIBE_CMD=("$TEMPORAL_CLI" "workflow" "describe" "--workflow-id" "$workflow_id" "--address" "$TEMPORAL_ADDRESS" "--namespace" "$CLAUDE_TEMPORAL_NAMESPACE")
if [[ -n "$run_id" ]]; then
DESCRIBE_CMD+=("--run-id" "$run_id")
fi
# Get workflow details
if ! describe_output=$("${DESCRIBE_CMD[@]}" 2>&1); then
echo "Failed to describe workflow: $workflow_id" >&2
echo "$describe_output" >&2
exit 1
fi
# Extract workflow status
status=$(echo "$describe_output" | grep -i "Status:" | head -n1 | awk '{print $2}' || echo "UNKNOWN")
if [[ "$raw_mode" == true ]]; then
# Raw mode: just output the result payload
# Use 'temporal workflow show' to get execution history with result
if ! show_output=$("$TEMPORAL_CLI" workflow show --workflow-id "$workflow_id" --address "$TEMPORAL_ADDRESS" --namespace "$CLAUDE_TEMPORAL_NAMESPACE" --fields long 2>&1); then
echo "Failed to get workflow result" >&2
exit 1
fi
# Extract result from WorkflowExecutionCompleted event
echo "$show_output" | grep -A 10 "WorkflowExecutionCompleted" | grep -E "result|Result" || echo "{}"
exit 0
fi
# Formatted output
echo "════════════════════════════════════════════════════════════"
echo "Workflow: $workflow_id"
echo "Status: $status"
echo "════════════════════════════════════════════════════════════"
echo ""
case "$status" in
COMPLETED)
echo "✅ Workflow completed successfully"
echo ""
echo "Result:"
echo "────────────────────────────────────────────────────────────"
# Get workflow result using 'show' command
if show_output=$("$TEMPORAL_CLI" workflow show --workflow-id "$workflow_id" --address "$TEMPORAL_ADDRESS" --namespace "$CLAUDE_TEMPORAL_NAMESPACE" --fields long 2>/dev/null); then
# Extract result from WorkflowExecutionCompleted event
result=$(echo "$show_output" | grep -A 20 "WorkflowExecutionCompleted" | grep -E "result|Result" || echo "")
if [[ -n "$result" ]]; then
echo "$result"
else
echo "(No result payload - workflow may return None/void)"
fi
else
echo "(Unable to extract result)"
fi
;;
FAILED)
echo "❌ Workflow failed"
echo ""
echo "Failure details:"
echo "────────────────────────────────────────────────────────────"
# Extract failure message
failure=$(echo "$describe_output" | grep -A 5 "Failure:" || echo "")
if [[ -n "$failure" ]]; then
echo "$failure"
else
echo "(No failure details available)"
fi
echo ""
echo "To analyze error:"
echo " ./tools/analyze-workflow-error.sh --workflow-id $workflow_id"
;;
CANCELED)
echo "🚫 Workflow was canceled"
echo ""
# Try to extract cancellation reason
cancel_info=$(echo "$describe_output" | grep -i "cancel" || echo "")
if [[ -n "$cancel_info" ]]; then
echo "Cancellation info:"
echo "$cancel_info"
fi
;;
TERMINATED)
echo "⛔ Workflow was terminated"
echo ""
# Extract termination reason
term_reason=$(echo "$describe_output" | grep -i "reason:" | head -n1 || echo "")
if [[ -n "$term_reason" ]]; then
echo "Termination reason:"
echo "$term_reason"
fi
;;
TIMED_OUT)
echo "⏱️ Workflow timed out"
echo ""
timeout_info=$(echo "$describe_output" | grep -i "timeout" || echo "")
if [[ -n "$timeout_info" ]]; then
echo "Timeout info:"
echo "$timeout_info"
fi
;;
RUNNING)
echo "🏃 Workflow is still running"
echo ""
echo "Cannot get result for running workflow."
echo ""
echo "To wait for completion:"
echo " ./tools/wait-for-workflow-status.sh --workflow-id $workflow_id --status COMPLETED"
exit 1
;;
*)
echo "Status: $status"
echo ""
echo "Full workflow details:"
echo "$describe_output"
;;
esac
echo ""
echo "════════════════════════════════════════════════════════════"
echo ""
echo "To view full workflow history:"
echo " temporal workflow show --workflow-id $workflow_id"
echo ""
echo "To view in Web UI:"
echo " http://localhost:8233/namespaces/$CLAUDE_TEMPORAL_NAMESPACE/workflows/$workflow_id"
```
### scripts/kill-worker.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
# Get the directory where this script is located
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# Source the helper function to find project workers
source "$SCRIPT_DIR/find-project-workers.sh"
# Environment variables with defaults
CLAUDE_TEMPORAL_PID_DIR="${CLAUDE_TEMPORAL_PID_DIR:-${TMPDIR:-/tmp}/claude-temporal-pids}"
CLAUDE_TEMPORAL_PROJECT_NAME="${CLAUDE_TEMPORAL_PROJECT_NAME:-$(basename "$(pwd)")}"
CLAUDE_TEMPORAL_PROJECT_DIR="${CLAUDE_TEMPORAL_PROJECT_DIR:-$(pwd)}"
PID_FILE="$CLAUDE_TEMPORAL_PID_DIR/worker-$CLAUDE_TEMPORAL_PROJECT_NAME.pid"
# Graceful shutdown timeout (seconds)
GRACEFUL_TIMEOUT=5
# Find ALL workers for this project (both tracked and orphaned)
echo "🔍 Finding all workers for project: $CLAUDE_TEMPORAL_PROJECT_NAME"
# Collect all PIDs
worker_pids=()
# Add PID from file if it exists
if [[ -f "$PID_FILE" ]]; then
TRACKED_PID=$(cat "$PID_FILE")
if kill -0 "$TRACKED_PID" 2>/dev/null; then
worker_pids+=("$TRACKED_PID")
fi
fi
# Find all workers for this project using the helper function
while IFS= read -r pid; do
[[ -n "$pid" ]] && worker_pids+=("$pid")
done < <(find_project_workers "$CLAUDE_TEMPORAL_PROJECT_DIR" 2>/dev/null || true)
# Remove duplicates
worker_pids=($(printf "%s\n" "${worker_pids[@]}" | sort -u))
if [[ ${#worker_pids[@]} -eq 0 ]]; then
echo "No workers running for project: $CLAUDE_TEMPORAL_PROJECT_NAME"
rm -f "$PID_FILE"
exit 1
fi
echo "Found ${#worker_pids[@]} worker process(es): ${worker_pids[*]}"
# Attempt graceful shutdown of all workers
echo "⏳ Attempting graceful shutdown..."
for pid in "${worker_pids[@]}"; do
kill -TERM "$pid" 2>/dev/null || true
done
# Wait for graceful shutdown
ELAPSED=0
while (( ELAPSED < GRACEFUL_TIMEOUT )); do
all_dead=true
for pid in "${worker_pids[@]}"; do
if kill -0 "$pid" 2>/dev/null; then
all_dead=false
break
fi
done
if [[ "$all_dead" == true ]]; then
echo "✓ All workers stopped gracefully"
rm -f "$PID_FILE"
exit 0
fi
sleep 1
ELAPSED=$((ELAPSED + 1))
done
# Force kill any still running
still_running=()
for pid in "${worker_pids[@]}"; do
if kill -0 "$pid" 2>/dev/null; then
still_running+=("$pid")
fi
done
if [[ ${#still_running[@]} -gt 0 ]]; then
echo "⚠️ ${#still_running[@]} process(es) still running after ${GRACEFUL_TIMEOUT}s, forcing kill..."
for pid in "${still_running[@]}"; do
kill -9 "$pid" 2>/dev/null || true
done
sleep 1
# Verify all are dead
failed_pids=()
for pid in "${still_running[@]}"; do
if kill -0 "$pid" 2>/dev/null; then
failed_pids+=("$pid")
fi
done
if [[ ${#failed_pids[@]} -gt 0 ]]; then
echo "❌ Failed to kill worker process(es): ${failed_pids[*]}" >&2
exit 1
fi
fi
echo "✓ All workers killed (${#worker_pids[@]} process(es))"
rm -f "$PID_FILE"
exit 0
```
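The TERM-then-KILL escalation used by kill-worker.sh can be exercised as a standalone helper against a throwaway process. `stop_pid` is an illustrative name, not part of the skill:

```bash
# Send SIGTERM, wait up to $2 seconds for a graceful exit, then SIGKILL.
stop_pid() {
    local pid=$1 timeout=${2:-5} elapsed=0
    kill -TERM "$pid" 2>/dev/null || true
    while (( elapsed < timeout )); do
        kill -0 "$pid" 2>/dev/null || return 0   # gone: graceful stop worked
        sleep 1
        elapsed=$((elapsed + 1))
    done
    kill -9 "$pid" 2>/dev/null || true           # escalate to SIGKILL
    sleep 1
    ! kill -0 "$pid" 2>/dev/null                 # succeed only if it is dead
}

sleep 60 &                 # disposable background process
victim=$!
stop_pid "$victim" 3 && echo "stopped $victim"
```

The graceful window matters for Temporal workers: SIGTERM lets the Python SDK finish in-flight activity tasks, while SIGKILL abandons them to be retried.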
### scripts/list-workers.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
# Get the directory where this script is located
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# Source the helper function to find project workers
source "$SCRIPT_DIR/find-project-workers.sh"
# Environment variables with defaults
CLAUDE_TEMPORAL_PID_DIR="${CLAUDE_TEMPORAL_PID_DIR:-${TMPDIR:-/tmp}/claude-temporal-pids}"
# Check if PID directory exists
if [[ ! -d "$CLAUDE_TEMPORAL_PID_DIR" ]]; then
echo "No PID directory found: $CLAUDE_TEMPORAL_PID_DIR"
exit 0
fi
# Function to get process uptime
get_uptime() {
local pid=$1
local elapsed
if [[ "$(uname)" == "Darwin" ]]; then
# macOS: convert lstart to epoch seconds, then diff against now
local start_time
start_time=$(ps -o lstart= -p "$pid" 2>/dev/null | xargs -I{} date -j -f "%c" "{}" "+%s" 2>/dev/null || echo "0")
if [[ "$start_time" == "0" ]]; then
echo "-"
return
fi
elapsed=$(( $(date +%s) - start_time ))
else
# Linux: etimes is already elapsed seconds, no conversion needed
elapsed=$(ps -o etimes= -p "$pid" 2>/dev/null | tr -d ' ' || echo "0")
if [[ -z "$elapsed" || "$elapsed" == "0" ]]; then
echo "-"
return
fi
fi
local hours=$((elapsed / 3600))
local minutes=$(((elapsed % 3600) / 60))
local seconds=$((elapsed % 60))
if (( hours > 0 )); then
printf "%dh %dm" "$hours" "$minutes"
elif (( minutes > 0 )); then
printf "%dm %ds" "$minutes" "$seconds"
else
printf "%ds" "$seconds"
fi
}
# Function to get process command
get_command() {
local pid=$1
ps -o command= -p "$pid" 2>/dev/null | cut -c1-50 || echo "-"
}
# Print header
printf "%-20s %-8s %-10s %-10s %-10s %s\n" "PROJECT" "PID" "STATUS" "TRACKED" "UPTIME" "COMMAND"
printf "%-20s %-8s %-10s %-10s %-10s %s\n" "--------------------" "--------" "----------" "----------" "----------" "-----"
# Find all PID files
found_any=false
# Track all PIDs we've seen to detect orphans later
declare -A tracked_pids
# List server if exists
SERVER_PID_FILE="$CLAUDE_TEMPORAL_PID_DIR/server.pid"
if [[ -f "$SERVER_PID_FILE" ]]; then
found_any=true
SERVER_PID=$(cat "$SERVER_PID_FILE")
tracked_pids[$SERVER_PID]=1
if kill -0 "$SERVER_PID" 2>/dev/null; then
uptime=$(get_uptime "$SERVER_PID")
command=$(get_command "$SERVER_PID")
printf "%-20s %-8s %-10s %-10s %-10s %s\n" "server" "$SERVER_PID" "running" "yes" "$uptime" "$command"
else
printf "%-20s %-8s %-10s %-10s %-10s %s\n" "server" "$SERVER_PID" "dead" "yes" "-" "-"
fi
fi
# List all worker PID files
shopt -s nullglob
PID_FILES=("$CLAUDE_TEMPORAL_PID_DIR"/worker-*.pid)
shopt -u nullglob
# Store project directories for orphan detection
declare -A project_dirs
for pid_file in "${PID_FILES[@]}"; do
found_any=true
# Extract project name from filename
filename=$(basename "$pid_file")
project="${filename#worker-}"
project="${project%.pid}"
# Read PID
worker_pid=$(cat "$pid_file")
tracked_pids[$worker_pid]=1
# Check if process is running
if kill -0 "$worker_pid" 2>/dev/null; then
uptime=$(get_uptime "$worker_pid")
command=$(get_command "$worker_pid")
printf "%-20s %-8s %-10s %-10s %-10s %s\n" "$project" "$worker_pid" "running" "yes" "$uptime" "$command"
# Try to determine project directory from the command
# Look for project directory in the command path
if [[ "$command" =~ ([^[:space:]]+/${project})[/[:space:]] ]]; then
project_dir="${BASH_REMATCH[1]}"
project_dirs[$project]="$project_dir"
fi
else
printf "%-20s %-8s %-10s %-10s %-10s %s\n" "$project" "$worker_pid" "dead" "yes" "-" "-"
fi
done
# Now check for orphaned workers for each project we know about
for project in "${!project_dirs[@]}"; do
project_dir="${project_dirs[$project]}"
# Find all workers for this project
while IFS= read -r pid; do
[[ -z "$pid" ]] && continue
# Skip if we already tracked this PID
if [[ -n "${tracked_pids[$pid]:-}" ]]; then
continue
fi
# This is an orphaned worker
found_any=true
if kill -0 "$pid" 2>/dev/null; then
uptime=$(get_uptime "$pid")
command=$(get_command "$pid")
printf "%-20s %-8s %-10s %-10s %-10s %s\n" "$project" "$pid" "running" "ORPHAN" "$uptime" "$command"
tracked_pids[$pid]=1
fi
done < <(find_project_workers "$project_dir" 2>/dev/null || true)
done
# Also scan for workers that have no PID file at all (completely orphaned)
# Find all Python worker processes and group by project
if [[ "$(uname)" == "Darwin" ]]; then
# macOS
while IFS= read -r line; do
[[ -z "$line" ]] && continue
pid=$(echo "$line" | awk '{print $1}')
command=$(echo "$line" | cut -d' ' -f2-)
# Skip if already tracked
if [[ -n "${tracked_pids[$pid]:-}" ]]; then
continue
fi
# Extract project name from path
if [[ "$command" =~ /([^/]+)/\.venv/bin/ ]]; then
project="${BASH_REMATCH[1]}"
found_any=true
uptime=$(get_uptime "$pid")
cmd_display=$(echo "$command" | cut -c1-50)
printf "%-20s %-8s %-10s %-10s %-10s %s\n" "$project" "$pid" "running" "ORPHAN" "$uptime" "$cmd_display"
tracked_pids[$pid]=1
fi
done < <(ps ax -o pid,command | grep -E "\.venv/bin/(python[0-9.]*|worker)" | grep -v grep)
# Also check for "uv run worker" processes
while IFS= read -r line; do
[[ -z "$line" ]] && continue
pid=$(echo "$line" | awk '{print $1}')
# Skip if already tracked
if [[ -n "${tracked_pids[$pid]:-}" ]]; then
continue
fi
# Get the working directory for this process
cwd=$(lsof -a -p "$pid" -d cwd -Fn 2>/dev/null | grep ^n | cut -c2- || echo "")
if [[ -n "$cwd" ]]; then
project=$(basename "$cwd")
found_any=true
uptime=$(get_uptime "$pid")
printf "%-20s %-8s %-10s %-10s %-10s %s\n" "$project" "$pid" "running" "ORPHAN" "$uptime" "uv run worker"
tracked_pids[$pid]=1
fi
done < <(ps ax -o pid,command | grep "uv run worker" | grep -v grep)
else
# Linux - same .venv scan (the "uv run worker" check above is macOS-only here)
while IFS= read -r line; do
[[ -z "$line" ]] && continue
pid=$(echo "$line" | awk '{print $1}')
command=$(echo "$line" | cut -d' ' -f2-)
# Skip if already tracked
if [[ -n "${tracked_pids[$pid]:-}" ]]; then
continue
fi
# Extract project name from path
if [[ "$command" =~ /([^/]+)/\.venv/bin/ ]]; then
project="${BASH_REMATCH[1]}"
found_any=true
uptime=$(get_uptime "$pid")
cmd_display=$(echo "$command" | cut -c1-50)
printf "%-20s %-8s %-10s %-10s %-10s %s\n" "$project" "$pid" "running" "ORPHAN" "$uptime" "$cmd_display"
tracked_pids[$pid]=1
fi
done < <(ps ax -o pid,cmd | grep -E "\.venv/bin/(python[0-9.]*|worker)" | grep -v grep)
fi
if [[ "$found_any" == false ]]; then
echo "No workers or server found"
fi
```
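The duration formatting inside `get_uptime` is pure integer arithmetic and can be checked in isolation. The function name `format_elapsed` is illustrative:

```bash
# Format a count of seconds as "Xh Ym", "Xm Ys", or "Xs".
format_elapsed() {
    local elapsed=$1
    local hours=$((elapsed / 3600))
    local minutes=$(((elapsed % 3600) / 60))
    local seconds=$((elapsed % 60))
    if (( hours > 0 )); then
        printf "%dh %dm" "$hours" "$minutes"
    elif (( minutes > 0 )); then
        printf "%dm %ds" "$minutes" "$seconds"
    else
        printf "%ds" "$seconds"
    fi
}

format_elapsed 3725; echo    # → 1h 2m
format_elapsed 42; echo      # → 42s
```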
### scripts/monitor-worker-health.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
# Environment variables with defaults
CLAUDE_TEMPORAL_PID_DIR="${CLAUDE_TEMPORAL_PID_DIR:-${TMPDIR:-/tmp}/claude-temporal-pids}"
CLAUDE_TEMPORAL_LOG_DIR="${CLAUDE_TEMPORAL_LOG_DIR:-${TMPDIR:-/tmp}/claude-temporal-logs}"
CLAUDE_TEMPORAL_PROJECT_NAME="${CLAUDE_TEMPORAL_PROJECT_NAME:-$(basename "$(pwd)")}"
PID_FILE="$CLAUDE_TEMPORAL_PID_DIR/worker-$CLAUDE_TEMPORAL_PROJECT_NAME.pid"
LOG_FILE="$CLAUDE_TEMPORAL_LOG_DIR/worker-$CLAUDE_TEMPORAL_PROJECT_NAME.log"
# Function to get process uptime
get_uptime() {
local pid=$1
local elapsed
if [[ "$(uname)" == "Darwin" ]]; then
# macOS: convert lstart to epoch seconds, then diff against now
local start_time
start_time=$(ps -o lstart= -p "$pid" 2>/dev/null | xargs -I{} date -j -f "%c" "{}" "+%s" 2>/dev/null || echo "0")
if [[ "$start_time" == "0" ]]; then
echo "unknown"
return
fi
elapsed=$(( $(date +%s) - start_time ))
else
# Linux: etimes is already elapsed seconds, no conversion needed
elapsed=$(ps -o etimes= -p "$pid" 2>/dev/null | tr -d ' ' || echo "0")
if [[ -z "$elapsed" || "$elapsed" == "0" ]]; then
echo "unknown"
return
fi
fi
local hours=$((elapsed / 3600))
local minutes=$(((elapsed % 3600) / 60))
local seconds=$((elapsed % 60))
if (( hours > 0 )); then
printf "%dh %dm %ds" "$hours" "$minutes" "$seconds"
elif (( minutes > 0 )); then
printf "%dm %ds" "$minutes" "$seconds"
else
printf "%ds" "$seconds"
fi
}
echo "=== Worker Health Check ==="
echo "Project: $CLAUDE_TEMPORAL_PROJECT_NAME"
echo ""
# Check if PID file exists
if [[ ! -f "$PID_FILE" ]]; then
echo "Worker Status: NOT RUNNING"
echo "No PID file found: $PID_FILE"
exit 1
fi
# Read PID
WORKER_PID=$(cat "$PID_FILE")
# Check if process is alive
if ! kill -0 "$WORKER_PID" 2>/dev/null; then
echo "Worker Status: DEAD"
echo "PID file exists but process is not running"
echo "PID: $WORKER_PID (stale)"
echo ""
echo "To clean up and restart:"
echo " rm -f $PID_FILE"
echo " ./tools/ensure-worker.sh"
exit 1
fi
# Process is alive
echo "Worker Status: RUNNING"
echo "PID: $WORKER_PID"
echo "Uptime: $(get_uptime "$WORKER_PID")"
echo ""
# Check log file
if [[ -f "$LOG_FILE" ]]; then
echo "Log file: $LOG_FILE"
echo "Log size: $(wc -c < "$LOG_FILE" | tr -d ' ') bytes"
echo ""
# Check for recent errors in logs (last 50 lines)
if tail -n 50 "$LOG_FILE" | grep -iE "(error|exception|fatal|traceback)" >/dev/null 2>&1; then
echo "⚠️ Recent errors found in logs (last 50 lines):"
echo ""
tail -n 50 "$LOG_FILE" | grep -iE "(error|exception|fatal|traceback)" | tail -n 10
echo ""
echo "Full logs: $LOG_FILE"
exit 1
fi
# Show last log entry
echo "Last log entry:"
tail -n 1 "$LOG_FILE" 2>/dev/null || echo "(empty log)"
echo ""
echo "✓ Worker appears healthy"
echo ""
echo "To view logs:"
echo " tail -f $LOG_FILE"
else
echo "⚠️ Log file not found: $LOG_FILE"
echo ""
echo "Worker is running but no logs found"
exit 1
fi
```
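The tail-and-grep error scan above can be factored into a reusable check and tried against a synthetic log. `log_has_errors` is an illustrative name:

```bash
# Scan the last N lines of a log for error markers; exit 0 if any found.
log_has_errors() {
    local log_file=$1 lines=${2:-50}
    tail -n "$lines" "$log_file" | grep -qiE "(error|exception|fatal|traceback)"
}

log=$(mktemp)
printf '%s\n' "worker polling" "INFO heartbeat ok" > "$log"
log_has_errors "$log" && echo "errors" || echo "clean"    # → clean
echo "Traceback (most recent call last):" >> "$log"
log_has_errors "$log" && echo "errors" || echo "clean"    # → errors
rm -f "$log"
```

Note the pattern is case-insensitive, so Python's `ValueError`/`Traceback` and uppercase `FATAL` all register.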
### scripts/kill-all-workers.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
# Environment variables with defaults
CLAUDE_TEMPORAL_PID_DIR="${CLAUDE_TEMPORAL_PID_DIR:-${TMPDIR:-/tmp}/claude-temporal-pids}"
# Graceful shutdown timeout (seconds)
GRACEFUL_TIMEOUT=5
usage() {
cat <<'USAGE'
Usage: kill-all-workers.sh [options]
Kill all tracked workers across all projects.
Options:
-p, --project kill only specific project worker
--include-server also kill temporal dev server
-h, --help show this help
USAGE
}
specific_project=""
include_server=false
while [[ $# -gt 0 ]]; do
case "$1" in
-p|--project) specific_project="${2-}"; shift 2 ;;
--include-server) include_server=true; shift ;;
-h|--help) usage; exit 0 ;;
*) echo "Unknown option: $1" >&2; usage; exit 1 ;;
esac
done
# Check if PID directory exists
if [[ ! -d "$CLAUDE_TEMPORAL_PID_DIR" ]]; then
echo "No PID directory found: $CLAUDE_TEMPORAL_PID_DIR"
exit 0
fi
# Function to kill a process gracefully then forcefully
kill_process() {
local pid=$1
local name=$2
if ! kill -0 "$pid" 2>/dev/null; then
echo "$name (PID $pid): already dead"
return 0
fi
# Attempt graceful shutdown
kill -TERM "$pid" 2>/dev/null || true
# Wait for graceful shutdown
local elapsed=0
while (( elapsed < GRACEFUL_TIMEOUT )); do
if ! kill -0 "$pid" 2>/dev/null; then
echo "$name (PID $pid): stopped gracefully ✓"
return 0
fi
sleep 1
elapsed=$((elapsed + 1))
done
# Force kill if still running
if kill -0 "$pid" 2>/dev/null; then
kill -9 "$pid" 2>/dev/null || true
sleep 1
if kill -0 "$pid" 2>/dev/null; then
echo "$name (PID $pid): failed to kill ❌" >&2
return 1
fi
echo "$name (PID $pid): force killed ✓"
fi
return 0
}
killed_count=0
# Kill specific project worker if requested
if [[ -n "$specific_project" ]]; then
PID_FILE="$CLAUDE_TEMPORAL_PID_DIR/worker-$specific_project.pid"
if [[ -f "$PID_FILE" ]]; then
WORKER_PID=$(cat "$PID_FILE")
if kill_process "$WORKER_PID" "worker-$specific_project"; then
rm -f "$PID_FILE"
killed_count=$((killed_count + 1))
fi
else
echo "No worker found for project: $specific_project"
exit 1
fi
else
# Kill all workers
shopt -s nullglob
PID_FILES=("$CLAUDE_TEMPORAL_PID_DIR"/worker-*.pid)
shopt -u nullglob
for pid_file in "${PID_FILES[@]}"; do
# Extract project name from filename
filename=$(basename "$pid_file")
project="${filename#worker-}"
project="${project%.pid}"
# Read PID
worker_pid=$(cat "$pid_file")
if kill_process "$worker_pid" "worker-$project"; then
rm -f "$pid_file"
killed_count=$((killed_count + 1))
fi
done
fi
# Kill server if requested
if [[ "$include_server" == true ]]; then
SERVER_PID_FILE="$CLAUDE_TEMPORAL_PID_DIR/server.pid"
if [[ -f "$SERVER_PID_FILE" ]]; then
SERVER_PID=$(cat "$SERVER_PID_FILE")
if kill_process "$SERVER_PID" "server"; then
rm -f "$SERVER_PID_FILE"
killed_count=$((killed_count + 1))
fi
fi
fi
if [[ "$killed_count" -eq 0 ]]; then
echo "No processes to kill"
else
echo ""
echo "Total: $killed_count process(es) killed"
fi
```
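Every script here parses options with the same shift-based `while`/`case` loop; reduced to its skeleton (the option names are illustrative, matching kill-all-workers.sh only by coincidence of example):

```bash
# Minimal shift-based option parser in the style of the scripts above.
parse_args() {
    project=""; include_server=false
    while [[ $# -gt 0 ]]; do
        case "$1" in
            -p|--project) project="${2-}"; shift 2 ;;
            --include-server) include_server=true; shift ;;
            *) echo "Unknown option: $1" >&2; return 1 ;;
        esac
    done
}

parse_args --project billing --include-server
echo "$project $include_server"    # → billing true
```

`"${2-}"` rather than `"$2"` keeps `set -u` from aborting when a flag that expects a value is given last.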
### scripts/find-stalled-workflows.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
# Environment variables with defaults
TEMPORAL_CLI="${TEMPORAL_CLI:-temporal}"
TEMPORAL_ADDRESS="${TEMPORAL_ADDRESS:-localhost:7233}"
CLAUDE_TEMPORAL_NAMESPACE="${CLAUDE_TEMPORAL_NAMESPACE:-default}"
usage() {
cat <<'USAGE'
Usage: find-stalled-workflows.sh [options]
Detect workflows with systematic issues (e.g., workflow task failures).
Options:
--query filter workflows by query (optional)
-h, --help show this help
USAGE
}
query=""
while [[ $# -gt 0 ]]; do
case "$1" in
--query) query="${2-}"; shift 2 ;;
-h|--help) usage; exit 0 ;;
*) echo "Unknown option: $1" >&2; usage; exit 1 ;;
esac
done
if ! command -v "$TEMPORAL_CLI" >/dev/null 2>&1; then
echo "Temporal CLI not found: $TEMPORAL_CLI" >&2
exit 1
fi
if ! command -v jq >/dev/null 2>&1; then
echo "⚠️ jq not found. Install jq for better output formatting." >&2
echo "This script will continue with basic text parsing..." >&2
fi
# Build list command - only look for RUNNING workflows since stalled workflows must be running
LIST_CMD=("$TEMPORAL_CLI" "workflow" "list" "--address" "$TEMPORAL_ADDRESS" "--namespace" "$CLAUDE_TEMPORAL_NAMESPACE")
if [[ -n "$query" ]]; then
# Append user query to running filter
LIST_CMD+=("--query" "ExecutionStatus='Running' AND ($query)")
else
# Default: only find running workflows
LIST_CMD+=("--query" "ExecutionStatus='Running'")
fi
echo "Scanning for stalled workflows..."
echo ""
# Get list of running workflows
if ! workflow_list=$("${LIST_CMD[@]}" 2>&1); then
echo "Failed to list workflows" >&2
echo "$workflow_list" >&2
exit 1
fi
# Parse workflow IDs from list
# The output format is: Status WorkflowId Type StartTime
# WorkflowId is in column 2
workflow_ids=$(echo "$workflow_list" | awk 'NR>1 {print $2}' | grep -v "^-" | grep -v "^$" || true)
if [[ -z "$workflow_ids" ]]; then
echo "No workflows found"
exit 0
fi
# Print header
printf "%-40s %-35s %-10s\n" "WORKFLOW_ID" "ERROR_TYPE" "ATTEMPTS"
printf "%-40s %-35s %-10s\n" "----------------------------------------" "-----------------------------------" "----------"
found_stalled=false
# Check each workflow for errors
while IFS= read -r workflow_id; do
[[ -z "$workflow_id" ]] && continue
# Get workflow event history using 'show' to see failure events
if show_output=$("$TEMPORAL_CLI" workflow show --workflow-id "$workflow_id" --address "$TEMPORAL_ADDRESS" --namespace "$CLAUDE_TEMPORAL_NAMESPACE" 2>/dev/null); then
# Count task failures. grep -c prints a count (including 0) even when it
# exits non-zero, so `|| true` suffices and avoids a stray second "0".
workflow_task_failures=$(echo "$show_output" | grep -c "WorkflowTaskFailed" || true)
activity_task_failures=$(echo "$show_output" | grep -c "ActivityTaskFailed" || true)
# Report if significant failures found
if [[ "$workflow_task_failures" -gt 0 ]]; then
found_stalled=true
# Truncate long workflow IDs for display
display_id=$(echo "$workflow_id" | cut -c1-40)
printf "%-40s %-35s %-10s\n" "$display_id" "WorkflowTaskFailed" "$workflow_task_failures"
elif [[ "$activity_task_failures" -gt 2 ]]; then
# Only report activity failures if they're excessive (>2)
found_stalled=true
display_id=$(echo "$workflow_id" | cut -c1-40)
printf "%-40s %-35s %-10s\n" "$display_id" "ActivityTaskFailed" "$activity_task_failures"
fi
fi
done <<< "$workflow_ids"
echo ""
if [[ "$found_stalled" == false ]]; then
echo "No stalled workflows detected"
else
echo "Found stalled workflows. To investigate:"
echo " ./tools/analyze-workflow-error.sh --workflow-id <workflow-id>"
echo ""
echo "To cancel all stalled workflows:"
echo " ./tools/find-stalled-workflows.sh | awk '/TaskFailed/ {print \$1}' > stalled.txt"
echo " ./tools/bulk-cancel-workflows.sh --workflow-ids stalled.txt"
fi
```
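Counting failure events with `grep -c` under `set -euo pipefail` is easy to get subtly wrong, because `grep -c` prints `0` *and* exits non-zero when nothing matches. A standalone sketch on synthetic history text:

```bash
set -euo pipefail

history='WorkflowExecutionStarted
WorkflowTaskScheduled
WorkflowTaskFailed
WorkflowTaskFailed'

# grep -c always prints a count, including 0, but exits non-zero on no
# match; `|| true` absorbs the exit without injecting a second "0" the
# way `|| echo "0"` would.
task_failures=$(echo "$history" | grep -c "WorkflowTaskFailed" || true)
activity_failures=$(echo "$history" | grep -c "ActivityTaskFailed" || true)
echo "$task_failures $activity_failures"    # → 2 0
```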
### scripts/analyze-workflow-error.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
# Get the directory where this script is located
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# Environment variables with defaults
TEMPORAL_CLI="${TEMPORAL_CLI:-temporal}"
TEMPORAL_ADDRESS="${TEMPORAL_ADDRESS:-localhost:7233}"
CLAUDE_TEMPORAL_NAMESPACE="${CLAUDE_TEMPORAL_NAMESPACE:-default}"
usage() {
cat <<'USAGE'
Usage: analyze-workflow-error.sh --workflow-id id [options]
Parse workflow history to extract error details and provide recommendations.
Options:
--workflow-id workflow ID to analyze, required
--run-id specific workflow run ID (optional)
-h, --help show this help
USAGE
}
workflow_id=""
run_id=""
while [[ $# -gt 0 ]]; do
case "$1" in
--workflow-id) workflow_id="${2-}"; shift 2 ;;
--run-id) run_id="${2-}"; shift 2 ;;
-h|--help) usage; exit 0 ;;
*) echo "Unknown option: $1" >&2; usage; exit 1 ;;
esac
done
if [[ -z "$workflow_id" ]]; then
echo "workflow-id is required" >&2
usage
exit 1
fi
if ! command -v "$TEMPORAL_CLI" >/dev/null 2>&1; then
echo "Temporal CLI not found: $TEMPORAL_CLI" >&2
exit 1
fi
# Build temporal command
DESCRIBE_CMD=("$TEMPORAL_CLI" "workflow" "describe" "--workflow-id" "$workflow_id" "--address" "$TEMPORAL_ADDRESS" "--namespace" "$CLAUDE_TEMPORAL_NAMESPACE")
SHOW_CMD=("$TEMPORAL_CLI" "workflow" "show" "--workflow-id" "$workflow_id" "--address" "$TEMPORAL_ADDRESS" "--namespace" "$CLAUDE_TEMPORAL_NAMESPACE")
if [[ -n "$run_id" ]]; then
DESCRIBE_CMD+=("--run-id" "$run_id")
SHOW_CMD+=("--run-id" "$run_id")
fi
echo "=== Workflow Error Analysis ==="
echo "Workflow ID: $workflow_id"
if [[ -n "$run_id" ]]; then
echo "Run ID: $run_id"
fi
echo ""
# Get workflow description
if ! describe_output=$("${DESCRIBE_CMD[@]}" 2>&1); then
echo "❌ Failed to describe workflow" >&2
echo "$describe_output" >&2
exit 1
fi
# Get workflow history
if ! show_output=$("${SHOW_CMD[@]}" 2>&1); then
echo "❌ Failed to get workflow history" >&2
echo "$show_output" >&2
exit 1
fi
# Extract status
status=$(echo "$describe_output" | grep -E "^[[:space:]]*Status:" | awk '{print $2}' | tr -d ' ' || echo "UNKNOWN")
echo "Current Status: $status"
echo ""
# Analyze different error types. grep -c prints a count even with zero
# matches, so `|| true` is the right guard; `|| echo "0"` would yield
# "0<newline>0" and make the -gt comparisons below print syntax errors.
workflow_task_failures=$(echo "$show_output" | grep -c "WorkflowTaskFailed" || true)
activity_task_failures=$(echo "$show_output" | grep -c "ActivityTaskFailed" || true)
workflow_exec_failed=$(echo "$show_output" | grep -c "WorkflowExecutionFailed" || true)
# Report findings
if [[ "$workflow_task_failures" -gt 0 ]]; then
echo "=== WorkflowTaskFailed Detected ==="
echo "Attempts: $workflow_task_failures"
echo ""
# Extract error details
echo "Error Details:"
echo "$show_output" | grep -A 10 "WorkflowTaskFailed" | head -n 15
echo ""
echo "=== Diagnosis ==="
echo "Error Type: WorkflowTaskFailed"
echo ""
echo "Common Causes:"
echo " 1. Workflow type not registered with worker"
echo " 2. Worker missing workflow definition"
echo " 3. Workflow code has syntax errors"
echo " 4. Worker not running or not polling correct task queue"
echo ""
echo "=== Recommended Actions ==="
echo "1. Check if worker is running:"
echo " $SCRIPT_DIR/list-workers.sh"
echo ""
echo "2. Verify workflow is registered in worker.py:"
echo " - Check workflows=[YourWorkflow] in Worker() constructor"
echo ""
echo "3. Restart worker with updated code:"
echo " $SCRIPT_DIR/ensure-worker.sh"
echo ""
echo "4. Check worker logs for errors:"
echo " tail -f \$CLAUDE_TEMPORAL_LOG_DIR/worker-\$(basename \"\$(pwd)\").log"
elif [[ "$activity_task_failures" -gt 0 ]]; then
echo "=== ActivityTaskFailed Detected ==="
echo "Attempts: $activity_task_failures"
echo ""
# Extract error details
echo "Error Details:"
echo "$show_output" | grep -A 10 "ActivityTaskFailed" | head -n 20
echo ""
echo "=== Diagnosis ==="
echo "Error Type: ActivityTaskFailed"
echo ""
echo "Common Causes:"
echo " 1. Activity code threw an exception"
echo " 2. Activity type not registered with worker"
echo " 3. Activity code has bugs"
echo " 4. External dependency failure (API, database, etc.)"
echo ""
echo "=== Recommended Actions ==="
echo "1. Check activity logs for stack traces:"
echo " tail -f \$CLAUDE_TEMPORAL_LOG_DIR/worker-\$(basename \"\$(pwd)\").log"
echo ""
echo "2. Verify activity is registered in worker.py:"
echo " - Check activities=[your_activity] in Worker() constructor"
echo ""
echo "3. Review activity code for errors"
echo ""
echo "4. If activity code is fixed, restart worker:"
echo " $SCRIPT_DIR/ensure-worker.sh"
echo ""
echo "5. Consider adjusting retry policy if transient failure"
elif [[ "$workflow_exec_failed" -gt 0 ]]; then
echo "=== WorkflowExecutionFailed Detected ==="
echo ""
# Extract error details
echo "Error Details:"
echo "$show_output" | grep -A 20 "WorkflowExecutionFailed" | head -n 25
echo ""
echo "=== Diagnosis ==="
echo "Error Type: WorkflowExecutionFailed"
echo ""
echo "Common Causes:"
echo " 1. Workflow business logic error"
echo " 2. Unhandled exception in workflow code"
echo " 3. Workflow determinism violation"
echo ""
echo "=== Recommended Actions ==="
echo "1. Review workflow code for logic errors"
echo ""
echo "2. Check for non-deterministic code:"
echo " - Random number generation"
echo " - System time calls"
echo " - Threading/concurrency"
echo ""
echo "3. Review full workflow history:"
echo " temporal workflow show --workflow-id $workflow_id"
echo ""
echo "4. After fixing code, restart worker:"
echo " $SCRIPT_DIR/ensure-worker.sh"
elif [[ "$status" == "TIMED_OUT" ]]; then
echo "=== Workflow Timeout ==="
echo ""
echo "The workflow exceeded its timeout limit."
echo ""
echo "=== Recommended Actions ==="
echo "1. Review workflow timeout settings in starter code"
echo ""
echo "2. Check if activities are taking too long:"
echo " - Review activity timeout settings"
echo " - Check activity logs for performance issues"
echo ""
echo "3. Consider increasing timeouts if operations legitimately take longer"
elif [[ "$status" == "RUNNING" ]]; then
echo "=== Workflow Still Running ==="
echo ""
echo "The workflow appears to be running normally."
echo ""
echo "To monitor progress:"
echo " temporal workflow show --workflow-id $workflow_id"
echo ""
echo "To wait for completion:"
echo " $SCRIPT_DIR/wait-for-workflow-status.sh --workflow-id $workflow_id --status COMPLETED"
elif [[ "$status" == "COMPLETED" ]]; then
echo "=== Workflow Completed Successfully ==="
echo ""
echo "No errors detected. Workflow completed normally."
else
echo "=== Status: $status ==="
echo ""
echo "Review full workflow details:"
echo " temporal workflow describe --workflow-id $workflow_id"
echo " temporal workflow show --workflow-id $workflow_id"
fi
echo ""
echo "=== Additional Resources ==="
echo "Web UI: http://localhost:8233/namespaces/$CLAUDE_TEMPORAL_NAMESPACE/workflows/$workflow_id"
```
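The Status extraction can be checked against synthetic text (the sample here is made up; real output comes from `temporal workflow describe`). Note `[[:space:]]` is portable ERE, whereas `\s` is a GNU grep extension that BSD grep on macOS may not honor:

```bash
set -o pipefail    # the skill scripts run with pipefail; the fallback relies on it

# Synthetic describe output for demonstration only.
describe_output='  WorkflowId  order-42
  RunId       abc-123
  Status: COMPLETED'

status=$(echo "$describe_output" | grep -E "^[[:space:]]*Status:" | awk '{print $2}' || echo "UNKNOWN")
echo "$status"    # → COMPLETED
```

Without `pipefail` the `|| echo "UNKNOWN"` fallback never fires, since `awk` exits 0 even when `grep` found nothing.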
### scripts/list-recent-workflows.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
# Environment variables with defaults
TEMPORAL_CLI="${TEMPORAL_CLI:-temporal}"
TEMPORAL_ADDRESS="${TEMPORAL_ADDRESS:-localhost:7233}"
CLAUDE_TEMPORAL_NAMESPACE="${CLAUDE_TEMPORAL_NAMESPACE:-default}"
usage() {
cat <<'USAGE'
Usage: list-recent-workflows.sh [options]
List recently completed/terminated workflows within a time window.
Options:
--minutes <N> Look back N minutes (default: 5)
--status Filter by status: COMPLETED, FAILED, CANCELED, TERMINATED, TIMED_OUT (optional)
--workflow-type Filter by workflow type (optional)
-h, --help Show this help
Examples:
# List all workflows from last 5 minutes
./tools/list-recent-workflows.sh
# List failed workflows from last 10 minutes
./tools/list-recent-workflows.sh --minutes 10 --status FAILED
# List completed workflows of specific type from last 2 minutes
./tools/list-recent-workflows.sh --minutes 2 --status COMPLETED --workflow-type MyWorkflow
Output format:
WORKFLOW_ID STATUS WORKFLOW_TYPE CLOSE_TIME
USAGE
}
minutes=5
status=""
workflow_type=""
while [[ $# -gt 0 ]]; do
case "$1" in
--minutes) minutes="${2-}"; shift 2 ;;
--status) status="${2-}"; shift 2 ;;
--workflow-type) workflow_type="${2-}"; shift 2 ;;
-h|--help) usage; exit 0 ;;
*) echo "Unknown option: $1" >&2; usage; exit 1 ;;
esac
done
if ! command -v "$TEMPORAL_CLI" >/dev/null 2>&1; then
echo "Temporal CLI not found: $TEMPORAL_CLI" >&2
exit 1
fi
# Validate status if provided
if [[ -n "$status" ]]; then
case "$status" in
COMPLETED|FAILED|CANCELED|TERMINATED|TIMED_OUT) ;;
*) echo "Invalid status: $status" >&2; usage; exit 1 ;;
esac
fi
# Calculate time threshold (minutes ago)
if [[ "$OSTYPE" == "darwin"* ]]; then
# macOS
time_threshold=$(date -u -v-"${minutes}M" +"%Y-%m-%dT%H:%M:%SZ")
else
# Linux
time_threshold=$(date -u -d "$minutes minutes ago" +"%Y-%m-%dT%H:%M:%SZ")
fi
# Build query
query="CloseTime > \"$time_threshold\""
if [[ -n "$status" ]]; then
query="$query AND ExecutionStatus = \"$status\""
fi
if [[ -n "$workflow_type" ]]; then
query="$query AND WorkflowType = \"$workflow_type\""
fi
echo "Searching workflows from last $minutes minute(s)..."
echo "Query: $query"
echo ""
# Execute list command
if ! workflow_list=$("$TEMPORAL_CLI" workflow list \
--address "$TEMPORAL_ADDRESS" \
--namespace "$CLAUDE_TEMPORAL_NAMESPACE" \
--query "$query" \
--fields long 2>&1); then
echo "Failed to list workflows" >&2
echo "$workflow_list" >&2
exit 1
fi
# Check if any workflows found
if echo "$workflow_list" | grep -q "No workflows found"; then
echo "No workflows found in the last $minutes minute(s)"
exit 0
fi
# Parse and display results
echo "$workflow_list" | head -n 50
# Count results
workflow_count=$(echo "$workflow_list" | awk 'NR>1 && $1 != "" && $1 !~ /^-+$/ {print $1}' | wc -l | tr -d ' ')
echo ""
echo "Found $workflow_count workflow(s)"
echo ""
echo "To get workflow result:"
echo " ./tools/get-workflow-result.sh --workflow-id <workflow-id>"
```
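The cross-platform time math above can be factored into a helper — `minutes_ago` is an illustrative name. BSD date (macOS) takes `-v-5M`, GNU date (Linux) takes `-d "5 minutes ago"`:

```bash
# Emit an ISO-8601 UTC timestamp N minutes in the past, on either date(1).
minutes_ago() {
    local minutes=$1
    if [[ "$OSTYPE" == darwin* ]]; then
        date -u -v-"${minutes}M" +"%Y-%m-%dT%H:%M:%SZ"   # BSD: M = minutes
    else
        date -u -d "$minutes minutes ago" +"%Y-%m-%dT%H:%M:%SZ"
    fi
}

minutes_ago 5    # e.g. 2025-01-01T12:00:00Z
```

The `Z`-suffixed UTC form matters: Temporal's visibility queries compare `CloseTime` against timestamps in this format.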
---
## Skill Companion Files
> Additional files collected from the skill directory layout.
### scripts/bulk-cancel-workflows.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
# Environment variables with defaults
TEMPORAL_CLI="${TEMPORAL_CLI:-temporal}"
TEMPORAL_ADDRESS="${TEMPORAL_ADDRESS:-localhost:7233}"
CLAUDE_TEMPORAL_NAMESPACE="${CLAUDE_TEMPORAL_NAMESPACE:-default}"
usage() {
cat <<'USAGE'
Usage: bulk-cancel-workflows.sh [options]
Cancel multiple workflows.
Options:
--workflow-ids file containing workflow IDs (one per line), required unless --pattern
--pattern cancel workflows matching pattern (regex)
--reason cancellation reason (default: "Bulk cancellation")
-h, --help show this help
Examples:
# Cancel workflows from file
./bulk-cancel-workflows.sh --workflow-ids stalled.txt
# Cancel workflows matching pattern
./bulk-cancel-workflows.sh --pattern "test-.*"
# Cancel with custom reason
./bulk-cancel-workflows.sh --workflow-ids stalled.txt --reason "Cleaning up test workflows"
USAGE
}
workflow_ids_file=""
pattern=""
reason="Bulk cancellation"
while [[ $# -gt 0 ]]; do
case "$1" in
--workflow-ids) workflow_ids_file="${2-}"; shift 2 ;;
--pattern) pattern="${2-}"; shift 2 ;;
--reason) reason="${2-}"; shift 2 ;;
-h|--help) usage; exit 0 ;;
*) echo "Unknown option: $1" >&2; usage; exit 1 ;;
esac
done
if [[ -z "$workflow_ids_file" && -z "$pattern" ]]; then
echo "Either --workflow-ids or --pattern is required" >&2
usage
exit 1
fi
if ! command -v "$TEMPORAL_CLI" >/dev/null 2>&1; then
echo "Temporal CLI not found: $TEMPORAL_CLI" >&2
exit 1
fi
# Collect workflow IDs
workflow_ids=()
if [[ -n "$workflow_ids_file" ]]; then
if [[ ! -f "$workflow_ids_file" ]]; then
echo "File not found: $workflow_ids_file" >&2
exit 1
fi
# Read workflow IDs from file
while IFS= read -r line; do
# Skip empty lines and comments
[[ -z "$line" || "$line" =~ ^[[:space:]]*# ]] && continue
# Trim whitespace
line=$(echo "$line" | xargs)
workflow_ids+=("$line")
done < "$workflow_ids_file"
fi
if [[ -n "$pattern" ]]; then
echo "Finding workflows matching pattern: $pattern"
# List workflows and filter by pattern
LIST_CMD=("$TEMPORAL_CLI" "workflow" "list" "--address" "$TEMPORAL_ADDRESS" "--namespace" "$CLAUDE_TEMPORAL_NAMESPACE")
if workflow_list=$("${LIST_CMD[@]}" 2>&1); then
# Parse workflow IDs from the list and filter by pattern
# (WorkflowId is column 2 of `temporal workflow list` output; column 1 is Status)
while IFS= read -r wf_id; do
[[ -z "$wf_id" ]] && continue
if echo "$wf_id" | grep -E "$pattern" >/dev/null 2>&1; then
workflow_ids+=("$wf_id")
fi
done < <(echo "$workflow_list" | awk 'NR>1 {print $2}' | grep -v "^-" | grep -v "^$")
else
echo "Failed to list workflows" >&2
echo "$workflow_list" >&2
exit 1
fi
fi
# Check if we have any workflow IDs
if [[ "${#workflow_ids[@]}" -eq 0 ]]; then
echo "No workflows to cancel"
exit 0
fi
echo "Found ${#workflow_ids[@]} workflow(s) to cancel"
echo "Reason: $reason"
echo ""
# Confirm with user
read -p "Continue with cancellation? (y/N) " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
echo "Cancellation aborted"
exit 0
fi
echo ""
echo "Canceling workflows..."
echo ""
success_count=0
failed_count=0
# Cancel each workflow
for workflow_id in "${workflow_ids[@]}"; do
echo -n "Canceling: $workflow_id ... "
if "$TEMPORAL_CLI" workflow cancel \
--workflow-id "$workflow_id" \
--address "$TEMPORAL_ADDRESS" \
--namespace "$CLAUDE_TEMPORAL_NAMESPACE" \
--reason "$reason" \
>/dev/null 2>&1; then
echo "✓"
success_count=$((success_count + 1))
else
echo "❌ (may already be canceled or not exist)"
failed_count=$((failed_count + 1))
fi
done
echo ""
echo "=== Summary ==="
echo "Successfully canceled: $success_count"
echo "Failed: $failed_count"
echo "Total: ${#workflow_ids[@]}"
```
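The file-reading loop above — skip blank lines and `#` comments, trim whitespace — works the same as a standalone helper. `read_ids` is an illustrative name; results land in a global `ids` array:

```bash
# Read workflow IDs from a file into the global `ids` array,
# skipping blanks and "#" comments and trimming whitespace.
read_ids() {
    local file=$1 line
    ids=()
    while IFS= read -r line; do
        [[ -z "$line" || "$line" =~ ^[[:space:]]*# ]] && continue
        ids+=("$(echo "$line" | xargs)")   # xargs trims surrounding spaces
    done < "$file"
}

f=$(mktemp)
printf '%s\n' "# stalled workflows" "" "  order-1  " "order-2" > "$f"
read_ids "$f"
echo "${#ids[@]}: ${ids[*]}"    # → 2: order-1 order-2
rm -f "$f"
```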
### scripts/find-project-workers.sh
```bash
#!/usr/bin/env bash
# Helper function to find all worker processes for a specific project
# This can be sourced by other scripts or run directly
# Usage: find_project_workers PROJECT_DIR
# Returns: PIDs of all worker processes for the project (one per line)
find_project_workers() {
local project_dir="$1"
# Normalize the project directory path (resolve symlinks, remove trailing slash)
project_dir="$(cd "$project_dir" 2>/dev/null && pwd)" || {
echo "Error: Invalid project directory: $project_dir" >&2
return 1
}
# Find all processes where:
# 1. Command contains the project directory path
# 2. Command contains "worker" (either .venv/bin/worker or "uv run worker")
# We need to be specific to avoid killing unrelated processes
# Strategy: Find both parent "uv run worker" processes and child Python worker processes
# We'll use the project directory in the path as the key identifier
local pids=()
# Use ps to get all processes with their commands
if [[ "$(uname)" == "Darwin" ]]; then
# macOS - find Python workers (grep -F matches the directory literally,
# so regex metacharacters in the path cannot misfire)
while IFS= read -r pid; do
[[ -n "$pid" ]] && pids+=("$pid")
done < <(ps ax -o pid,command | grep -E "\.venv/bin/(python[0-9.]*|worker)" | grep -F "$project_dir" | grep -v grep | awk '{print $1}')
# Also find "uv run worker" processes in this directory
while IFS= read -r pid; do
[[ -n "$pid" ]] && pids+=("$pid")
done < <(ps ax -o pid,command | grep "uv run worker" | grep -v grep | awk -v dir="$project_dir" '{
# Check if process is running from the project directory by checking cwd
cwd = ""
cmd = "lsof -a -p " $1 " -d cwd -Fn 2>/dev/null | grep ^n | cut -c2-"
cmd | getline cwd
close(cmd)
if (index(cwd, dir) > 0) print $1
}')
else
# Linux - find Python workers
while IFS= read -r pid; do
[[ -n "$pid" ]] && pids+=("$pid")
done < <(ps ax -o pid,cmd | grep -E "\.venv/bin/(python[0-9.]*|worker)" | grep -F -- "$project_dir" | grep -v grep | awk '{print $1}')
# Also find "uv run worker" processes in this directory
while IFS= read -r pid; do
[[ -n "$pid" ]] && pids+=("$pid")
done < <(ps ax -o pid,cmd | grep "uv run worker" | grep -v grep | awk -v dir="$project_dir" '{
# Check if process is running from the project directory
cwd = ""
cmd = "readlink -f /proc/" $1 "/cwd 2>/dev/null"
cmd | getline cwd
close(cmd)
if (index(cwd, dir) > 0) print $1
}')
fi
# Print unique PIDs (skip printf when nothing was found, so an empty
# array does not emit a spurious blank line)
if [[ ${#pids[@]} -gt 0 ]]; then
printf "%s\n" "${pids[@]}" | sort -u
fi
}
# If script is executed directly (not sourced), run the function
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
if [[ $# -eq 0 ]]; then
# Default to current directory
find_project_workers "$(pwd)"
else
find_project_workers "$1"
fi
fi
```
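The helper can be run directly against a project path (the path below is a hypothetical example), and the final `sort -u` step is worth seeing in isolation: the same worker PID can be collected by both `ps` passes, so duplicates must collapse to one line each. A minimal sketch:

```shell
# Direct invocation (path is illustrative, not from a real install):
#   .claude/skills/temporal/scripts/find-project-workers.sh ~/projects/my-temporal-app

# The dedup step on its own: two ps passes found PID 4242 twice.
pids=(4242 517 4242)
unique_pids="$(printf "%s\n" "${pids[@]}" | sort -u)"
echo "$unique_pids"
```

Note that plain `sort -u` orders lexically, which is harmless here since only uniqueness matters before the PIDs are acted on.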
### scripts/wait-for-worker-ready.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
usage() {
cat <<'USAGE'
Usage: wait-for-worker-ready.sh --log-file file [options]
Poll worker log file for startup confirmation.
Options:
--log-file path to worker log file (required)
-p, --pattern regex pattern to look for (default: "Worker started")
-F, --fixed treat pattern as a fixed string (grep -F)
-T, --timeout seconds to wait (integer, default: 30)
-i, --interval poll interval in seconds (default: 0.5)
-h, --help show this help
USAGE
}
log_file=""
pattern="Worker started"
grep_flag="-E"
timeout=30
interval=0.5
while [[ $# -gt 0 ]]; do
case "$1" in
--log-file) log_file="${2-}"; shift 2 ;;
-p|--pattern) pattern="${2-}"; shift 2 ;;
-F|--fixed) grep_flag="-F"; shift ;;
-T|--timeout) timeout="${2-}"; shift 2 ;;
-i|--interval) interval="${2-}"; shift 2 ;;
-h|--help) usage; exit 0 ;;
*) echo "Unknown option: $1" >&2; usage; exit 1 ;;
esac
done
if [[ -z "$log_file" ]]; then
echo "log-file is required" >&2
usage
exit 1
fi
if ! [[ "$timeout" =~ ^[0-9]+$ ]]; then
echo "timeout must be an integer number of seconds" >&2
exit 1
fi
# End time in epoch seconds
start_epoch=$(date +%s)
deadline=$((start_epoch + timeout))
while true; do
# Grep the log file in place once it exists
if [[ -f "$log_file" ]] && grep $grep_flag -- "$pattern" "$log_file" >/dev/null 2>&1; then
exit 0
fi
now=$(date +%s)
if (( now >= deadline )); then
echo "Timed out after ${timeout}s waiting for pattern: $pattern" >&2
if [[ -f "$log_file" ]]; then
echo "Last content from $log_file:" >&2
tail -n 50 "$log_file" >&2 || true
else
echo "Log file not found: $log_file" >&2
fi
exit 1
fi
sleep "$interval"
done
```
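A typical pairing launches the worker in the background with its output redirected, then blocks on this script (log path and ready message below are assumptions, not fixed by the skill). The poll loop reduced to its essentials is also runnable standalone against a log that already contains the ready line:

```shell
# Typical pairing (paths are illustrative):
#   uv run worker > /tmp/worker.log 2>&1 &
#   .claude/skills/temporal/scripts/wait-for-worker-ready.sh \
#     --log-file /tmp/worker.log --timeout 60

# Core poll loop, exercised against a pre-populated log file:
log_file="$(mktemp)"
echo "Worker started polling task queue" > "$log_file"
deadline=$(( $(date +%s) + 5 ))
status="timeout"
while (( $(date +%s) < deadline )); do
  if grep -E -- "Worker started" "$log_file" >/dev/null 2>&1; then
    status="ready"
    break
  fi
  sleep 0.5
done
echo "$status"
rm -f "$log_file"
```

Because the loop only ever greps and sleeps, it is safe to run with a generous timeout; a worker that crashes on startup simply never writes the pattern and the script exits 1 with the log tail on stderr.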