
claude-api

Build with Claude Messages API using structured outputs for guaranteed JSON schema validation. Covers prompt caching (90% savings), streaming SSE, tool use, and model deprecations. Prevents 16 documented errors. Use when: building chatbots/agents, troubleshooting rate_limit_error, prompt caching issues, streaming SSE parsing errors, MCP timeout issues, or structured output hallucinations.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars: 1.

Hot score: 77.

Updated: March 20, 2026.

Overall rating: C (0.4).

Composite score: 0.4.

Best-practice grade: F (37.6).

Install command

npx @skill-hub/cli install ma1orek-replay-claude-api

Repository

ma1orek/replay

Skill path: skills/claude-api

Build with Claude Messages API using structured outputs for guaranteed JSON schema validation. Covers prompt caching (90% savings), streaming SSE, tool use, and model deprecations. Prevents 16 documented errors. Use when: building chatbots/agents, troubleshooting rate_limit_error, prompt caching issues, streaming SSE parsing errors, MCP timeout issues, or structured output hallucinations.

Open repository

Best for

Primary workflow: Ship Full Stack.

Technical facets: Full Stack, Backend, Integration.

Target audience: everyone.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: ma1orek.

This is a mirrored public skill entry. Review the repository before installing it into production workflows.

What it helps with

  • Install claude-api into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/ma1orek/replay before adding claude-api to shared team environments
  • Use claude-api for development workflows

Works across

Claude Code, Codex CLI, Gemini CLI, OpenCode.

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: claude-api
description: |
  Build with Claude Messages API using structured outputs for guaranteed JSON schema validation. Covers prompt caching (90% savings), streaming SSE, tool use, and model deprecations. Prevents 16 documented errors.

  Use when: building chatbots/agents, troubleshooting rate_limit_error, prompt caching issues, streaming SSE parsing errors, MCP timeout issues, or structured output hallucinations.
user-invocable: true
---

# Claude API - Structured Outputs & Error Prevention Guide

**Package**: @anthropic-ai/sdk@0.71.2
**Breaking Changes**: Oct 2025 - Claude 3.5/3.7 models retired, Nov 2025 - Structured outputs beta
**Last Updated**: 2026-01-09

---

## What's New in v0.69.0+ (Nov 2025)

**Major Features:**

### 1. Structured Outputs (v0.69.0, Nov 14, 2025) - CRITICAL ⭐

**Guaranteed JSON schema conformance** - Claude's responses strictly follow your JSON schema with two modes.

**⚠️ ACCURACY CAVEAT**: Structured outputs guarantee format compliance, NOT accuracy. Models can still hallucinate—you get "perfectly formatted incorrect answers." Always validate semantic correctness (see below).

**JSON Outputs (`output_format`)** - For data extraction and formatting:
```typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Extract contact info: John Doe, john@example.com, 555-1234' }],
  betas: ['structured-outputs-2025-11-13'],
  output_format: {
    type: 'json_schema',
    json_schema: {
      name: 'Contact',
      strict: true,
      schema: {
        type: 'object',
        properties: {
          name: { type: 'string' },
          email: { type: 'string' },
          phone: { type: 'string' }
        },
        required: ['name', 'email', 'phone'],
        additionalProperties: false
      }
    }
  }
});

// Guaranteed valid JSON matching schema
const contact = JSON.parse(message.content[0].text);
console.log(contact.name); // "John Doe"
```

**Strict Tool Use (`strict: true`)** - For validated function parameters:
```typescript
const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Get weather for San Francisco' }],
  betas: ['structured-outputs-2025-11-13'],
  tools: [{
    name: 'get_weather',
    description: 'Get current weather',
    input_schema: {
      type: 'object',
      properties: {
        location: { type: 'string' },
        unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
      },
      required: ['location'],
      additionalProperties: false
    },
    strict: true  // ← Guarantees schema compliance
  }]
});
```

**Requirements:**
- **Beta header**: `structured-outputs-2025-11-13` (via `betas` array)
- **Models**: Claude Opus 4.5, Claude Sonnet 4.5, Claude Opus 4 (best models only)
- **SDK**: v0.69.0+ required

**Limitations:**
- ❌ No recursive schemas
- ❌ No numerical constraints (`minimum`, `maximum`)
- ❌ Limited regex support (no backreferences/lookahead)
- ❌ Incompatible with citations and message prefilling
- ⚠️ Grammar compilation adds latency on first request (cached 24hrs)

**Performance Characteristics:**
- **First request**: +200-500ms latency for grammar compilation
- **Subsequent requests**: Normal latency (grammar cached for 24 hours)
- **Cache sharing**: Only with IDENTICAL schemas (small changes = recompilation)

**Pre-warming critical schemas:**
```typescript
// Pre-compile schemas during server startup
const warmupMessage = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 10,
  messages: [{ role: 'user', content: 'warmup' }],
  betas: ['structured-outputs-2025-11-13'],
  output_format: {
    type: 'json_schema',
    json_schema: YOUR_CRITICAL_SCHEMA
  }
});
// Later requests use cached grammar
```

**Semantic Validation (CRITICAL):**
```typescript
const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Extract contact: John Doe' }],
  betas: ['structured-outputs-2025-11-13'],
  output_format: {
    type: 'json_schema',
    json_schema: contactSchema
  }
});

const contact = JSON.parse(message.content[0].text);

// ✅ Format is guaranteed valid
// ❌ Content may be hallucinated

// ALWAYS validate semantic correctness
if (!isValidEmail(contact.email)) {
  throw new Error('Hallucinated email detected');
}
if (contact.age < 0 || contact.age > 120) {
  throw new Error('Implausible age value');
}
```

**When to Use:**
- Data extraction from unstructured text
- API response formatting
- Agentic workflows requiring validated tool inputs
- Eliminating JSON parse errors

**⚠️ SDK v0.71.1+ Deprecation**: Direct `.parsed` property access is deprecated. Check SDK docs for updated API.

### 2. Model Changes (Oct 2025) - BREAKING

**Retired / deprecated:**
- ❌ Claude 3.5 Sonnet (all versions) - retired (requests return errors)
- ❌ Claude 3.7 Sonnet - deprecated (Oct 28, 2025)

**Active Models (Jan 2026):**

| Model | ID | Context | Best For | Cost (per MTok) |
|-------|-----|---------|----------|-----------------|
| **Claude Opus 4.5** | claude-opus-4-5-20251101 | 200k | Flagship - best reasoning, coding, agents | $5/$25 (in/out) |
| **Claude Sonnet 4.5** | claude-sonnet-4-5-20250929 | 200k | Balanced performance | $3/$15 (in/out) |
| **Claude Opus 4** | claude-opus-4-20250514 | 200k | High capability | $15/$75 |
| **Claude Haiku 4.5** | claude-haiku-4-5-20250929 | 200k | Near-frontier, fast | $1/$5 |

**Note**: Claude 3.x models (3.5 Sonnet, 3.7 Sonnet, etc.) are deprecated. Use Claude 4.x+ models.

### 3. Context Management (Oct 28, 2025)

**Clear Thinking Blocks** - Automatic thinking block cleanup:
```typescript
const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 4096,
  messages: [{ role: 'user', content: 'Solve complex problem' }],
  betas: ['clear_thinking_20251015']
});
// Thinking blocks automatically managed
```

### 4. Agent Skills API (Oct 16, 2025)

Pre-built skills for Office files (PowerPoint, Excel, Word, PDF):
```typescript
const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Analyze this spreadsheet' }],
  betas: ['skills-2025-10-02'],
  // Requires code execution tool enabled
});
```

📚 **Docs**: https://platform.claude.com/docs/en/build-with-claude/structured-outputs

---

## Streaming Responses (SSE)

**CRITICAL Error Pattern** - Errors occur AFTER initial 200 response:
```typescript
const stream = anthropic.messages.stream({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello' }],
});

stream
  .on('error', (error) => {
    // Error can occur AFTER stream starts
    console.error('Stream error:', error);
    // Implement fallback or retry logic
  })
  .on('abort', (error) => {
    console.warn('Stream aborted:', error);
  });
```

**Why this matters**: Unlike regular HTTP errors, SSE errors can arrive mid-stream after the initial 200 OK, so error event listeners are required.

---

## Prompt Caching (⭐ 90% Cost Savings)

**CRITICAL Rule** - `cache_control` MUST be on LAST block:
```typescript
const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  system: [
    {
      type: 'text',
      text: 'System instructions...',
    },
    {
      type: 'text',
      text: LARGE_CODEBASE, // 50k tokens
      cache_control: { type: 'ephemeral' }, // ← MUST be on LAST block
    },
  ],
  messages: [{ role: 'user', content: 'Explain auth module' }],
});

// Monitor cache usage
console.log('Cache reads:', message.usage.cache_read_input_tokens);
console.log('Cache writes:', message.usage.cache_creation_input_tokens);
```

**Minimum requirements:**
- Claude Sonnet 4.5: 1,024 tokens minimum
- Claude Haiku 4.5: 2,048 tokens minimum
- 5-minute TTL (refreshes on each use)
- Cache shared only with IDENTICAL content

**⚠️ AWS Bedrock Limitation**: Prompt caching does NOT work for Claude 4 family on AWS Bedrock (works for Claude 3.7 Sonnet only). Use direct Anthropic API for Claude 4 caching support. ([GitHub Issue #1347](https://github.com/anthropics/claude-code/issues/1347))

---

## Tool Use (Function Calling)

**CRITICAL Patterns:**

**Strict Tool Use** (with structured outputs):
```typescript
const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  betas: ['structured-outputs-2025-11-13'],
  tools: [{
    name: 'get_weather',
    description: 'Get weather data',
    input_schema: {
      type: 'object',
      properties: {
        location: { type: 'string' },
        unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
      },
      required: ['location'],
      additionalProperties: false
    },
    strict: true  // ← Guarantees schema compliance
  }],
  messages: [{ role: 'user', content: 'Weather in NYC?' }]
});
```

**Tool Result Pattern** - `tool_use_id` MUST match:
```typescript
const toolResults = [];
for (const block of response.content) {
  if (block.type === 'tool_use') {
    const result = await executeToolFunction(block.name, block.input);

    toolResults.push({
      type: 'tool_result',
      tool_use_id: block.id,  // ← MUST match tool_use block id
      content: JSON.stringify(result),
    });
  }
}

messages.push({
  role: 'user',
  content: toolResults,
});
```

**Error Handling** - Handle tool execution failures:
```typescript
try {
  const result = await executeToolFunction(block.name, block.input);
  toolResults.push({
    type: 'tool_result',
    tool_use_id: block.id,
    content: JSON.stringify(result),
  });
} catch (error) {
  // Return error to Claude for handling
  toolResults.push({
    type: 'tool_result',
    tool_use_id: block.id,
    is_error: true,
    content: `Tool execution failed: ${error.message}`,
  });
}
```

**Content Sanitization** - Handle Unicode edge cases:
```typescript
// U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR) cause JSON parse failures
function sanitizeToolResult(content: string): string {
  return content
    .replace(/\u2028/g, '\n') // LINE SEPARATOR → newline
    .replace(/\u2029/g, '\n'); // PARAGRAPH SEPARATOR → newline
}

const toolResult = {
  type: 'tool_result',
  tool_use_id: block.id,
  content: sanitizeToolResult(result) // Sanitize before sending
};
```
([GitHub Issue #882](https://github.com/anthropics/anthropic-sdk-typescript/issues/882))

---

## Vision (Image Understanding)

**CRITICAL Rules:**
- **Formats**: JPEG, PNG, WebP, GIF (non-animated)
- **Max size**: 5MB per image
- **Base64 overhead**: ~33% size increase
- **Context impact**: Images count toward token limit
- **Caching**: Consider for repeated image analysis

**Format validation** - Check before encoding:
```typescript
const validFormats = ['image/jpeg', 'image/png', 'image/webp', 'image/gif'];
if (!validFormats.includes(mimeType)) {
  throw new Error(`Unsupported format: ${mimeType}`);
}
```
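The rules above can be folded into a small pre-flight helper. This is an illustrative sketch (the helper name and constants are ours): it rejects unsupported media types and enforces the 5MB limit, deriving the raw byte count from the base64 length to account for the ~33% encoding overhead.

```typescript
// Illustrative pre-flight check for image content blocks (names are ours).
const VALID_MEDIA_TYPES = ['image/jpeg', 'image/png', 'image/webp', 'image/gif'];
const MAX_IMAGE_BYTES = 5 * 1024 * 1024; // 5MB API limit

function imageBlock(base64Data: string, mediaType: string) {
  if (!VALID_MEDIA_TYPES.includes(mediaType)) {
    throw new Error(`Unsupported format: ${mediaType}`);
  }
  // Base64 inflates size ~33%; recover the approximate raw byte count.
  const rawBytes = Math.floor((base64Data.length * 3) / 4);
  if (rawBytes > MAX_IMAGE_BYTES) {
    throw new Error(`Image too large: ${rawBytes} bytes (max ${MAX_IMAGE_BYTES})`);
  }
  return {
    type: 'image',
    source: { type: 'base64', media_type: mediaType, data: base64Data },
  };
}
```

The returned object matches the base64 image content block shape used by the Messages API, so it can be dropped into a `messages` array directly.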

---

## Extended Thinking Mode

**⚠️ Model Compatibility:**
- ❌ Claude 3.7 Sonnet - DEPRECATED (Oct 28, 2025)
- ❌ Claude 3.5 Sonnet - RETIRED (not supported)
- ✅ Claude Opus 4.5 - Extended thinking supported (flagship)
- ✅ Claude Sonnet 4.5 - Extended thinking supported
- ✅ Claude Opus 4 - Extended thinking supported

**CRITICAL:**
- Thinking blocks are NOT cacheable
- Requires higher `max_tokens` (thinking consumes tokens)
- Check model before expecting thinking blocks
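As a sketch of the request shape (parameter values here are illustrative), extended thinking is enabled via a `thinking` block, and `budget_tokens` must stay below `max_tokens` because thinking output is billed against the same budget:

```typescript
// Illustrative extended-thinking request parameters (values are examples).
// budget_tokens must be less than max_tokens, since thinking tokens count
// toward the overall output budget.
const params = {
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 16000,
  thinking: { type: 'enabled' as const, budget_tokens: 10000 },
  messages: [{ role: 'user' as const, content: 'Solve complex problem' }],
};

// const message = await anthropic.messages.create(params);
// Thinking blocks arrive as content blocks with type 'thinking' and,
// as noted above, are not cacheable.
```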

---

## Rate Limits

**CRITICAL Pattern** - Respect `retry-after` header with exponential backoff:
```typescript
async function makeRequestWithRetry(
  requestFn: () => Promise<any>,
  maxRetries = 3,
  baseDelay = 1000
): Promise<any> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await requestFn();
    } catch (error) {
      if (error.status === 429) {
        // CRITICAL: Use retry-after header if present
        const retryAfter = error.response?.headers?.['retry-after'];
        const delay = retryAfter
          ? parseInt(retryAfter) * 1000
          : baseDelay * Math.pow(2, attempt);

        console.warn(`Rate limited. Retrying in ${delay}ms...`);
        await new Promise(resolve => setTimeout(resolve, delay));
      } else {
        throw error;
      }
    }
  }
  throw new Error('Max retries exceeded');
}
```

**Rate limit headers:**
- `anthropic-ratelimit-requests-limit` - Total RPM allowed
- `anthropic-ratelimit-requests-remaining` - Remaining requests
- `anthropic-ratelimit-requests-reset` - Reset timestamp
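A small parsing helper (the function name is ours) makes these headers usable for client-side throttling; with the TypeScript SDK, the raw `Response` object and its headers can be obtained via `.withResponse()`:

```typescript
// Illustrative: extract the rate-limit counters from response headers so a
// client can slow down before it ever receives a 429.
function parseRateLimits(headers: Record<string, string>) {
  return {
    requestsLimit: Number(headers['anthropic-ratelimit-requests-limit']),
    requestsRemaining: Number(headers['anthropic-ratelimit-requests-remaining']),
    requestsReset: headers['anthropic-ratelimit-requests-reset'],
  };
}
```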

---

## Error Handling

**Common Error Codes:**

| Status | Error Type | Cause | Solution |
|--------|-----------|-------|----------|
| 400 | invalid_request_error | Bad parameters | Validate request body |
| 401 | authentication_error | Invalid API key | Check env variable |
| 403 | permission_error | No access to feature | Check account tier |
| 404 | not_found_error | Invalid endpoint | Check API version |
| 429 | rate_limit_error | Too many requests | Implement retry logic |
| 500 | api_error | Internal error | Retry with backoff |
| 529 | overloaded_error | System overloaded | Retry later |

**CRITICAL:**
- Streaming errors occur AFTER initial 200 response
- Always implement error event listeners for streams
- Respect `retry-after` header on 429 errors
- Have fallback strategies for critical operations
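The table maps directly onto a retry decision. A minimal sketch (these helpers are ours, not an SDK API): treat 429/500/529 as transient and retryable with backoff, and surface everything else immediately as a caller-side problem.

```typescript
// Status-to-error-type mapping from the table above.
const ERROR_TYPES: Record<number, string> = {
  400: 'invalid_request_error',
  401: 'authentication_error',
  403: 'permission_error',
  404: 'not_found_error',
  429: 'rate_limit_error',
  500: 'api_error',
  529: 'overloaded_error',
};

// Only transient failures are worth retrying; other 4xx codes indicate a
// bug in the request that a retry cannot fix.
function isRetryable(status: number): boolean {
  return status === 429 || status === 500 || status === 529;
}
```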

---

## Known Issues Prevention

This skill prevents **16** documented issues:

### Issue #1: Rate Limit 429 Errors Without Backoff
**Error**: `429 Too Many Requests: Number of request tokens has exceeded your per-minute rate limit`
**Source**: https://docs.claude.com/en/api/errors
**Why It Happens**: Exceeding RPM, TPM, or daily token limits
**Prevention**: Implement exponential backoff with `retry-after` header respect

### Issue #2: Streaming SSE Parsing Errors
**Error**: Incomplete chunks, malformed SSE events
**Source**: Common SDK issue (GitHub #323)
**Why It Happens**: Network interruptions, improper event parsing
**Prevention**: Use SDK stream helpers, implement error event listeners

### Issue #3: Prompt Caching Not Activating
**Error**: High costs despite cache_control blocks
**Source**: https://platform.claude.com/docs/en/build-with-claude/prompt-caching
**Why It Happens**: `cache_control` placed incorrectly (must be at END)
**Prevention**: Always place `cache_control` on LAST block of cacheable content

### Issue #4: Tool Use Response Format Errors
**Error**: `invalid_request_error: tools[0].input_schema is invalid`
**Source**: API validation errors
**Why It Happens**: Invalid JSON Schema, missing required fields
**Prevention**: Validate schemas with JSON Schema validator, test thoroughly
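A dedicated validator such as Ajv works for thorough checks; as a dependency-free sketch (the function and its rules are ours), the most common `input_schema` mistakes can be caught locally before the API rejects them with a 400:

```typescript
// Illustrative pre-flight check for tool input_schema objects (ours, not an
// SDK API). Returns a list of problems; empty means the basics look sound.
function checkInputSchema(schema: any): string[] {
  const problems: string[] = [];
  if (schema.type !== 'object') {
    problems.push("top-level type must be 'object'");
  }
  const props = schema.properties ?? {};
  for (const key of schema.required ?? []) {
    if (!(key in props)) {
      problems.push(`required key '${key}' missing from properties`);
    }
  }
  if (schema.additionalProperties !== false) {
    problems.push('set additionalProperties: false (needed for strict mode)');
  }
  return problems;
}
```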

### Issue #5: Vision Image Format Issues
**Error**: `invalid_request_error: image source must be base64 or url`
**Source**: API documentation
**Why It Happens**: Incorrect encoding, unsupported formats
**Prevention**: Validate format (JPEG/PNG/WebP/GIF), proper base64 encoding

### Issue #6: Token Counting Mismatches for Billing
**Error**: Unexpected high costs, context window exceeded
**Source**: Token counting differences
**Why It Happens**: Not accounting for special tokens, formatting
**Prevention**: Use official token counter, monitor usage headers
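The SDK's count-tokens endpoint pairs naturally with a simple budget check before sending. The guard below is illustrative (the function name and defaults are ours); the `countTokens` call is sketched in a comment because it hits the network:

```typescript
// Illustrative guard: confirm the counted input plus the requested output
// fits inside the model's 200k context window before sending.
function withinContext(
  inputTokens: number,
  maxTokens: number,
  contextWindow = 200_000,
): boolean {
  return inputTokens + maxTokens <= contextWindow;
}

// Typical use with the SDK's count-tokens endpoint:
// const { input_tokens } = await anthropic.messages.countTokens({ model, messages });
// if (!withinContext(input_tokens, 4096)) { /* prune or summarize history */ }
```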

### Issue #7: System Prompt Ordering Issues
**Error**: System prompt ignored or overridden
**Source**: API behavior
**Why It Happens**: System prompt placed after messages array
**Prevention**: ALWAYS place system prompt before messages

### Issue #8: Context Window Exceeded (200k)
**Error**: `invalid_request_error: messages: too many tokens`
**Source**: Model limits
**Why It Happens**: Long conversations without pruning
**Prevention**: Implement message history pruning, use caching
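A minimal pruning sketch (the strategy and names are ours): keep only the most recent turns, and start the kept window on a `user` message so role alternation and `tool_use`/`tool_result` pairing stay valid.

```typescript
type Msg = { role: 'user' | 'assistant'; content: unknown };

// Illustrative history pruning: keep the last `keepLast` messages, then
// advance the window start to the next 'user' turn so the pruned history
// still begins with a valid user message.
function pruneHistory(messages: Msg[], keepLast = 20): Msg[] {
  if (messages.length <= keepLast) return messages;
  let start = messages.length - keepLast;
  while (start < messages.length && messages[start].role !== 'user') start++;
  return messages.slice(start);
}
```

Combine this with prompt caching on the stable prefix (system prompt, long documents) so pruning does not defeat cache hits.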

### Issue #9: Extended Thinking on Wrong Model
**Error**: No thinking blocks in response
**Source**: Model capabilities
**Why It Happens**: Using retired/deprecated models (3.5/3.7 Sonnet)
**Prevention**: Only use extended thinking with Claude Opus 4.5, Claude Sonnet 4.5, or Claude Opus 4

### Issue #10: API Key Exposure in Client Code
**Error**: CORS errors, security vulnerability
**Source**: Security best practices
**Why It Happens**: Making API calls from browser
**Prevention**: Server-side only, use environment variables

### Issue #11: Rate Limit Tier Confusion
**Error**: Lower limits than expected
**Source**: Account tier system
**Why It Happens**: Not understanding tier progression
**Prevention**: Check the Console for your current tier; limits scale automatically with usage

### Issue #12: Message Batches Beta Headers Missing
**Error**: `invalid_request_error: unknown parameter: batches`
**Source**: Beta API requirements
**Why It Happens**: Missing `anthropic-beta` header
**Prevention**: Include `anthropic-beta: message-batches-2024-09-24` header

### Issue #13: Stream Errors Not Catchable with .withResponse() (Fixed in v0.71.2)
**Error**: Unhandled promise rejection when using `messages.stream().withResponse()`
**Source**: [GitHub Issue #856](https://github.com/anthropics/anthropic-sdk-typescript/issues/856)
**Why It Happens**: SDK internal error handling prevented user catch blocks from working (pre-v0.71.2)
**Prevention**: Upgrade to v0.71.2+ or use event listeners instead

**Fixed in v0.71.2+**:
```typescript
try {
  const stream = await anthropic.messages.stream({
    model: 'claude-sonnet-4-5-20250929',
    max_tokens: 1024,
    messages: [{ role: 'user', content: 'Hello' }]
  }).withResponse();
} catch (error) {
  // Now properly catchable in v0.71.2+
  console.error('Stream error:', error);
}
```

**Workaround for pre-v0.71.2**:
```typescript
const stream = anthropic.messages.stream({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello' }]
});

stream.on('error', (error) => {
  console.error('Stream error:', error);
});
```

### Issue #14: MCP Tool Connections Cause 2-Minute Timeout
**Error**: `Connection error` / `499 Client disconnected` after ~121 seconds
**Source**: [GitHub Issue #842](https://github.com/anthropics/anthropic-sdk-typescript/issues/842)
**Why It Happens**: MCP server connection management conflicts with long-running requests, even when MCP tools are not actively used
**Prevention**: Use direct toolRunner instead of MCP for requests >2 minutes

**Symptoms**:
- Request works fine without MCP
- Fails at exactly ~121 seconds with MCP registered
- Dashboard shows: "Client disconnected (code 499)"
- Multiple users confirmed across streaming and non-streaming

**Workaround**:
```typescript
// Don't use MCP for long requests
const message = await anthropic.beta.messages.toolRunner({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 4096,
  messages: [{ role: 'user', content: 'Long task >2 min' }],
  tools: [customTools] // Direct tool definitions, not MCP
});
```

**Note**: This is a known limitation with no official fix. Consider architecture changes if long-running requests with tools are required.

### Issue #15: Structured Outputs Hallucination Risk
**Error**: Valid JSON format but incorrect/hallucinated content
**Source**: [Structured Outputs Docs](https://platform.claude.com/docs/en/build-with-claude/structured-outputs)
**Why It Happens**: Structured outputs guarantee format compliance, NOT accuracy
**Prevention**: Always validate semantic correctness, not just format

```typescript
const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Extract contact: John Doe' }],
  betas: ['structured-outputs-2025-11-13'],
  output_format: {
    type: 'json_schema',
    json_schema: contactSchema
  }
});

const contact = JSON.parse(message.content[0].text);

// ✅ Format is guaranteed valid
// ❌ Content may be hallucinated

// CRITICAL: Validate semantic correctness
if (!isValidEmail(contact.email)) {
  throw new Error('Hallucinated email detected');
}
if (contact.age < 0 || contact.age > 120) {
  throw new Error('Implausible age value');
}
```

### Issue #16: U+2028 Line Separator in Tool Results (Community-sourced)
**Error**: JSON parsing failures or silent errors when tool results contain U+2028
**Source**: [GitHub Issue #882](https://github.com/anthropics/anthropic-sdk-typescript/issues/882)
**Why It Happens**: U+2028 is valid in JSON but not in JavaScript string literals
**Prevention**: Sanitize tool results before passing to SDK

```typescript
function sanitizeToolResult(content: string): string {
  return content
    .replace(/\u2028/g, '\n') // LINE SEPARATOR → newline
    .replace(/\u2029/g, '\n'); // PARAGRAPH SEPARATOR → newline
}

const toolResult = {
  type: 'tool_result',
  tool_use_id: block.id,
  content: sanitizeToolResult(result)
};
```

---

## Official Documentation

- **Claude API**: https://platform.claude.com/docs/en/api
- **Messages API**: https://platform.claude.com/docs/en/api/messages
- **Structured Outputs**: https://platform.claude.com/docs/en/build-with-claude/structured-outputs
- **Prompt Caching**: https://platform.claude.com/docs/en/build-with-claude/prompt-caching
- **Tool Use**: https://platform.claude.com/docs/en/build-with-claude/tool-use
- **Vision**: https://platform.claude.com/docs/en/build-with-claude/vision
- **Rate Limits**: https://platform.claude.com/docs/en/api/rate-limits
- **Errors**: https://platform.claude.com/docs/en/api/errors
- **TypeScript SDK**: https://github.com/anthropics/anthropic-sdk-typescript
- **Context7 Library ID**: /anthropics/anthropic-sdk-typescript

---

## Package Versions

**Latest**: @anthropic-ai/sdk@0.71.2

```json
{
  "dependencies": {
    "@anthropic-ai/sdk": "^0.71.2"
  },
  "devDependencies": {
    "@types/node": "^20.0.0",
    "typescript": "^5.3.0",
    "zod": "^3.23.0"
  }
}
```

---

**Token Efficiency**:
- **Without skill**: ~8,000 tokens (basic setup, streaming, caching, tools, vision, errors)
- **With skill**: ~4,200 tokens (knowledge gaps + error prevention + critical patterns)
- **Savings**: ~48% (~3,800 tokens)

**Errors prevented**: 16 documented issues with exact solutions
**Key value**: Structured outputs (v0.69.0+), model deprecations (Oct 2025), prompt caching edge cases, streaming error patterns, rate limit retry logic, MCP timeout workarounds, hallucination validation

---

**Last verified**: 2026-01-20 | **Skill version**: 2.2.0 | **Changes**: Added 4 new issues from community research: streaming error handling (fixed in v0.71.2), MCP timeout workaround, structured outputs hallucination validation, U+2028 sanitization; expanded structured outputs section with performance characteristics and accuracy caveats; added AWS Bedrock caching limitation


---

## Skill Companion Files

> Additional files collected from the skill directory layout.

### references/api-reference.md

```markdown
# Claude API Reference

Quick reference for the Anthropic Messages API endpoints and parameters.

## Base URL

```
https://api.anthropic.com/v1
```

## Authentication

All requests require:
```http
x-api-key: YOUR_API_KEY
anthropic-version: 2023-06-01
content-type: application/json
```

## Endpoints

### POST /messages

Create a message with Claude.

**Request Body:**

```typescript
{
  model: string,              // Required: "claude-sonnet-4-5-20250929", etc.
  max_tokens: number,         // Required: Maximum tokens to generate (1-8192)
  messages: Message[],        // Required: Conversation history
  system?: string | SystemBlock[],  // Optional: System prompt
  temperature?: number,       // Optional: 0-1 (default: 1)
  top_p?: number,             // Optional: 0-1 (default: 1)
  top_k?: number,             // Optional: Sampling parameter
  stop_sequences?: string[],  // Optional: Stop generation at these sequences
  stream?: boolean,           // Optional: Enable streaming (default: false)
  tools?: Tool[],             // Optional: Available tools
  tool_choice?: ToolChoice,   // Optional: Tool selection strategy
  metadata?: Metadata         // Optional: Request metadata
}
```

**Message Format:**

```typescript
{
  role: "user" | "assistant",
  content: string | ContentBlock[]
}
```

**Content Block Types:**

```typescript
// Text
{
  type: "text",
  text: string,
  cache_control?: { type: "ephemeral" }  // For prompt caching
}

// Image
{
  type: "image",
  source: {
    type: "base64" | "url",
    media_type: "image/jpeg" | "image/png" | "image/webp" | "image/gif",
    data?: string,  // base64 encoded
    url?: string    // publicly accessible URL
  },
  cache_control?: { type: "ephemeral" }
}

// Tool use (assistant messages only)
{
  type: "tool_use",
  id: string,
  name: string,
  input: object
}

// Tool result (user messages only)
{
  type: "tool_result",
  tool_use_id: string,
  content: string | ContentBlock[],
  is_error?: boolean
}
```

**Response:**

```typescript
{
  id: string,
  type: "message",
  role: "assistant",
  content: ContentBlock[],
  model: string,
  stop_reason: "end_turn" | "max_tokens" | "stop_sequence" | "tool_use",
  stop_sequence?: string,
  usage: {
    input_tokens: number,
    output_tokens: number,
    cache_creation_input_tokens?: number,
    cache_read_input_tokens?: number
  }
}
```

## Model IDs

| Model | ID | Context Window | Cost (per MTok) |
|-------|-----|----------------|-----------------|
| Claude Opus 4.5 (Flagship) | claude-opus-4-5-20251101 | 200k tokens | $5/$25 |
| Claude Sonnet 4.5 | claude-sonnet-4-5-20250929 | 200k tokens | $3/$15 |
| Claude Opus 4 | claude-opus-4-20250514 | 200k tokens | $15/$75 |
| Claude Haiku 4.5 | claude-haiku-4-5-20250929 | 200k tokens | $1/$5 |

**Deprecated**: All Claude 3.x models (3.5 Sonnet, 3.7 Sonnet, 3.5 Haiku) are deprecated. Use Claude 4.x+ models.

## Tool Definition

```typescript
{
  name: string,           // Tool identifier
  description: string,    // What the tool does
  input_schema: {         // JSON Schema
    type: "object",
    properties: {
      [key: string]: {
        type: "string" | "number" | "boolean" | "array" | "object",
        description?: string,
        enum?: any[]
      }
    },
    required?: string[]
  }
}
```

## Streaming

Set `stream: true` in request. Returns Server-Sent Events (SSE):

**Event Types:**
- `message_start`: Message begins
- `content_block_start`: Content block begins
- `content_block_delta`: Text or JSON delta
- `content_block_stop`: Content block complete
- `message_delta`: Metadata update
- `message_stop`: Message complete
- `ping`: Keep-alive

**Event Format:**

```
event: message_start
data: {"type":"message_start","message":{...}}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"Hello"}}

event: message_stop
data: {"type":"message_stop"}
```

## Error Responses

```typescript
{
  type: "error",
  error: {
    type: string,  // Error type identifier
    message: string  // Human-readable description
  }
}
```

**Common Error Types:**
- `invalid_request_error` (400)
- `authentication_error` (401)
- `permission_error` (403)
- `not_found_error` (404)
- `rate_limit_error` (429)
- `api_error` (500)
- `overloaded_error` (529)

## Rate Limit Headers

Response includes:

```
anthropic-ratelimit-requests-limit: 50
anthropic-ratelimit-requests-remaining: 49
anthropic-ratelimit-requests-reset: 2025-10-25T12:00:00Z
anthropic-ratelimit-tokens-limit: 50000
anthropic-ratelimit-tokens-remaining: 49500
anthropic-ratelimit-tokens-reset: 2025-10-25T12:01:00Z
retry-after: 60  // Only on 429 errors
```

## SDK Installation

```bash
# TypeScript/JavaScript
npm install @anthropic-ai/sdk

# Python
pip install anthropic

# Java
# See https://github.com/anthropics/anthropic-sdk-java

# Go
go get github.com/anthropics/anthropic-sdk-go
```

## Official Documentation

- **API Reference**: https://docs.claude.com/en/api/messages
- **SDK Documentation**: https://github.com/anthropics/anthropic-sdk-typescript
- **Rate Limits**: https://docs.claude.com/en/api/rate-limits
- **Errors**: https://docs.claude.com/en/api/errors

```

### references/prompt-caching-guide.md

```markdown
# Prompt Caching Guide

Complete guide to using prompt caching for cost optimization.

## Overview

Prompt caching reduces costs by up to 90% and latency by up to 85% by caching frequently used context.

### Benefits

- **Cost Savings**: Cache reads = 10% of input token price
- **Latency Reduction**: 85% faster time to first token
- **Use Cases**: Long documents, codebases, system instructions, conversation history

### Pricing

| Operation | Cost (per MTok) | vs Regular Input |
|-----------|-----------------|------------------|
| Regular input | $3 | 100% |
| Cache write | $3.75 | 125% |
| Cache read | $0.30 | 10% |

**Example**: 100k tokens cached, used 10 times
- Without caching: 100k × $3/MTok × 10 = $3.00
- With caching: (100k × $3.75/MTok) + (100k × $0.30/MTok × 9) = $0.375 + $0.27 = $0.645
- **Savings: $2.355 (78.5%)**

## Requirements

### Minimum Cacheable Content

- **Claude Opus 4.5**: 1,024 tokens minimum
- **Claude Sonnet 4.5**: 1,024 tokens minimum
- **Claude Opus 4**: 1,024 tokens minimum
- **Claude Haiku 4.5**: 2,048 tokens minimum

### Cache Lifetime

- **Default**: 5 minutes
- **Extended**: 1 hour (configurable)
- Refreshes on each use

### Cache Matching

Cache hits require:
- ✅ **Identical content** (byte-for-byte)
- ✅ **Same position** in request
- ✅ **Within TTL** (5 min or 1 hour)

## Implementation

### Basic System Prompt Caching

```typescript
const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  system: [
    {
      type: 'text',
      text: LARGE_SYSTEM_INSTRUCTIONS, // >= 1024 tokens
      cache_control: { type: 'ephemeral' },
    },
  ],
  messages: [
    { role: 'user', content: 'Your question here' }
  ],
});
```

### Caching in User Messages

```typescript
const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'text',
          text: 'Analyze this document:',
        },
        {
          type: 'text',
          text: LARGE_DOCUMENT, // >= 1024 tokens
          cache_control: { type: 'ephemeral' },
        },
        {
          type: 'text',
          text: 'What are the main themes?',
        },
      ],
    },
  ],
});
```

### Multi-Turn Conversation Caching

```typescript
// Turn 1 - Creates cache
const response1 = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  system: [
    {
      type: 'text',
      text: SYSTEM_PROMPT,
      cache_control: { type: 'ephemeral' },
    },
  ],
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});

// Turn 2 - Hits cache (same system prompt)
const response2 = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  system: [
    {
      type: 'text',
      text: SYSTEM_PROMPT, // Identical - cache hit
      cache_control: { type: 'ephemeral' },
    },
  ],
  messages: [
    { role: 'user', content: 'Hello!' },
    { role: 'assistant', content: response1.content },  // pass the response blocks back as-is
    { role: 'user', content: 'Tell me more' },
  ],
});
```

### Caching Conversation History

```typescript
const messages = [
  { role: 'user', content: 'Message 1' },
  { role: 'assistant', content: 'Response 1' },
  { role: 'user', content: 'Message 2' },
  { role: 'assistant', content: 'Response 2' },
];

// Cache last assistant message
messages[messages.length - 1] = {
  role: 'assistant',
  content: [
    {
      type: 'text',
      text: 'Response 2',
      cache_control: { type: 'ephemeral' },
    },
  ],
};

messages.push({ role: 'user', content: 'Message 3' });

const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages,
});
```

## Best Practices

### ✅ Do

- Place `cache_control` on the **last block** of cacheable content
- Cache content >= 1024 tokens (Opus/Sonnet 4.x) or >= 2048 tokens (Haiku 4.5)
- Use caching for repeated context (system prompts, documents, code)
- Monitor cache usage in response headers
- Cache conversation history in long chats

### ❌ Don't

- Cache content below minimum token threshold
- Place `cache_control` in the middle of text
- Change cached content (breaks cache matching)
- Cache rarely used content (not cost-effective)
- Expect caching to work across different API keys

## Monitoring Cache Usage

```typescript
const response = await anthropic.messages.create({...});

console.log('Input tokens:', response.usage.input_tokens);
console.log('Cache creation:', response.usage.cache_creation_input_tokens);
console.log('Cache read:', response.usage.cache_read_input_tokens);
console.log('Output tokens:', response.usage.output_tokens);

// First request
// input_tokens: 1000
// cache_creation_input_tokens: 5000
// cache_read_input_tokens: 0

// Subsequent requests (within 5 min)
// input_tokens: 1000
// cache_creation_input_tokens: 0
// cache_read_input_tokens: 5000  // 90% cost savings!
```

## Common Patterns

### Pattern 1: Document Analysis Chatbot

```typescript
const document = fs.readFileSync('./document.txt', 'utf-8'); // 10k tokens

// All requests use same cached document
for (const question of questions) {
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-5-20250929',
    max_tokens: 1024,
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: 'Document:' },
          {
            type: 'text',
            text: document,
            cache_control: { type: 'ephemeral' },
          },
          { type: 'text', text: `Question: ${question}` },
        ],
      },
    ],
  });

  // First request: cache_creation_input_tokens: 10000
  // Subsequent: cache_read_input_tokens: 10000 (90% savings)
}
```

### Pattern 2: Code Review with Codebase Context

```typescript
const codebase = await loadCodebase(); // 50k tokens

const review = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 2048,
  system: [
    { type: 'text', text: 'You are a code reviewer.' },
    {
      type: 'text',
      text: `Codebase context:\n${codebase}`,
      cache_control: { type: 'ephemeral' },
    },
  ],
  messages: [
    { role: 'user', content: 'Review this PR: ...' }
  ],
});
```

### Pattern 3: Customer Support with Knowledge Base

```typescript
const knowledgeBase = await loadKB(); // 20k tokens

// Cache persists across all customer queries
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  system: [
    { type: 'text', text: 'You are a customer support agent.' },
    {
      type: 'text',
      text: knowledgeBase,
      cache_control: { type: 'ephemeral' },
    },
  ],
  messages: customerConversation,
});
```

## Troubleshooting

### Cache Not Activating

**Problem**: `cache_read_input_tokens` is 0

**Solutions**:
1. Ensure content >= 1024 tokens (or 2048 for Haiku)
2. Verify `cache_control` is on **last block**
3. Check content is byte-for-byte identical
4. Confirm requests within 5-minute window
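
The checks above reduce to reading the usage fields on each response; a tiny classifier makes the three states explicit:

```typescript
// Classify a response's cache behavior from its usage fields.
// 'read'  - cache hit (the cheap path)
// 'write' - cache created this request (125% of input price)
// 'none'  - no cache_control took effect at all
function classifyCache(usage: {
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}): 'read' | 'write' | 'none' {
  if ((usage.cache_read_input_tokens ?? 0) > 0) return 'read';
  if ((usage.cache_creation_input_tokens ?? 0) > 0) return 'write';
  return 'none';
}
```

Seeing `'none'` on every request means `cache_control` is mis-placed or the content is below the minimum token threshold; seeing `'write'` on every request means the content is changing between calls or the TTL keeps expiring.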

### Unexpected Cache Misses

**Problem**: Cache hits intermittently

**Solutions**:
1. Ensure content doesn't change (even whitespace)
2. Check TTL hasn't expired
3. Verify using same API key
4. Monitor cache headers in responses

### High Cache Creation Costs

**Problem**: Frequent `cache_creation_input_tokens`

**Solutions**:
1. Increase request frequency (use cache before expiry)
2. Consider if caching is cost-effective (need 2+ uses)
3. Extend cache TTL to 1 hour if supported

## Cost Calculator

```typescript
function calculateCachingSavings(
  cachedTokens: number,
  uncachedTokens: number,
  requestCount: number
): {
  withoutCaching: number;
  withCaching: number;
  savings: number;
  savingsPercent: number;
} {
  const inputCostPerMTok = 3;
  const cacheCostPerMTok = 3.75;
  const cacheReadCostPerMTok = 0.3;

  const withoutCaching = ((cachedTokens + uncachedTokens) / 1_000_000) *
    inputCostPerMTok * requestCount;

  const cacheWrite = (cachedTokens / 1_000_000) * cacheCostPerMTok;
  const cacheReads = (cachedTokens / 1_000_000) * cacheReadCostPerMTok * (requestCount - 1);
  const uncachedInput = (uncachedTokens / 1_000_000) * inputCostPerMTok * requestCount;
  const withCaching = cacheWrite + cacheReads + uncachedInput;

  const savings = withoutCaching - withCaching;
  const savingsPercent = (savings / withoutCaching) * 100;

  return { withoutCaching, withCaching, savings, savingsPercent };
}

// Example: 10k cached tokens, 1k uncached, 20 requests
const result = calculateCachingSavings(10000, 1000, 20);
console.log(`Savings: $${result.savings.toFixed(4)} (${result.savingsPercent.toFixed(1)}%)`);
```

## Official Documentation

- **Prompt Caching Guide**: https://docs.claude.com/en/docs/build-with-claude/prompt-caching
- **Pricing**: https://www.anthropic.com/pricing
- **API Reference**: https://docs.claude.com/en/api/messages

```

### references/rate-limits.md

```markdown
# Rate Limits Guide

Complete guide to Claude API rate limits and how to handle them.

## Overview

Claude API uses **token bucket algorithm** for rate limiting:
- Capacity continuously replenishes
- Three types: Requests per minute (RPM), Tokens per minute (TPM), Daily tokens
- Limits vary by account tier and model

## Rate Limit Tiers

| Tier | Requirements | Example Limits (Sonnet 4.5) |
|------|--------------|------------------------------|
| Tier 1 | New account | 50 RPM, 40k TPM |
| Tier 2 | $10 spend | 1000 RPM, 100k TPM |
| Tier 3 | $50 spend | 2000 RPM, 200k TPM |
| Tier 4 | $500 spend | 4000 RPM, 400k TPM |

**Note**: Limits vary by model. Check Console for exact limits. Opus 4.5 may have different limits than Sonnet 4.5.

## Response Headers

Every API response includes:

```
anthropic-ratelimit-requests-limit: 50
anthropic-ratelimit-requests-remaining: 49
anthropic-ratelimit-requests-reset: 2025-10-25T12:00:00Z
anthropic-ratelimit-tokens-limit: 50000
anthropic-ratelimit-tokens-remaining: 49500
anthropic-ratelimit-tokens-reset: 2025-10-25T12:01:00Z
```

On 429 errors:
```
retry-after: 60  // Seconds until retry allowed
```

## Handling Rate Limits

### Basic Exponential Backoff

```typescript
async function withRetry(requestFn, maxRetries = 3) {
  let lastError;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await requestFn();
    } catch (error) {
      if (error.status !== 429) throw error;
      lastError = error;
      const delay = 1000 * Math.pow(2, attempt);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw lastError; // all retries exhausted
}
```

### Respecting retry-after Header

```typescript
const retryAfter = error.response?.headers?.['retry-after'];
const delay = retryAfter ? parseInt(retryAfter, 10) * 1000 : baseDelay;
```

## Best Practices

1. **Monitor headers** - Check remaining requests/tokens
2. **Implement backoff** - Exponential delay on 429
3. **Respect retry-after** - Use provided wait time
4. **Batch requests** - Group when possible
5. **Use caching** - Reduce duplicate requests
6. **Upgrade tier** - Scale with usage
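
The practices above can be combined into a small client-side limiter that mirrors the server's token-bucket behavior — refuse (or queue) work locally instead of eating a 429. The capacity and refill numbers below are hypothetical; read them from your tier's actual RPM/TPM limits:

```typescript
// Client-side token bucket: refills continuously, refuses work when empty.
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(
    private capacity: number,             // e.g. your RPM limit
    private refillPerMs: number,          // capacity / 60_000 for a per-minute limit
    private now: () => number = Date.now, // injectable clock for testing
  ) {
    this.tokens = capacity;
    this.last = this.now();
  }

  // Returns true if `n` units were available and consumed.
  tryRemove(n = 1): boolean {
    const t = this.now();
    this.tokens = Math.min(this.capacity, this.tokens + (t - this.last) * this.refillPerMs);
    this.last = t;
    if (this.tokens >= n) {
      this.tokens -= n;
      return true;
    }
    return false;
  }
}
```

Usage sketch: `const rpm = new TokenBucket(50, 50 / 60_000);` then gate each API call with `if (rpm.tryRemove()) { ... }` and back off otherwise.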

## Official Docs

https://docs.claude.com/en/api/rate-limits

```

### references/tool-use-patterns.md

```markdown
# Tool Use Patterns

Common patterns for using tools (function calling) with Claude API.

## Basic Pattern

```typescript
const tools = [{
  name: 'get_weather',
  description: 'Get current weather',
  input_schema: {
    type: 'object',
    properties: {
      location: { type: 'string' }
    },
    required: ['location']
  }
}];

// 1. Send request with tools
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  tools,
  messages: [{ role: 'user', content: 'Weather in NYC?' }]
});

// 2. Check if Claude wants to use tools
if (response.stop_reason === 'tool_use') {
  for (const block of response.content) {
    if (block.type === 'tool_use') {
      // 3. Execute tool
      const result = await executeToolFunction(block.name, block.input);

      // 4. Return result
      const toolResult = {
        type: 'tool_result',
        tool_use_id: block.id,
        content: JSON.stringify(result)
      };
    }
  }
}
```

## Tool Execution Loop

```typescript
async function chatWithTools(userMessage) {
  const messages = [{ role: 'user', content: userMessage }];

  while (true) {
    const response = await anthropic.messages.create({
      model: 'claude-sonnet-4-5-20250929',
      max_tokens: 1024,
      tools,
      messages,
    });

    messages.push({ role: 'assistant', content: response.content });

    if (response.stop_reason === 'tool_use') {
      // Execute tools and continue
      const toolResults = await executeAllTools(response.content);
      messages.push({ role: 'user', content: toolResults });
    } else {
      // Final response
      return response.content.find(b => b.type === 'text')?.text;
    }
  }
}
```

## With Zod Validation

```typescript
import { betaZodTool } from '@anthropic-ai/sdk/helpers/zod';
import { z } from 'zod';

const weatherTool = betaZodTool({
  name: 'get_weather',
  inputSchema: z.object({
    location: z.string(),
    unit: z.enum(['celsius', 'fahrenheit']).optional(),
  }),
  description: 'Get weather',
  run: async (input) => {
    return `Weather in ${input.location}: 72°F`;
  },
});

// Automatic execution
const finalMessage = await anthropic.beta.messages.toolRunner({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1000,
  messages: [{ role: 'user', content: 'Weather in SF?' }],
  tools: [weatherTool],
});
```

## Error Handling in Tools

```typescript
{
  type: 'tool_result',
  tool_use_id: block.id,
  content: 'Error: API unavailable',
  is_error: true  // Mark as error
}
```
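
A sketch of wrapping tool execution so that exceptions are reported back to the model as `is_error` results instead of crashing the loop. The `execute` callback stands in for your application's tool dispatcher and is an assumption, not SDK API:

```typescript
type ToolUseBlock = { type: 'tool_use'; id: string; name: string; input: unknown };
type ToolResult = {
  type: 'tool_result';
  tool_use_id: string;
  content: string;
  is_error?: boolean;
};

// Run one tool; on failure, return an is_error result so Claude can recover.
async function runTool(
  block: ToolUseBlock,
  execute: (name: string, input: unknown) => Promise<unknown>,
): Promise<ToolResult> {
  try {
    const result = await execute(block.name, block.input);
    return { type: 'tool_result', tool_use_id: block.id, content: JSON.stringify(result) };
  } catch (err) {
    return {
      type: 'tool_result',
      tool_use_id: block.id,
      content: `Error: ${err instanceof Error ? err.message : String(err)}`,
      is_error: true,
    };
  }
}
```

Claude typically apologizes or retries with different input when it sees an `is_error` result, which is far better behavior than an aborted loop.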

## Official Docs

https://docs.claude.com/en/docs/build-with-claude/tool-use

```

### references/top-errors.md

```markdown
# Top 12 Common Errors and Solutions

Complete reference for troubleshooting Claude API errors.

---

## Error #1: Rate Limit 429 - Too Many Requests

**Error Message:**
```
429 Too Many Requests: Number of request tokens has exceeded your per-minute rate limit
```

**Source**: https://docs.claude.com/en/api/errors

**Why It Happens:**
- Exceeded requests per minute (RPM)
- Exceeded tokens per minute (TPM)
- Exceeded daily token quota

**Solution:**
```typescript
async function handleRateLimit(requestFn, maxRetries = 3) {
  let lastError;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await requestFn();
    } catch (error) {
      if (error.status !== 429) throw error;
      lastError = error;
      const retryAfter = error.response?.headers?.['retry-after'];
      const delay = retryAfter ? parseInt(retryAfter, 10) * 1000 : 1000 * Math.pow(2, attempt);
      console.log(`Retrying in ${delay}ms...`);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw lastError; // all retries exhausted
}
```

**Prevention:**
- Implement exponential backoff
- Respect `retry-after` header
- Monitor rate limit headers
- Upgrade account tier for higher limits

---

## Error #2: Streaming SSE Parsing Errors

**Error Message:**
```
Incomplete chunks, malformed events, connection dropped
```

**Source**: Community reports, SDK issues

**Why It Happens:**
- Network interruptions
- Improper SSE event parsing
- Errors occur AFTER initial 200 response

**Solution:**
```typescript
const stream = anthropic.messages.stream({...});

stream
  .on('error', (error) => {
    console.error('Stream error:', error);
    // Implement retry or fallback
  })
  .on('abort', (error) => {
    console.warn('Stream aborted');
  })
  .on('end', () => {
    console.log('Stream completed');
  });

await stream.finalMessage();
```

**Prevention:**
- Always implement error event listeners
- Handle stream abortion
- Use SDK helpers (don't parse SSE manually)
- Implement reconnection logic
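
SSE streams cannot be resumed mid-response, so "reconnection" means re-issuing the whole request (any partially streamed output is discarded). A generic retry wrapper around the stream call is usually enough; a minimal sketch:

```typescript
// Retry an async operation with exponential backoff; rethrows the last
// error once maxRetries is exhausted.
async function withStreamRetry<T>(
  run: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await run();
    } catch (err) {
      lastErr = err;
      await new Promise(r => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastErr;
}
```

Wrap the whole stream consumption, e.g. `await withStreamRetry(() => anthropic.messages.stream({...}).finalMessage())`, so a drop partway through restarts cleanly rather than leaving a half-built message.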

---

## Error #3: Prompt Caching Not Activating

**Error Message:**
```
High costs despite cache_control blocks
cache_read_input_tokens: 0
```

**Source**: https://docs.claude.com/en/docs/build-with-claude/prompt-caching

**Why It Happens:**
- `cache_control` not on last block
- Content below minimum tokens (1024/2048)
- Content changed (breaks cache match)
- Outside 5-minute TTL

**Solution:**
```typescript
// ❌ Wrong - cache_control not at end
{
  type: 'text',
  text: DOCUMENT,
  cache_control: { type: 'ephemeral' },  // Wrong position
},
{
  type: 'text',
  text: 'Additional text',
}

// ✅ Correct - cache_control at end
{
  type: 'text',
  text: DOCUMENT + '\n\nAdditional text',
  cache_control: { type: 'ephemeral' },  // Correct position
}
```

**Prevention:**
- Place `cache_control` on LAST block
- Ensure content >= 1024 tokens
- Keep cached content identical
- Monitor `cache_read_input_tokens`

---

## Error #4: Tool Use Response Format Errors

**Error Message:**
```
invalid_request_error: tools[0].input_schema is invalid
```

**Source**: API validation

**Why It Happens:**
- Invalid JSON Schema
- Missing required fields
- Incorrect tool_use_id in tool_result

**Solution:**
```typescript
// ✅ Valid tool schema
{
  name: 'get_weather',
  description: 'Get current weather',
  input_schema: {
    type: 'object',           // Must be 'object'
    properties: {
      location: {
        type: 'string',       // Valid JSON Schema types
        description: 'City'   // Optional but recommended
      }
    },
    required: ['location']    // List required fields
  }
}

// ✅ Valid tool result
{
  type: 'tool_result',
  tool_use_id: block.id,      // Must match tool_use id
  content: JSON.stringify(result)  // Convert to string
}
```

**Prevention:**
- Validate schemas with JSON Schema validator
- Match `tool_use_id` exactly
- Stringify tool results
- Test thoroughly before production
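
A minimal pre-flight sanity check — not a full JSON Schema validator, just the constraints this error message is most often about:

```typescript
// Returns a list of problems; empty array means the schema passed the
// basic checks (top-level type, properties shape, required/properties match).
function checkToolSchema(tool: { name: string; input_schema: any }): string[] {
  const problems: string[] = [];
  const s = tool.input_schema;
  if (!s || s.type !== 'object') {
    problems.push('input_schema.type must be "object"');
  }
  if (s?.properties && typeof s.properties !== 'object') {
    problems.push('properties must be an object');
  }
  for (const key of s?.required ?? []) {
    if (!s?.properties?.[key]) {
      problems.push(`required field "${key}" missing from properties`);
    }
  }
  return problems;
}
```

For full coverage, run schemas through a real JSON Schema validator in CI; this sketch only catches the cheap mistakes before a request is spent.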

---

## Error #5: Vision Image Format Issues

**Error Message:**
```
invalid_request_error: image source must be base64 or url
```

**Source**: API documentation

**Why It Happens:**
- Unsupported image format
- Incorrect base64 encoding
- Invalid media_type

**Solution:**
```typescript
import fs from 'fs';

const imageData = fs.readFileSync('./image.jpg');
const base64Image = imageData.toString('base64');

// ✅ Correct format
{
  type: 'image',
  source: {
    type: 'base64',
    media_type: 'image/jpeg',  // Must match actual format
    data: base64Image          // Pure base64 (no data URI prefix)
  }
}
```

**Supported Formats:**
- image/jpeg
- image/png
- image/webp
- image/gif

**Prevention:**
- Validate format before encoding
- Use correct media_type
- Remove data URI prefix if present
- Keep images under 5MB
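
A small helper for the data-URI pitfall: it strips a `data:image/...;base64,` prefix when present and reports the embedded media type, so you can pass pure base64 plus a matching `media_type` to the API:

```typescript
// Split a possibly-prefixed base64 string into media type + raw payload.
// Plain base64 input is passed through unchanged (media_type stays undefined).
function toBase64Payload(data: string): { media_type?: string; data: string } {
  const m = data.match(/^data:(image\/[a-z+]+);base64,(.*)$/s);
  if (m) {
    return { media_type: m[1], data: m[2] };
  }
  return { data };
}
```

Use the returned `media_type` when available, and fall back to detecting it from the file extension otherwise.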

---

## Error #6: Token Counting Mismatches

**Error Message:**
```
invalid_request_error: messages: too many tokens
```

**Source**: Token counting differences

**Why It Happens:**
- Not accounting for special tokens
- Formatting adds hidden tokens
- Context window exceeded

**Solution:**
```typescript
// Monitor token usage
const response = await anthropic.messages.create({...});

console.log('Input tokens:', response.usage.input_tokens);
console.log('Output tokens:', response.usage.output_tokens);
console.log('Total:', response.usage.input_tokens + response.usage.output_tokens);

// Warn when approaching the context window
const contextWindow = 200000; // Claude Opus 4.5 / Sonnet 4.5
const totalTokens = response.usage.input_tokens + response.usage.output_tokens;
if (totalTokens > contextWindow * 0.9) {
  console.warn('Approaching context limit');
}
```

**Prevention:**
- Use official token counter
- Monitor usage headers
- Implement message pruning
- Use prompt caching for long context

---

## Error #7: System Prompt Format Issues

**Error Message:**
```
System prompt ignored or overridden
```

**Source**: API behavior

**Why It Happens:**
- System instructions sent as a `role: 'system'` message (the Messages API has no system role)
- System instructions buried inside a user message instead of the top-level `system` parameter

**Solution:**
```typescript
// ❌ Wrong - the Messages API rejects a 'system' role
const message = await anthropic.messages.create({
  messages: [
    { role: 'system', content: 'You are helpful' },  // invalid_request_error
    { role: 'user', content: 'Hi' },
  ],
});

// ✅ Correct - use the top-level system parameter
const message = await anthropic.messages.create({
  system: 'You are helpful',
  messages: [
    { role: 'user', content: 'Hi' },
  ],
});
```

**Prevention:**
- Use the top-level `system` parameter, never a `system` message role
- Use system prompt for behavior instructions
- Test system prompt effectiveness

---

## Error #8: Context Window Exceeded

**Error Message:**
```
invalid_request_error: messages: too many tokens (210000 > 200000)
```

**Source**: Model limits

**Why It Happens:**
- Long conversations
- Large documents
- Not pruning message history

**Solution:**
```typescript
function pruneMessages(messages, maxTokens = 150000) {
  // Keep most recent messages
  let totalTokens = 0;
  const prunedMessages = [];

  for (let i = messages.length - 1; i >= 0; i--) {
    const msgTokens = estimateTokens(messages[i].content);
    if (totalTokens + msgTokens > maxTokens) break;
    prunedMessages.unshift(messages[i]);
    totalTokens += msgTokens;
  }

  return prunedMessages;
}
```
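
`estimateTokens` above is an assumed helper. A rough character-count heuristic works for pruning decisions (about 4 characters per token for English text — use the official count-tokens endpoint when precision matters):

```typescript
// Rough token estimate: ~4 characters per token for English text.
// Handles both string content and arrays of text blocks.
function estimateTokens(content: string | { type: string; text?: string }[]): number {
  const text = typeof content === 'string'
    ? content
    : content.map(b => b.text ?? '').join('');
  return Math.ceil(text.length / 4);
}
```

Because this undercounts code and non-English text, keep the pruning budget (150k above) well below the real 200k window.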

**Prevention:**
- Implement message history pruning
- Use summarization for old messages
- Use prompt caching for long context

---

## Error #9: Extended Thinking on Wrong Model

**Error Message:**
```
No thinking blocks in response
```

**Source**: Model capabilities

**Why It Happens:**
- Using deprecated Claude 3.x models
- Using Haiku (not supported for extended thinking)

**Solution:**
```typescript
// ❌ Deprecated models - don't use
model: 'claude-3-5-sonnet-20240620'  // Deprecated
model: 'claude-3-7-sonnet-20250219'  // Deprecated

// ✅ Correct models for extended thinking
model: 'claude-opus-4-5-20251101'    // Flagship - best for extended thinking
model: 'claude-sonnet-4-5-20250929'  // Has extended thinking
model: 'claude-opus-4-20250514'      // Has extended thinking
```

**Prevention:**
- Use Claude 4.x models only
- Opus 4.5 is best for complex reasoning
- Document model requirements

---

## Error #10: API Key Exposure in Client Code

**Error Message:**
```
CORS errors, security vulnerability
```

**Source**: Security best practices

**Why It Happens:**
- Making API calls from browser
- API key in client-side code

**Solution:**
```typescript
// ❌ Never do this
const anthropic = new Anthropic({
  apiKey: 'sk-ant-...',  // Exposed in browser!
});

// ✅ Use server-side endpoint
async function callClaude(messages) {
  const response = await fetch('/api/chat', {
    method: 'POST',
    body: JSON.stringify({ messages }),
  });
  return response.json();
}
```

**Prevention:**
- Server-side only
- Use environment variables
- Implement authentication
- Never expose API key
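
A minimal server-side relay using Node's built-in `http` module. The `/api/chat` route and response shape are illustrative; production code needs authentication, input validation, and the real SDK call where marked:

```typescript
import http from 'node:http';

// Read the key from the environment - it never ships to the browser.
const API_KEY = process.env.ANTHROPIC_API_KEY ?? '';

const server = http.createServer((req, res) => {
  if (req.method !== 'POST' || req.url !== '/api/chat') {
    res.writeHead(404).end();
    return;
  }
  let body = '';
  req.on('data', chunk => (body += chunk));
  req.on('end', () => {
    const { messages } = JSON.parse(body);
    // Here: call anthropic.messages.create({ ... }) using API_KEY server-side,
    // then forward the model's reply to the client.
    res.writeHead(200, { 'content-type': 'application/json' });
    res.end(JSON.stringify({ received: messages.length }));
  });
});
```

The browser only ever talks to `/api/chat`; the key lives in the server process's environment.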

---

## Error #11: Rate Limit Tier Confusion

**Error Message:**
```
Lower limits than expected
```

**Source**: Account tier system

**Why It Happens:**
- Not understanding tier progression
- Expecting higher limits without usage history

**Solution:**
- Check current tier in Console
- Tiers auto-scale with usage ($10, $50, $500 spend)
- Monitor rate limit headers
- Contact support for custom limits

**Tier Progression:**
- Tier 1: 50 RPM, 40k TPM
- Tier 2: 1000 RPM, 100k TPM ($10 spend)
- Tier 3: 2000 RPM, 200k TPM ($50 spend)
- Tier 4: 4000 RPM, 400k TPM ($500 spend)

**Prevention:**
- Review tier requirements
- Plan for gradual scale-up
- Implement proper rate limiting

---

## Error #12: Message Batches Beta Headers Missing

**Error Message:**
```
invalid_request_error: unknown parameter: batches
```

**Source**: Beta API requirements

**Why It Happens:**
- Missing `anthropic-beta` header
- Using wrong endpoint

**Solution:**
```typescript
const response = await fetch('https://api.anthropic.com/v1/messages/batches', {
  method: 'POST',
  headers: {
    'x-api-key': API_KEY,
    'anthropic-version': '2023-06-01',
    'anthropic-beta': 'message-batches-2024-09-24',  // Required!
    'content-type': 'application/json',
  },
  body: JSON.stringify({...}),
});
```

**Prevention:**
- Include beta headers for beta features
- Check official docs for header requirements
- Test beta features in development first

---

## Quick Diagnosis Checklist

When encountering errors:

1. ✅ Check error status code (400, 401, 429, 500, etc.)
2. ✅ Read error message carefully
3. ✅ Verify API key is valid and in environment variable
4. ✅ Confirm model ID is correct
5. ✅ Check request format matches API spec
6. ✅ Monitor rate limit headers
7. ✅ Review recent code changes
8. ✅ Test with minimal example
9. ✅ Check official docs for breaking changes
10. ✅ Search GitHub issues for similar problems

---

## Getting Help

**Official Resources:**
- **Errors Reference**: https://docs.claude.com/en/api/errors
- **API Documentation**: https://docs.claude.com/en/api
- **Support**: https://support.claude.com/

**Community:**
- **GitHub Issues**: https://github.com/anthropics/anthropic-sdk-typescript/issues
- **Developer Forum**: https://support.claude.com/

**This Skill:**
- Check other reference files for detailed guides
- Review templates for working examples
- Verify setup checklist in SKILL.md

```

### references/vision-capabilities.md

```markdown
# Vision Capabilities

Guide to using Claude's vision capabilities for image understanding.

## Supported Formats

- **JPEG** (image/jpeg)
- **PNG** (image/png)
- **WebP** (image/webp)
- **GIF** (image/gif) - non-animated only

**Max size**: 5MB per image

## Basic Usage

```typescript
import fs from 'fs';

const imageData = fs.readFileSync('./image.jpg').toString('base64');

const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{
    role: 'user',
    content: [
      {
        type: 'image',
        source: {
          type: 'base64',
          media_type: 'image/jpeg',
          data: imageData
        }
      },
      {
        type: 'text',
        text: 'What is in this image?'
      }
    ]
  }]
});
```

## Multiple Images

```typescript
content: [
  { type: 'text', text: 'Compare these images:' },
  { type: 'image', source: { type: 'base64', media_type: 'image/jpeg', data: img1 } },
  { type: 'image', source: { type: 'base64', media_type: 'image/png', data: img2 } },
  { type: 'text', text: 'What are the differences?' }
]
```

## With Tools

```typescript
const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  tools: [searchProductTool],
  messages: [{
    role: 'user',
    content: [
      { type: 'image', source: {...} },
      { type: 'text', text: 'Find similar products' }
    ]
  }]
});
```

## With Prompt Caching

```typescript
{
  type: 'image',
  source: {
    type: 'base64',
    media_type: 'image/jpeg',
    data: imageData
  },
  cache_control: { type: 'ephemeral' }  // Cache image
}
```

## Validation

```typescript
function validateImage(path: string) {
  const stats = fs.statSync(path);
  const sizeMB = stats.size / (1024 * 1024);

  if (sizeMB > 5) {
    throw new Error('Image exceeds 5MB');
  }

  const ext = path.split('.').pop()?.toLowerCase();
  if (!ext || !['jpg', 'jpeg', 'png', 'webp', 'gif'].includes(ext)) {
    throw new Error('Unsupported format');
  }
  }

  return true;
}
```

## Official Docs

https://docs.claude.com/en/docs/build-with-claude/vision

```

### scripts/check-versions.sh

```bash
#!/bin/bash

# Check Claude API package versions
# Usage: ./scripts/check-versions.sh

echo "=== Claude API Package Version Checker ==="
echo ""

# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
NC='\033[0m' # No Color

# Check if npm is installed
if ! command -v npm &> /dev/null; then
    echo -e "${RED}Error: npm is not installed${NC}"
    exit 1
fi

# Check @anthropic-ai/sdk
echo "Checking @anthropic-ai/sdk..."
CURRENT_VERSION=$(npm view @anthropic-ai/sdk version 2>/dev/null)

if [ -z "$CURRENT_VERSION" ]; then
    echo -e "${RED}Error: Could not fetch version info${NC}"
else
    echo -e "${GREEN}Latest version: $CURRENT_VERSION${NC}"

    # Expected version from skill
    EXPECTED="0.67.0"

    if [ "$CURRENT_VERSION" != "$EXPECTED" ]; then
        echo -e "${YELLOW}Note: Skill was verified with version $EXPECTED${NC}"
        echo -e "${YELLOW}Latest version is $CURRENT_VERSION${NC}"
        echo ""
        echo "To update skill documentation, run:"
        echo "  npm view @anthropic-ai/sdk version"
    else
        echo -e "${GREEN}✓ Version matches skill documentation${NC}"
    fi
fi

echo ""

# Check other related packages
echo "Checking optional dependencies..."
echo ""

packages=("zod" "@types/node" "typescript")

for package in "${packages[@]}"; do
    echo "- $package:"
    VERSION=$(npm view "$package" version 2>/dev/null)
    if [ -z "$VERSION" ]; then
        echo -e "  ${YELLOW}Could not fetch version${NC}"
    else
        echo -e "  ${GREEN}Latest: $VERSION${NC}"
    fi
done

echo ""

# Check if package.json exists locally
if [ -f "package.json" ]; then
    echo "Checking local package.json..."
    echo ""

    if command -v jq &> /dev/null; then
        # Use jq if available
        ANTHROPIC_VERSION=$(jq -r '.dependencies."@anthropic-ai/sdk" // .devDependencies."@anthropic-ai/sdk"' package.json 2>/dev/null)

        if [ "$ANTHROPIC_VERSION" != "null" ] && [ ! -z "$ANTHROPIC_VERSION" ]; then
            echo "@anthropic-ai/sdk: $ANTHROPIC_VERSION"
        fi
    else
        # Fallback to grep
        grep -E '"@anthropic-ai/sdk"' package.json
    fi

    echo ""
fi

# Check for breaking changes
echo "=== Checking for breaking changes ==="
echo ""
echo "Official changelog:"
echo "https://github.com/anthropics/anthropic-sdk-typescript/releases"
echo ""

# Check npm for recent updates
echo "Recent versions:"
npm view @anthropic-ai/sdk versions --json | tail -10

echo ""
echo "=== Version Check Complete ==="
echo ""
echo "To update your local installation:"
echo "  npm install @anthropic-ai/sdk@latest"
echo ""
echo "To check what would be installed:"
echo "  npm outdated"

```