
Cloudflare Workers & Edge AI Development

Build ultra-low-latency edge computing applications with Cloudflare Workers, Workers AI for LLM inference, V8 isolates, Durable Objects, and serverless patterns deployed across 330+ data centers worldwide

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars
11
Hot score
85
Updated
March 20, 2026
Overall rating
C (1.9)
Composite score
1.9
Best-practice grade
C (57.6)

Install command

npx @skill-hub/cli install bobmatnyc-mcp-skillset-cloudflare-edge-ai
Tags: cloudflare, edge-computing, serverless, workers, workers-ai, v8-isolates, durable-objects, edge-ai, low-latency, wrangler

Repository

bobmatnyc/mcp-skillset

Skill path: docs/skill-templates/cloudflare-edge-ai

Open repository

Best for

Primary workflow: Analyze Data & AI.

Technical facets: DevOps, Data / AI.

Target audience: everyone.

License: MIT.

Original source

Catalog source: SkillHub Club.

Repository owner: bobmatnyc.

This is a mirrored public skill entry. Review the repository before installing it into production workflows.

What it helps with

  • Install Cloudflare Workers & Edge AI Development into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/bobmatnyc/mcp-skillset before adding Cloudflare Workers & Edge AI Development to shared team environments
  • Use Cloudflare Workers & Edge AI Development for cloud & serverless workflows

Works across

Claude Code, Codex CLI, Gemini CLI, OpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: Cloudflare Workers & Edge AI Development
skill_id: cloudflare-edge-ai
version: 1.0.0
description: Build ultra-low-latency edge computing applications with Cloudflare Workers, Workers AI for LLM inference, V8 isolates, Durable Objects, and serverless patterns deployed across 330+ data centers worldwide
category: Cloud & Serverless
tags:
  - cloudflare
  - edge-computing
  - serverless
  - workers
  - workers-ai
  - v8-isolates
  - durable-objects
  - edge-ai
  - low-latency
  - wrangler
author: mcp-skillset
license: MIT
created: 2025-11-25
last_updated: 2025-11-25
toolchain:
  - Cloudflare Workers
  - Wrangler CLI
  - Workers AI
  - TypeScript
frameworks:
  - Cloudflare Workers
  - Hono (lightweight framework)
  - Vectorize (vector DB)
related_skills:
  - fastapi-web-development
  - terraform-infrastructure
  - web3-blockchain
---

# Cloudflare Workers & Edge AI Development

## Overview

Master Cloudflare Workers, a serverless platform with sub-millisecond cold starts, and Workers AI for running LLMs at the edge. Deploy across 330+ data centers globally for ultra-low-latency applications that run close to your users.

## When to Use This Skill

- Building globally distributed APIs with <50ms latency
- Running AI/LLM inference at the edge (Workers AI)
- Creating serverless backends without managing infrastructure
- Implementing edge middleware (auth, rate limiting, A/B testing)
- Building real-time collaborative applications (Durable Objects)
- Processing high-traffic workloads cost-effectively
- Deploying static sites with dynamic edge logic

## Core Principles

### 1. V8 Isolates (Not Containers)

**Workers run in V8 isolates - much faster than containers**

```typescript
// Basic Worker structure
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    // ✅ <1ms cold start (V8 isolate, not container!)
    // ✅ Runs in 330+ locations automatically
    // ✅ 0ms warm start (keeps isolate alive)

    return new Response("Hello from the edge!", {
      headers: { "Content-Type": "text/plain" }
    });
  }
};

// WRONG: Don't use Node.js APIs (not available)
// import fs from 'fs';  // ❌ No filesystem
// process.env.VAR;      // ❌ No process object

// CORRECT: Use Workers APIs (inside a handler with access to `env`)
const value = await env.MY_KV_NAMESPACE.get("key");  // ✅ KV storage
const response = await fetch("https://api.example.com");  // ✅ fetch API
```
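
Every example in this skill receives an `env` parameter typed as `Env`. A minimal sketch of that interface, assuming the binding names used throughout (`MY_KV`, `RATE_LIMITER`, `VECTORIZE`, `AI`) and types from `@cloudflare/workers-types`:

```typescript
// Illustrative Env interface for the examples in this skill. Binding names
// must match your wrangler.toml; types come from @cloudflare/workers-types.
interface Env {
  AI: Ai;                               // Workers AI binding
  MY_KV: KVNamespace;                   // KV namespace binding
  RATE_LIMITER: DurableObjectNamespace; // Durable Object binding
  VECTORIZE: VectorizeIndex;            // Vectorize index binding
  API_URL: string;                      // plain variable from [vars]
  API_KEY: string;                      // secret set via `wrangler secret put`
}
```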

**Key Differences from Node.js/Lambda**:
- ❌ No filesystem access
- ❌ No Node.js built-ins (fs, http, crypto from Node)
- ✅ Web Standard APIs (fetch, Request, Response, WebSockets)
- ✅ Sub-millisecond cold starts
- ✅ No VPC configuration needed
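
For example, hashing that would use Node's `crypto` module on a traditional server is done with the standards-based Web Crypto API (`crypto.subtle`) in a Worker; a minimal sketch:

```typescript
// SHA-256 hash of the request body using Web Crypto, which is available in
// Workers, instead of Node's crypto module, which is not.
export default {
  async fetch(request: Request): Promise<Response> {
    const body = await request.arrayBuffer();
    const digest = await crypto.subtle.digest("SHA-256", body);
    const hex = [...new Uint8Array(digest)]
      .map((b) => b.toString(16).padStart(2, "0"))
      .join("");

    return new Response(hex, { headers: { "Content-Type": "text/plain" } });
  }
};
```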

### 2. Workers AI - LLM Inference at the Edge

```typescript
// Run LLaMA 2, Mistral, or other models at the edge
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { prompt } = await request.json();

    // Text generation with LLaMA 2
    const response = await env.AI.run("@cf/meta/llama-2-7b-chat-int8", {
      messages: [
        { role: "system", content: "You are a helpful assistant" },
        { role: "user", content: prompt }
      ]
    });

    return Response.json(response);
  }
};

// Image generation
const image = await env.AI.run("@cf/stabilityai/stable-diffusion-xl-base-1.0", {
  prompt: "A futuristic city at sunset"
});

// Text embeddings (for vector search)
const embeddings = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
  text: "Hello, world!"
});

// Image classification
const result = await env.AI.run("@cf/microsoft/resnet-50", {
  image: imageBytes
});
```

**Workers AI Benefits**:
- 🚀 <50ms inference latency globally
- 💰 Pay only for inference time (no idle costs)
- 🌍 Runs in 330+ locations automatically
- 🔒 Data never leaves Cloudflare's network
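
Workers AI can also stream completions instead of buffering the full response. A sketch using the same chat model as above, where passing `stream: true` makes `env.AI.run` return a `ReadableStream`:

```typescript
// Stream tokens back as server-sent events so the client sees output
// immediately instead of waiting for the whole completion.
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { prompt } = await request.json();

    const stream = await env.AI.run("@cf/meta/llama-2-7b-chat-int8", {
      messages: [{ role: "user", content: prompt }],
      stream: true
    });

    return new Response(stream, {
      headers: { "Content-Type": "text/event-stream" }
    });
  }
};
```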

### 3. KV Storage for Edge Data

```typescript
// wrangler.toml
# kv_namespaces = [
#   { binding = "MY_KV", id = "xxxxx", preview_id = "yyyyy" }
# ]

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    const key = url.pathname.slice(1);

    // Read from KV (eventually consistent)
    const value = await env.MY_KV.get(key);

    if (value === null) {
      return new Response("Not found", { status: 404 });
    }

    // Write to KV
    await env.MY_KV.put("user:123", JSON.stringify({ name: "Alice" }), {
      expirationTtl: 3600,  // Expire in 1 hour
      metadata: { createdAt: Date.now() }
    });

    // List keys
    const list = await env.MY_KV.list({ prefix: "user:" });

    // Delete key
    await env.MY_KV.delete(key);

    return new Response(value);
  }
};
```

**KV Best Practices**:
- ✅ Use for read-heavy workloads (caching, config)
- ✅ Values up to 25 MB
- ✅ Eventually consistent (may take 60s to propagate)
- ❌ Not for transactional data or immediate consistency
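
A typical read-heavy pattern is a read-through cache in front of a slower origin. A sketch, assuming a hypothetical config endpoint; `cacheTtl` additionally caches the KV read at the serving data center for the given number of seconds:

```typescript
// Read-through cache: serve from KV when present, otherwise fetch from the
// origin and store the result with a TTL.
async function getConfig(env: Env): Promise<string> {
  const cached = await env.MY_KV.get("config", { cacheTtl: 300 });
  if (cached !== null) {
    return cached;
  }

  const origin = await fetch("https://config.example.com/config.json");
  const fresh = await origin.text();
  await env.MY_KV.put("config", fresh, { expirationTtl: 3600 });
  return fresh;
}
```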

### 4. Durable Objects for Stateful Logic

```typescript
// Durable Object - Single-instance, strongly consistent storage
export class RateLimiter {
  state: DurableObjectState;

  constructor(state: DurableObjectState, env: Env) {
    this.state = state;
  }

  async fetch(request: Request) {
    const count = (await this.state.storage.get("count")) || 0;
    const newCount = count + 1;

    if (newCount > 100) {
      return new Response("Rate limit exceeded", { status: 429 });
    }

    await this.state.storage.put("count", newCount);
    return new Response(`Request ${newCount}/100`);
  }
}

// Worker that uses Durable Object
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Get Durable Object instance (one per user)
    const id = env.RATE_LIMITER.idFromName("user:123");
    const stub = env.RATE_LIMITER.get(id);

    // Forward request to Durable Object
    return stub.fetch(request);
  }
};

// wrangler.toml
# [[durable_objects.bindings]]
# name = "RATE_LIMITER"
# class_name = "RateLimiter"
# script_name = "my-worker"
```

**Durable Objects Use Cases**:
- ✅ Real-time collaboration (Google Docs-style)
- ✅ WebSocket connections
- ✅ Distributed locks and coordination
- ✅ Game servers, chat rooms
- ✅ Strongly consistent counters
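
As one sketch of the WebSocket use case, a minimal chat-room Durable Object (class name and broadcast logic are illustrative):

```typescript
// Each Durable Object instance owns one room's WebSocket connections, so
// broadcasting is a loop over in-memory sockets.
export class ChatRoom {
  state: DurableObjectState;
  sessions: WebSocket[] = [];

  constructor(state: DurableObjectState, env: Env) {
    this.state = state;
  }

  async fetch(request: Request): Promise<Response> {
    if (request.headers.get("Upgrade") !== "websocket") {
      return new Response("Expected a WebSocket", { status: 426 });
    }

    const pair = new WebSocketPair();
    const [client, server] = Object.values(pair);
    server.accept();
    this.sessions.push(server);

    server.addEventListener("message", (event) => {
      // Relay every message to all connected clients in this room
      for (const ws of this.sessions) ws.send(event.data as string);
    });

    return new Response(null, { status: 101, webSocket: client });
  }
}
```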

### 5. Vectorize for Edge Vector Search

```typescript
// Vector database at the edge
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { text } = await request.json();

    // Generate embedding with Workers AI
    const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
      text: text
    });

    // Insert into vector database
    await env.VECTORIZE.insert([
      {
        id: "doc-1",
        values: embedding.data[0],
        metadata: { text: text }
      }
    ]);

    // Query similar vectors
    const results = await env.VECTORIZE.query(embedding.data[0], {
      topK: 5,
      returnMetadata: true
    });

    return Response.json(results);
  }
};

// wrangler.toml
# [[vectorize]]
# binding = "VECTORIZE"
# index_name = "my-index"
```

## Best Practices

### Request Routing & Middleware

```typescript
import { Hono } from 'hono';
import { cors } from 'hono/cors';
import { bearerAuth } from 'hono/bearer-auth';

const app = new Hono<{ Bindings: Env }>();

// CORS middleware
app.use('/*', cors({
  origin: ['https://example.com'],
  allowMethods: ['GET', 'POST', 'PUT', 'DELETE'],
}));

// Authentication middleware
app.use('/api/*', bearerAuth({ token: 'secret-token' }));

// Rate limiting
app.use('/api/*', async (c, next) => {
  const ip = c.req.header('cf-connecting-ip') ?? 'unknown';
  const id = c.env.RATE_LIMITER.idFromName(ip);
  const limiter = c.env.RATE_LIMITER.get(id);

  const response = await limiter.fetch(c.req.raw);
  if (response.status === 429) {
    return c.text('Rate limit exceeded', 429);
  }

  await next();
});

// Routes
app.get('/api/users/:id', async (c) => {
  const userId = c.req.param('id');
  const user = await c.env.MY_KV.get(`user:${userId}`);
  if (user === null) {
    return c.json({ error: 'User not found' }, 404);
  }
  return c.json(JSON.parse(user));
});

app.post('/api/chat', async (c) => {
  const { message } = await c.req.json();

  const response = await c.env.AI.run("@cf/meta/llama-2-7b-chat-int8", {
    messages: [{ role: "user", content: message }]
  });

  return c.json(response);
});

export default app;
```

### Caching Strategies

```typescript
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const cacheUrl = new URL(request.url);
    const cacheKey = new Request(cacheUrl.toString(), request);
    const cache = caches.default;

    // Check cache first
    let response = await cache.match(cacheKey);

    if (!response) {
      // Cache miss - fetch from origin
      response = await fetch(request);

      // Clone response to cache it
      response = new Response(response.body, response);
      response.headers.set("Cache-Control", "public, max-age=3600");

      // Don't await cache.put (let it run in background)
      ctx.waitUntil(cache.put(cacheKey, response.clone()));
    }

    return response;
  }
};
```

### Environment Variables & Secrets

```typescript
// wrangler.toml
# [vars]
# API_URL = "https://api.example.com"
#
# Run: wrangler secret put API_KEY
# (for sensitive values)

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const apiUrl = env.API_URL;       // From [vars]
    const apiKey = env.API_KEY;       // From secrets

    const response = await fetch(`${apiUrl}/data`, {
      headers: { "Authorization": `Bearer ${apiKey}` }
    });

    return response;
  }
};
```

## Common Patterns

### A/B Testing at the Edge

```typescript
export default {
  async fetch(request: Request): Promise<Response> {
    const cookie = request.headers.get('cookie') || '';
    const variant = cookie.includes('variant=b') ? 'B' : 'A';

    // Or use hash of IP for consistent assignment
    // const ip = request.headers.get('cf-connecting-ip');
    // const variant = hashIP(ip) % 2 === 0 ? 'A' : 'B';

    if (variant === 'B') {
      return fetch('https://variant-b.example.com');
    }

    return fetch('https://variant-a.example.com');
  }
};
```
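
The commented-out `hashIP` helper above is not a built-in; one possible sketch using an FNV-1a hash for sticky variant assignment:

```typescript
// Deterministic 32-bit FNV-1a hash so the same IP always maps to the same
// variant bucket. Purely illustrative; any stable hash works.
function hashIP(ip: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < ip.length; i++) {
    hash ^= ip.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Usage inside the Worker above:
// const ip = request.headers.get('cf-connecting-ip') ?? '';
// const variant = hashIP(ip) % 2 === 0 ? 'A' : 'B';
```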

### Geolocation-Based Routing

```typescript
export default {
  async fetch(request: Request): Promise<Response> {
    // country is an ISO 3166-1 code (e.g. "US"); continent codes like "EU"
    // come from request.cf?.continent
    const country = request.cf?.country;
    const continent = request.cf?.continent;

    if (country === 'US') {
      return fetch('https://us.api.example.com');
    } else if (continent === 'EU') {
      return fetch('https://eu.api.example.com');
    }

    return fetch('https://global.api.example.com');
  }
};
```

## Anti-Patterns

### ❌ DON'T: Block the response

```typescript
// BAD: Slow, synchronous loop
export default {
  async fetch(request: Request): Promise<Response> {
    for (let i = 0; i < 1000000; i++) {
      // Blocks isolate!
    }
    return new Response("Done");
  }
};

// GOOD: Use ctx.waitUntil for background work
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    ctx.waitUntil(
      (async () => {
        // Background work (logging, analytics)
        await env.MY_KV.put("last-request", String(Date.now()));
      })()
    );

    return new Response("Done");
  }
};
```

### ❌ DON'T: Use Workers for long-running tasks

```
Workers CPU time limits: 10 ms per request (Free plan), 30 s default on paid plans (configurable)

✅ Good for: API requests, edge logic, AI inference
❌ Bad for: Video encoding, large file processing, long ML training
```
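
One way to handle work that exceeds these limits (an option beyond this skill's toolchain, sketched here as an assumption) is to enqueue it to a Cloudflare Queues consumer and return immediately:

```typescript
// Hypothetical JOB_QUEUE binding declared under [[queues.producers]] in
// wrangler.toml; the queue() handler runs as a consumer outside the request path.
interface QueueEnv {
  JOB_QUEUE: Queue;
}

export default {
  // Producer: accept the job, enqueue it, respond right away.
  async fetch(request: Request, env: QueueEnv): Promise<Response> {
    const job = await request.json();
    await env.JOB_QUEUE.send(job);
    return new Response("Accepted", { status: 202 });
  },

  // Consumer: process batches with their own (larger) limits.
  async queue(batch: MessageBatch, env: QueueEnv): Promise<void> {
    for (const message of batch.messages) {
      // ...do the heavy work on message.body here...
      message.ack();
    }
  }
};
```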

## Testing & Development

```bash
# Install Wrangler CLI
npm install -g wrangler

# Create new project
wrangler init my-worker

# Local development (with Workers AI, KV, DO)
wrangler dev

# Deploy to production
wrangler deploy

# Tail logs
wrangler tail

# Run tests with Vitest
npm test
```
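
A minimal Vitest test of the basic Worker, assuming the default export lives in `src/index.ts` and the route under test needs no bindings (so `env` and `ctx` can be stubbed):

```typescript
// test/worker.test.ts - call the fetch handler directly with a Request and
// assert on the Response; bindings and the execution context are stubbed.
import { describe, expect, it } from "vitest";
import worker from "../src/index";

describe("worker", () => {
  it("responds from the edge handler", async () => {
    const request = new Request("https://example.com/");
    const response = await worker.fetch(request, {} as any, {} as any);

    expect(response.status).toBe(200);
    expect(await response.text()).toContain("Hello");
  });
});
```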

## Related Skills

- **fastapi-web-development**: Compare serverless vs traditional APIs
- **terraform-infrastructure**: Deploy Workers with Terraform
- **web3-blockchain**: Build Web3 apps on Cloudflare

## Additional Resources

- Workers Documentation: https://developers.cloudflare.com/workers
- Workers AI Models: https://developers.cloudflare.com/workers-ai/models
- Durable Objects: https://developers.cloudflare.com/durable-objects
- Examples: https://github.com/cloudflare/workers-sdk/tree/main/templates

## Example Questions

- "How do I run LLaMA 2 at the edge with Workers AI?"
- "Show me how to implement rate limiting with Durable Objects"
- "What's the difference between KV and Durable Objects?"
- "How do I cache API responses at the edge?"
- "Write a Worker that does geolocation-based routing"