
gemini-image-simple

Generate and edit images with Gemini API using pure Python stdlib. Zero dependencies - works on locked-down environments where pip/uv aren't available.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars: 3,121.
Hot score: 99.
Updated: March 19, 2026.
Overall rating: C (4.0).
Composite score: 4.0.
Best-practice grade: B (84.0).

Install command

npx @skill-hub/cli install openclaw-skills-gemini-image-simple

Repository

openclaw/skills

Skill path: skills/cluka-399/gemini-image-simple

Best for

Primary workflow: Ship Full Stack.

Technical facets: Full Stack, Backend.

Target audience: everyone.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: openclaw.

This is still a mirrored public skill entry. Review the repository before installing into production workflows.

What it helps with

  • Install gemini-image-simple into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/openclaw/skills before adding gemini-image-simple to shared team environments
  • Use gemini-image-simple for development workflows

Works across

Claude Code, Codex CLI, Gemini CLI, OpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: gemini-image-simple
version: 1.1.0
description: Generate and edit images with Gemini API using pure Python stdlib. Zero dependencies - works on locked-down environments where pip/uv aren't available.
metadata:
  openclaw:
    emoji: "🎨"
    requires:
      env: ["GEMINI_API_KEY"]
---

# Gemini Image Simple

Generate and edit images using Google's **Nano Banana Pro** (Gemini 3 Pro Image) - the highest quality image generation model.

## Why This Skill

| Feature | This Skill | Others (nano-banana-pro, etc.) |
|---------|------------|-------------------------------|
| **Dependencies** | None (stdlib only) | google-genai, pillow, etc. |
| **Requires pip/uv** | ❌ No | ✅ Yes |
| **Works on Fly.io free** | ✅ Yes | ❌ Fails |
| **Works in containers** | ✅ Yes | ❌ Often fails |
| **Image generation** | ✅ Full | ✅ Full |
| **Image editing** | ✅ Yes | ✅ Yes |
| **Setup complexity** | Just set API key | Install packages first |

**Bottom line:** This skill works anywhere Python 3 exists. No package managers, no virtual environments, no permission issues.

## Quick Start

```bash
# Generate
python3 /data/clawd/skills/gemini-image-simple/scripts/generate.py "A cat wearing a tiny hat" cat.png

# Edit existing image  
python3 /data/clawd/skills/gemini-image-simple/scripts/generate.py "Make it sunset lighting" edited.png --input original.png
```

## Usage

### Generate new image

```bash
python3 {baseDir}/scripts/generate.py "your prompt" output.png
```

### Edit existing image

```bash
python3 {baseDir}/scripts/generate.py "edit instructions" output.png --input source.png
```

Supported input formats: PNG, JPG, JPEG, GIF, WEBP
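
As a rough illustration of the format list above, the sketch below maps file extensions to MIME types with the same table the bundled script uses. The helper name `input_mime_type` is hypothetical. Unlike the script, which falls back to `image/png` for unknown extensions, this version returns `None` so a caller could reject unsupported files up front.

```python
from pathlib import Path

# Extension -> MIME type for the supported input formats listed above.
SUPPORTED_INPUT_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def input_mime_type(path):
    """Return the MIME type for a supported input file, or None otherwise.

    Hypothetical helper: the lookup is case-insensitive and based only on
    the file extension, not on the file's contents.
    """
    return SUPPORTED_INPUT_TYPES.get(Path(path).suffix.lower())
```

For example, `input_mime_type("photo.JPG")` yields `"image/jpeg"`, while `input_mime_type("notes.txt")` yields `None`.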

## Environment

Set the `GEMINI_API_KEY` environment variable. Get a key at https://aistudio.google.com/apikey

## How It Works

The script calls **Nano Banana Pro** (`nano-banana-pro-preview`), Google's highest quality image generation model, using only the standard library:
- Pure `urllib.request` for HTTP (no requests library)
- Pure `json` for parsing (stdlib)
- Pure `base64` for encoding (stdlib)

That's it. No external packages. Works on any Python 3.10+ installation.
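
The three stdlib pieces listed above fit together roughly as in the sketch below, which builds the HTTP request without sending it. The function name `build_request` is hypothetical. The URL and payload shape are taken from the bundled `generate.py`, so treat this as a condensed view of that script rather than a separate API reference.

```python
import base64
import json
import urllib.request

# Endpoint used by the bundled script (model name hard-coded there as well).
API_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "nano-banana-pro-preview:generateContent"
)


def build_request(prompt, api_key, image_bytes=None, mime_type="image/png"):
    """Build, but do not send, a generateContent request (hypothetical helper)."""
    parts = [{"text": prompt}]
    if image_bytes is not None:
        # Editing: attach the source image as base64 text next to the prompt.
        parts.append({
            "inlineData": {
                "mimeType": mime_type,
                "data": base64.b64encode(image_bytes).decode(),
            }
        })
    payload = {
        "contents": [{"parts": parts}],
        "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
    }
    return urllib.request.Request(
        f"{API_URL}?key={api_key}",
        data=json.dumps(payload).encode(),  # a request body makes this a POST
        headers={"Content-Type": "application/json"},
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the actual call. Keeping construction separate makes the payload easy to inspect without network access.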

## Model

Currently using: `nano-banana-pro-preview` (also known as Gemini 3 Pro Image)

Other available models (can be changed in generate.py if needed):
- `gemini-3-pro-image-preview` - Same as Nano Banana Pro
- `imagen-4.0-ultra-generate-001` - Imagen 4.0 Ultra
- `imagen-4.0-generate-001` - Imagen 4.0
- `gemini-2.5-flash-image` - Gemini 2.5 Flash with image gen
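
One way to switch models without editing the script each time is to read the model name from an environment variable. The sketch below assumes a variable named `GEMINI_IMAGE_MODEL`, which is hypothetical and not read by the bundled script. It only swaps the model name in the URL and makes no claim that every model listed above accepts the same `generateContent` payload. The Imagen entries in particular may expect a different request shape.

```python
import os

DEFAULT_MODEL = "nano-banana-pro-preview"
API_ROOT = "https://generativelanguage.googleapis.com/v1beta/models"


def endpoint_url(api_key, env=None):
    """Build the generateContent URL for the configured model.

    GEMINI_IMAGE_MODEL is an assumed, hypothetical variable name; when it is
    unset or empty, the script's current default model is used.
    """
    env = os.environ if env is None else env
    model = env.get("GEMINI_IMAGE_MODEL") or DEFAULT_MODEL
    return f"{API_ROOT}/{model}:generateContent?key={api_key}"
```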

## Examples

```bash
# Landscape
python3 {baseDir}/scripts/generate.py "Misty mountains at sunrise, photorealistic" mountains.png

# Product shot
python3 {baseDir}/scripts/generate.py "Minimalist product photo of a coffee cup, white background" coffee.png

# Edit: change style
python3 {baseDir}/scripts/generate.py "Convert to watercolor painting style" watercolor.png --input photo.jpg

# Edit: add element
python3 {baseDir}/scripts/generate.py "Add a rainbow in the sky" rainbow.png --input landscape.png
```
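
To run several of the examples above in one go, the script can also be driven from Python. This is a minimal sketch: `build_command` and `run_jobs` are hypothetical names, and `script_path` stands in for wherever `{baseDir}/scripts/generate.py` resolves on your machine.

```python
import subprocess
import sys


def build_command(script_path, prompt, output, input_image=None):
    """Return the argv list for one generate.py invocation."""
    cmd = [sys.executable, str(script_path), prompt, output]
    if input_image:
        # Editing mode: pass the source image through --input.
        cmd += ["--input", input_image]
    return cmd


def run_jobs(script_path, jobs):
    """Run each (prompt, output[, input_image]) job in order.

    check=True stops at the first failing job, since generate.py exits
    with a non-zero status on errors.
    """
    for job in jobs:
        subprocess.run(build_command(script_path, *job), check=True)
```

A call such as `run_jobs(script_path, [("Misty mountains at sunrise, photorealistic", "mountains.png"), ("Add a rainbow in the sky", "rainbow.png", "landscape.png")])` would generate one image and edit another, assuming `GEMINI_API_KEY` is set.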


---

## Skill Companion Files

> Additional files collected from the skill directory layout.

### _meta.json

```json
{
  "owner": "cluka-399",
  "slug": "gemini-image-simple",
  "displayName": "Gemini Image Simple",
  "latest": {
    "version": "1.1.0",
    "publishedAt": 1769977821317,
    "commit": "https://github.com/clawdbot/skills/commit/dc0292c82a55a89c30426fa6550f9edd8e5bc177"
  },
  "history": [
    {
      "version": "1.0.0",
      "publishedAt": 1769804785662,
      "commit": "https://github.com/clawdbot/skills/commit/4e4d156bb4b2ee284337e16bc409bac17b3f34f5"
    }
  ]
}

```

### scripts/generate.py

```python
#!/usr/bin/env python3
"""
Gemini Image Generation - Pure Python stdlib, no dependencies.

Usage:
    python3 generate.py "prompt" output.png
    python3 generate.py "edit instructions" output.png --input original.png

Requires GEMINI_API_KEY environment variable.
"""

import os
import sys
import json
import base64
import urllib.request
import urllib.error
from pathlib import Path


def get_api_key():
    """Get API key from environment."""
    key = os.environ.get("GEMINI_API_KEY")
    if not key:
        print("Error: GEMINI_API_KEY environment variable not set", file=sys.stderr)
        sys.exit(1)
    return key


def load_image_as_base64(path):
    """Load an image file and return base64-encoded string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()


def detect_mime_type(path):
    """Detect MIME type from file extension."""
    ext = Path(path).suffix.lower()
    mime_types = {
        ".png": "image/png",
        ".jpg": "image/jpeg",
        ".jpeg": "image/jpeg",
        ".gif": "image/gif",
        ".webp": "image/webp",
    }
    return mime_types.get(ext, "image/png")


def generate_image(prompt, output_path, input_image_path=None):
    """Generate or edit an image using Gemini API."""
    api_key = get_api_key()
    
    url = f"https://generativelanguage.googleapis.com/v1beta/models/nano-banana-pro-preview:generateContent?key={api_key}"
    
    # Build request parts
    parts = [{"text": prompt}]
    
    # Add input image if provided (for editing)
    if input_image_path:
        if not os.path.exists(input_image_path):
            print(f"Error: Input image not found: {input_image_path}", file=sys.stderr)
            sys.exit(1)
        
        img_data = load_image_as_base64(input_image_path)
        mime_type = detect_mime_type(input_image_path)
        parts.append({
            "inlineData": {
                "mimeType": mime_type,
                "data": img_data
            }
        })
    
    payload = {
        "contents": [{"parts": parts}],
        "generationConfig": {
            "responseModalities": ["TEXT", "IMAGE"]
        }
    }
    
    headers = {"Content-Type": "application/json"}
    data = json.dumps(payload).encode()
    
    req = urllib.request.Request(url, data=data, headers=headers)
    
    try:
        with urllib.request.urlopen(req, timeout=180) as resp:
            result = json.loads(resp.read().decode())
    except urllib.error.HTTPError as e:
        error_body = e.read().decode()
        print(f"HTTP Error {e.code}: {error_body}", file=sys.stderr)
        sys.exit(1)
    except urllib.error.URLError as e:
        print(f"URL Error: {e.reason}", file=sys.stderr)
        sys.exit(1)
    
    # Extract image from response
    try:
        candidates = result.get("candidates", [])
        if not candidates:
            print("Error: No candidates in response", file=sys.stderr)
            print(json.dumps(result, indent=2), file=sys.stderr)
            sys.exit(1)
        
        content = candidates[0].get("content", {})
        parts = content.get("parts", [])
        
        for part in parts:
            if "inlineData" in part:
                img_data = base64.b64decode(part["inlineData"]["data"])
                
                # Ensure output directory exists
                output_dir = Path(output_path).parent
                if output_dir and not output_dir.exists():
                    output_dir.mkdir(parents=True, exist_ok=True)
                
                with open(output_path, "wb") as f:
                    f.write(img_data)
                
                print(f"Saved: {output_path}")
                return output_path
        
        print("Error: No image data in response", file=sys.stderr)
        print(json.dumps(result, indent=2), file=sys.stderr)
        sys.exit(1)
        
    except (KeyError, IndexError) as e:
        print(f"Error parsing response: {e}", file=sys.stderr)
        print(json.dumps(result, indent=2), file=sys.stderr)
        sys.exit(1)


def main():
    import argparse
    
    parser = argparse.ArgumentParser(
        description="Generate or edit images using Gemini API (pure stdlib, no dependencies)"
    )
    parser.add_argument("prompt", help="Image prompt or edit instructions")
    parser.add_argument("output", help="Output file path (e.g., output.png)")
    parser.add_argument("--input", "-i", help="Input image for editing (optional)")
    
    args = parser.parse_args()
    
    generate_image(args.prompt, args.output, args.input)


if __name__ == "__main__":
    main()

```
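
The response handling in `generate_image` can be isolated into a small pure function, which makes it testable without calling the API. The sketch below mirrors the script's logic (first candidate only, first `inlineData` part wins). The name `extract_image_bytes` is hypothetical, and the response shape is assumed from what the script parses.

```python
import base64


def extract_image_bytes(result):
    """Return decoded image bytes from a parsed response dict, or None.

    Mirrors generate.py: only the first candidate is inspected, and the
    first part carrying inlineData is used.
    """
    candidates = result.get("candidates", [])
    if not candidates:
        return None
    parts = candidates[0].get("content", {}).get("parts", [])
    for part in parts:
        inline = part.get("inlineData")
        if inline and "data" in inline:
            return base64.b64decode(inline["data"])
    return None
```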
