gemini-image-gen
Generate images using Google's Gemini API — hero backgrounds, OG images, placeholder photos, textures, and style-matched variants. Uses free-tier models for drafts, paid for finals. No dependencies beyond Python 3. Trigger with 'generate image', 'gemini image', 'make a hero background', 'create placeholder photo', 'generate OG image', 'AI image', or 'need an image for'.
Packaged view
This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.
Install command
npx @skill-hub/cli install jezweb-claude-skills-gemini-image-gen
Repository
Skill path: plugins/design-assets/skills/gemini-image-gen
Best for
Primary workflow: Analyze Data & AI.
Technical facets: Full Stack, Backend, Data / AI.
Target audience: everyone.
License: Unknown.
Original source
Catalog source: SkillHub Club.
Repository owner: jezweb.
This is still a mirrored public skill entry. Review the repository before installing into production workflows.
What it helps with
- Install gemini-image-gen into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/jezweb/claude-skills before adding gemini-image-gen to shared team environments
- Use gemini-image-gen for development workflows
Original source / Raw SKILL.md
---
name: gemini-image-gen
description: "Generate images using Google's Gemini API — hero backgrounds, OG images, placeholder photos, textures, and style-matched variants. Uses free-tier models for drafts, paid for finals. No dependencies beyond Python 3. Trigger with 'generate image', 'gemini image', 'make a hero background', 'create placeholder photo', 'generate OG image', 'AI image', or 'need an image for'."
compatibility: claude-code-only
---
# Gemini Image Generator
Generate contextual images for web projects using the Gemini API. Produces hero backgrounds, OG cards, placeholder photos, textures, and style-matched variants.
## Setup
**API Key**: Set `GEMINI_API_KEY` as an environment variable. Get a key from https://aistudio.google.com/apikey if you don't have one.
```bash
export GEMINI_API_KEY="your-key-here"
```
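A generated script can read the key at startup and fail early with a clear message when it is missing. This is a minimal sketch; the helper name `load_api_key` and its optional `env` argument are illustrative and not part of the skill.

```python
import os


def load_api_key(env=None):
    """Return the Gemini API key from the environment, or raise with a hint.

    `load_api_key` is a hypothetical helper name used for illustration.
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set. Create a key at "
            "https://aistudio.google.com/apikey and export it first."
        )
    return key
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.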
## Workflow
### Step 1: Understand What's Needed
Gather from the user or project context:
- **What**: hero background, product photo, texture, OG image, placeholder
- **Style**: warm/cool/minimal/luxurious/bold — check project's colour palette (input.css, tailwind config)
- **Dimensions**: hero (1920x1080), OG (1200x630), square (1024x1024), custom
- **Count**: single image or multiple variants to choose from
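The four items above can be captured in a small record before any prompt is written. The sketch below is one possible shape; the `ImageRequest` class, its field names, and the `DIMENSIONS` keys are assumptions made for illustration.

```python
from dataclasses import dataclass

# Named sizes taken from the list above; custom sizes are passed directly
# to the ImageRequest constructor instead.
DIMENSIONS = {
    "hero": (1920, 1080),
    "og": (1200, 630),
    "square": (1024, 1024),
}


@dataclass
class ImageRequest:
    what: str    # e.g. "hero background"
    style: str   # e.g. "warm, minimal"
    width: int
    height: int
    count: int = 1

    @classmethod
    def from_preset(cls, what, style, size="hero", count=1):
        """Build a request from one of the named sizes."""
        if size not in DIMENSIONS:
            raise ValueError(
                f"Unknown size {size!r}; expected one of {sorted(DIMENSIONS)}"
            )
        if count < 1:
            raise ValueError("count must be at least 1")
        width, height = DIMENSIONS[size]
        return cls(what, style, width, height, count)
```

For example, `ImageRequest.from_preset("OG image", "cool, minimal", size="og", count=3)` describes three 1200x630 variants.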
### Step 2: Build the Prompt
Use concrete photography parameters, not abstract adjectives. Read [references/prompting-guide.md](references/prompting-guide.md) for the full framework.
**Quick rules**:
- Narrate like directing a photographer
- Use camera specs: "85mm f/1.8", "wide angle 24mm"
- Use colour anchors from the project palette: "warm terracotta (#C66A52) and cream (#F5F0EB) tones"
- Use lighting descriptions: "golden-hour light from the left, 4500K"
- Always end with: "No text, no watermarks, no logos, no hands"
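The quick rules can be applied mechanically by joining the parts in a fixed order and appending the negative constraints last. This is a rough sketch; `build_prompt` and its parameter names are hypothetical.

```python
DEFAULT_NEGATIVES = "No text, no watermarks, no logos, no hands"


def build_prompt(scene, camera=None, lighting=None, palette=None,
                 extra_negatives=()):
    """Join the prompt parts in order and end with the negative constraints."""
    parts = [scene, camera, lighting, palette]
    # Skip missing parts and trim stray trailing punctuation before joining.
    body = ", ".join(p.strip().rstrip(",.") for p in parts if p and p.strip())
    negatives = ", ".join([DEFAULT_NEGATIVES, *extra_negatives])
    return f"{body}. {negatives}"


prompt = build_prompt(
    "A modern office workspace with plants",
    camera="wide angle 24mm",
    lighting="golden-hour light from the left, 4500K",
    palette="warm terracotta (#C66A52) and cream (#F5F0EB) tones",
    extra_negatives=["no people"],
)
```

Keeping the negatives in one constant means every generated prompt ends the same way, and context-specific exclusions are added after it.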
### Step 3: Generate
Generate a Python script (no dependencies beyond stdlib) that calls the Gemini API. The script should:
1. Read `GEMINI_API_KEY` from environment
2. POST to `https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent`
3. Include `"responseModalities": ["TEXT", "IMAGE"]` in generationConfig
4. Parse the response: extract `inlineData.data` (base64) from candidate parts
5. Decode base64 and save as PNG
6. Support multiple variants (generate N times, save as `name-1.png`, `name-2.png`)
For style matching with a reference image, include the reference as an `inlineData` part before the text prompt, and use temperature 0.7 (instead of 1.0).
See [references/api-pattern.md](references/api-pattern.md) for the full implementation pattern including error handling and response parsing.
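Separating request building from response parsing lets both be checked without a network call. The sketch below follows the steps above under that assumption; `build_payload` and `extract_image` are illustrative names, and the HTTP call itself is left to the reference pattern.

```python
import base64
import json


def build_payload(prompt, reference_bytes=None, reference_mime="image/png"):
    """Return the JSON request body as bytes."""
    parts = []
    if reference_bytes is not None:
        # The reference image goes before the text prompt.
        parts.append({"inlineData": {
            "mimeType": reference_mime,
            "data": base64.b64encode(reference_bytes).decode("ascii"),
        }})
    parts.append({"text": prompt})
    body = {
        "contents": [{"parts": parts}],
        "generationConfig": {
            "responseModalities": ["TEXT", "IMAGE"],
            # Lower temperature when matching a reference image.
            "temperature": 0.7 if reference_bytes is not None else 1.0,
        },
    }
    return json.dumps(body).encode("utf-8")


def extract_image(response):
    """Return decoded image bytes from a parsed response dict, or raise."""
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            if "inlineData" in part:
                return base64.b64decode(part["inlineData"]["data"])
    raise RuntimeError("No image part found in response")
```

`extract_image` takes the first image part it finds, so a response that carries both text and an image still yields the image.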
**Critical**: Never pass prompts via curl + bash arguments — shell escaping breaks on apostrophes. Always use Python's `json.dumps()` or write the prompt to a file first.
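To illustrate the rule above: when the request body is serialised by Python and written to a file, apostrophes and double quotes in the prompt never pass through shell quoting. The function names and the `request.json` file name here are placeholders.

```python
import json
import os
import tempfile


def write_request_file(prompt, path):
    """Serialise the request body with the json module; no shell quoting involved."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    with open(path, "w", encoding="utf-8") as f:
        json.dump(body, f)
    return path


def read_prompt_back(path):
    """Load the request file and return the prompt text it contains."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)["contents"][0]["parts"][0]["text"]


# A prompt with both an apostrophe and double quotes.
prompt = 'A chef\'s table with "rustic" plating, lit by a window'
with tempfile.TemporaryDirectory() as tmp:
    request_path = write_request_file(prompt, os.path.join(tmp, "request.json"))
    round_tripped = read_prompt_back(request_path)
```

The prompt read back from the file is identical to the original string, which is the property that hand-built shell arguments tend to lose.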
### Step 4: Post-Process (Optional)
Use the **image-processing** skill for resizing, format conversion, or optimisation.
### Step 5: Present to User
Show the generated images for review. Read the image files to display them inline if possible, otherwise describe what was generated and let the user open them.
## Presets
Starting prompts — enhance with project-specific context (colours, mood, subject):
| Preset | Base Prompt |
|--------|-------------|
| `hero-background` | "Wide atmospheric background, soft-focus, [colour tones], [mood], landscape 1920x1080" |
| `og-image` | "Clean branded card background, [brand colours], subtle gradient, 1200x630" |
| `placeholder-photo` | "Professional stock-style photo of [subject], natural lighting, warm tones" |
| `texture-pattern` | "Subtle repeating texture, [material], seamless tile, muted [colour]" |
| `product-shot` | "Product photography, [item] on [surface], soft studio lighting, clean background" |
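One way to use the presets is as templates whose bracketed slots are filled from project context. This is a sketch only; `fill_preset` is a hypothetical helper, and mapping slot names to keyword arguments by replacing spaces with underscores is a convention chosen here.

```python
import re

# Base prompts copied from the table above.
PRESETS = {
    "hero-background": "Wide atmospheric background, soft-focus, [colour tones], [mood], landscape 1920x1080",
    "og-image": "Clean branded card background, [brand colours], subtle gradient, 1200x630",
    "placeholder-photo": "Professional stock-style photo of [subject], natural lighting, warm tones",
    "texture-pattern": "Subtle repeating texture, [material], seamless tile, muted [colour]",
    "product-shot": "Product photography, [item] on [surface], soft studio lighting, clean background",
}


def fill_preset(name, **values):
    """Replace each [slot] in a preset; raise if a slot has no value."""
    template = PRESETS[name]

    def substitute(match):
        slot = match.group(1)
        key = slot.replace(" ", "_")  # "[colour tones]" is passed as colour_tones=
        if key not in values:
            raise KeyError(f"Preset {name!r} needs a value for [{slot}]")
        return values[key]

    return re.sub(r"\[([^\]]+)\]", substitute, template)
```

For instance, `fill_preset("product-shot", item="ceramic mug", surface="oak table")` gives "Product photography, ceramic mug on oak table, soft studio lighting, clean background". Raising on an unfilled slot keeps a literal `[mood]` from being sent to the model.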
## Model Selection
| Use case | Model | Cost |
|----------|-------|------|
| Drafts, quick placeholders | `gemini-2.5-flash-image` | Free (~500/day) |
| Final client assets | `gemini-3-pro-image-preview` | ~$0.04/image |
| Style-matched variants | `gemini-3-pro-image-preview` + reference image | ~$0.04/image |
Verify current model IDs if errors occur — they change frequently.
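Because the IDs change, it can help to keep them in one place and allow an override without editing the script. In this sketch the use-case keys and the `GEMINI_IMAGE_MODEL` variable are assumptions made for illustration; the IDs are copied from the table above and may be out of date.

```python
import os

# Model IDs copied from the table above; verify them before relying on this.
MODELS = {
    "draft": "gemini-2.5-flash-image",
    "final": "gemini-3-pro-image-preview",
    "style-match": "gemini-3-pro-image-preview",
}


def pick_model(use_case, env=None):
    """Return the model ID for a use case, honouring an optional override.

    GEMINI_IMAGE_MODEL is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    override = env.get("GEMINI_IMAGE_MODEL")
    if override:
        return override
    if use_case not in MODELS:
        raise ValueError(
            f"Unknown use case {use_case!r}; expected one of {sorted(MODELS)}"
        )
    return MODELS[use_case]
```

With the override in place, a renamed model only needs a new environment value rather than a regenerated script.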
## Reference Files
| When | Read |
|------|------|
| Building effective prompts | [references/prompting-guide.md](references/prompting-guide.md) |
| API implementation details | [references/api-pattern.md](references/api-pattern.md) |
---
## Referenced Files
> The following files are referenced in this skill and included for context.
### references/prompting-guide.md
```markdown
# Gemini Image Prompting Guide
## The 5-Part Prompt Framework
Build prompts using these five components in order:
### 1. Scene Setup
What's in the image — subject, setting, composition.
```
"A woman receiving a facial treatment in a luxury day spa"
"A modern office workspace with plants and warm lighting"
"An aerial view of a coastal town at sunset"
```
### 2. Camera & Lens
Concrete photography parameters control the look more reliably than adjectives.
| Parameter | Effect | Example |
|-----------|--------|---------|
| Focal length | Compression/perspective | "85mm" (portrait), "24mm" (wide), "135mm" (compressed) |
| Aperture | Depth of field | "f/1.8" (blurry bg), "f/8" (sharp throughout) |
| Angle | Perspective | "eye level", "overhead flat lay", "low angle looking up" |
| Distance | Framing | "close-up", "medium shot", "wide establishing shot" |
```
"Shot at 85mm f/1.8, shallow depth of field, medium shot from eye level"
```
### 3. Lighting
Describe the light source, quality, and colour temperature.
| Instead of... | Use... |
|---------------|--------|
| "beautiful lighting" | "warm golden-hour light from the left, 4500K" |
| "professional lighting" | "soft diffused window light, slight rim light from behind" |
| "moody lighting" | "low-key dramatic side lighting, deep shadows, single source" |
```
"Warm directional light from a large window on the right, soft shadows, 4000K colour temperature"
```
### 4. Colour Palette
Anchor to specific colours from the project. Use hex codes or descriptive anchors.
```
"Warm terracotta (#C66A52) and cream (#F5F0EB) tones throughout"
"Cool slate blue and white palette, desaturated"
"Rich emerald green and gold accents"
```
**Pull from the project**: check `input.css`, `tailwind.config`, or the colour-palette skill output for exact values.
### 5. Negative Constraints
What to exclude. Always include these:
```
"No text, no watermarks, no logos, no hands, no fingers"
```
Add context-specific negatives:
```
"No people" (for abstract backgrounds)
"No artificial elements" (for nature shots)
"No cluttered background" (for product shots)
```
## Complete Example
```
A woman receiving a gentle facial treatment in a luxury day spa,
warm golden-hour light streaming through sheer curtains from the left,
shot at 85mm f/2.0 with shallow depth of field,
warm terracotta and cream colour palette,
soft bokeh in the background showing spa interior,
photorealistic, natural skin texture,
no text, no watermarks, no logos, no hands visible
```
## Style Matching with Reference Images
When passing a reference image (`reference_path` in the API pattern) to match an existing image's style:
1. **Be specific about what to change** (subject, framing, setting)
2. **Let the model infer what to keep** (lighting, colour palette, mood)
3. **Lower temperature** (0.7) stays closer to the reference
```
"Using the same warm lighting, colour palette, and photographic style as the reference image,
generate a close-up of hands performing a massage treatment on a spa table.
Maintain the same soft-focus background treatment and golden tones."
```
## Common Failure Modes
| Issue | Fix |
|-------|-----|
| Text appears in image | Add "no text, no words, no letters" explicitly |
| Hands look wrong | Add "no hands, no fingers" or crop hands out |
| Too generic/stock-photo | Add specific camera specs and colour anchors |
| Inconsistent style across variants | Pass the best variant back in as the reference image |
| Image has watermark-like patterns | Add "clean image, no watermarks, no artifacts" |
## Web Asset Dimensions
| Use case | Dimensions | Aspect Ratio |
|----------|-----------|--------------|
| Hero banner | 1920x1080 | 16:9 |
| OG / social card | 1200x630 | ~1.9:1 |
| Square thumbnail | 1024x1024 | 1:1 |
| Blog header | 1200x675 | 16:9 |
| Product photo | 1024x1024 | 1:1 |
| Texture/pattern tile | 512x512 | 1:1 |
```
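The aspect ratios in the Web Asset Dimensions table can be checked by reducing each width and height to lowest terms. A small sketch, with `aspect_ratio` as an illustrative helper name:

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce width:height to lowest terms."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


hero = aspect_ratio(1920, 1080)   # (16, 9)
blog = aspect_ratio(1200, 675)    # (16, 9)
og = aspect_ratio(1200, 630)      # (40, 21), roughly 1.9:1
```

This is useful when a custom size is requested and the closest standard ratio needs to be named in the prompt.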
### references/api-pattern.md
```markdown
# Gemini Image Generation API Pattern
Reference implementation for calling the Gemini image generation API. Claude should generate
a tailored script based on the user's needs rather than using this as-is.
## Core Pattern: Text-to-Image
```python
import base64, json, os, urllib.request, urllib.error


def generate_image(model, prompt, api_key, reference_path=None):
    """Generate an image via Gemini API. Returns raw image bytes."""
    url = f"https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent?key={api_key}"

    parts = []

    # Optional: style reference image
    if reference_path:
        with open(reference_path, "rb") as f:
            img_data = base64.b64encode(f.read()).decode()
        ext = reference_path.rsplit(".", 1)[-1].lower()
        mime = {"jpg": "image/jpeg", "jpeg": "image/jpeg", "png": "image/png",
                "webp": "image/webp"}.get(ext, "image/jpeg")
        parts.append({"inlineData": {"mimeType": mime, "data": img_data}})

    parts.append({"text": prompt})

    payload = json.dumps({
        "contents": [{"parts": parts}],
        "generationConfig": {
            "responseModalities": ["TEXT", "IMAGE"],
            "temperature": 0.7 if reference_path else 1.0,
        },
    }).encode()

    req = urllib.request.Request(url, data=payload,
                                 headers={"Content-Type": "application/json"}, method="POST")

    with urllib.request.urlopen(req, timeout=120) as resp:
        data = json.loads(resp.read().decode())

    # Extract image from response candidates
    for candidate in data.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            if "inlineData" in part:
                return base64.b64decode(part["inlineData"]["data"])

    # No image: check for a text response (the model may have refused)
    try:
        text = data["candidates"][0]["content"]["parts"][0].get("text", "")
        raise RuntimeError(f"Gemini returned text instead of image: {text}")
    except (KeyError, IndexError):
        raise RuntimeError(f"Unexpected response: {json.dumps(data)[:500]}")
```
## Key Details
**Response structure**: Gemini returns `candidates[].content.parts[]`. Each part is either
`{"text": "..."}` or `{"inlineData": {"mimeType": "image/png", "data": "<base64>"}}`.
**Temperature**: Use 0.7 with reference images (closer style match), 1.0 without (more creative).
**responseModalities**: Must include `"IMAGE"` to get image output. `["TEXT", "IMAGE"]` allows
the model to include both text and image parts.
## Multiple Variants
Generate N variants by calling the API N times:
```python
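# Assumes `model`, `prompt`, `api_key`, `reference_path`, `count`, and
# `output_path` are already defined by the surrounding script.
base, ext = os.path.splitext(output_path)  # ext keeps its leading dot
for i in range(count):
    img_bytes = generate_image(model, prompt, api_key, reference_path)
    # Number the files only when more than one variant is requested
    path = f"{base}-{i + 1}{ext}" if count > 1 else output_path
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "wb") as f:
        f.write(img_bytes)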
```
## Shell Escaping Warning
Never pass prompts via curl + bash — apostrophes and special characters break JSON.
Always use Python's `json.dumps()` for serialisation, or write the prompt to a file first.
## Model IDs
| Use case | Model | Cost |
|----------|-------|------|
| Drafts | `gemini-2.5-flash-image` | Free (~500/day) |
| Final assets | `gemini-3-pro-image-preview` | ~$0.04/image |
Verify model IDs if errors occur — they change frequently.
```