ai-multimodal_mrgoonie
Process and generate multimedia content using Google Gemini API. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understand images (captioning, object detection, OCR, visual Q&A, segmentation), process videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extract from documents (PDF tables, forms, charts, diagrams, multi-page), generate images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.
Packaged view
This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.
Install command
npx @skill-hub/cli install jackspace-claudeskillz-ai-multimodal-mrgoonie
Repository
Skill path: skills/ai-multimodal_mrgoonie
Process and generate multimedia content using Google Gemini API. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understand images (captioning, object detection, OCR, visual Q&A, segmentation), process videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extract from documents (PDF tables, forms, charts, diagrams, multi-page), generate images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.
Open repositoryBest for
Primary workflow: Write Technical Docs.
Technical facets: Full Stack, Backend, Data / AI, Tech Writer, Designer.
Target audience: Development teams looking for install-ready agent workflows..
License: Unknown.
Original source
Catalog source: SkillHub Club.
Repository owner: jackspace.
This is still a mirrored public skill entry. Review the repository before installing into production workflows.
What it helps with
- Install ai-multimodal_mrgoonie into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/jackspace/ClaudeSkillz before adding ai-multimodal_mrgoonie to shared team environments
- Use ai-multimodal_mrgoonie for development workflows
Works across
Favorites: 0.
Sub-skills: 0.
Aggregator: No.