repomix-unmixer

Extracts files from repomix-packed repositories, restoring original directory structures from XML/Markdown/JSON formats. Activates when users need to unmix repomix files, extract packed repositories, restore file structures from repomix output, or reverse the repomix packing process.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars: 2.
Hot score: 79.
Updated: March 20, 2026.
Overall rating: C (1.6).
Composite score: 1.6.
Best-practice grade: N/A.

Install command

npx @skill-hub/cli install nguyendinhquocx-code-ai-repomix-unmixer
Tags: file-extraction, repository-management, data-recovery, automation, cli-tool

Repository

nguyendinhquocx/code-ai

Skill path: skills/repomix-unmixer

Best for

Primary workflow: Analyze Data & AI.

Technical facets: Full Stack, Data / AI.

Target audience: everyone.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: nguyendinhquocx.

This is still a mirrored public skill entry. Review the repository before installing into production workflows.

What it helps with

  • Install repomix-unmixer into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/nguyendinhquocx/code-ai before adding repomix-unmixer to shared team environments
  • Use repomix-unmixer for development workflows

Works across

Claude Code, Codex CLI, Gemini CLI, OpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: repomix-unmixer
description: Extracts files from repomix-packed repositories, restoring original directory structures from XML/Markdown/JSON formats. Activates when users need to unmix repomix files, extract packed repositories, restore file structures from repomix output, or reverse the repomix packing process.
---

# Repomix Unmixer

## Overview

This skill extracts files from repomix-packed repositories and restores their original directory structure. Repomix packs entire repositories into single AI-friendly files (XML, Markdown, or JSON), and this skill reverses that process to restore individual files.

## When to Use This Skill

This skill activates when:
- Unmixing a repomix output file (*.xml, *.md, *.json)
- Extracting files from a packed repository
- Restoring original directory structure from repomix format
- Reviewing or validating repomix-packed content
- Converting repomix output back to usable files

## Core Workflow

### Standard Unmixing Process

Extract all files from a repomix file and restore the original directory structure using the bundled `unmix_repomix.py` script:

```bash
python3 scripts/unmix_repomix.py \
  "<path_to_repomix_file>" \
  "<output_directory>"
```

**Parameters:**
- `<path_to_repomix_file>`: Path to the repomix output file (XML, Markdown, or JSON)
- `<output_directory>`: Directory where files will be extracted (created if it doesn't exist)

**Example:**
```bash
python3 scripts/unmix_repomix.py \
  "/path/to/repomix-output.xml" \
  "/tmp/extracted-files"
```

### What the Script Does

1. **Parses** the repomix file format (XML, Markdown, or JSON)
2. **Extracts** each file path and content
3. **Creates** the original directory structure
4. **Writes** each file to its original location
5. **Reports** extraction progress and statistics
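
Steps 3 and 4 join each path from the input file onto the output directory, so a pack from an untrusted source could name a path outside it. The sketch below is a hypothetical guard, not part of the bundled script: `safe_target` is an assumed helper name that rejects absolute paths and `..` segments that would resolve outside the output directory.

```python
from pathlib import Path


def safe_target(output_dir: str, file_path: str) -> Path:
    """Return the destination for file_path, or raise ValueError if it
    would resolve outside output_dir. Hypothetical helper, not in the script."""
    root = Path(output_dir).resolve()
    # An absolute file_path replaces root entirely; ".." segments can climb out.
    target = (root / file_path).resolve()
    if root not in target.parents:
        raise ValueError(f"Refusing to write outside {root}: {file_path}")
    return target


# A normal relative path resolves to a location under the output directory.
print(safe_target("extracted", "github-ops/SKILL.md"))

# A path that climbs out of the output directory is rejected.
try:
    safe_target("extracted", "../../etc/passwd")
except ValueError as err:
    print(err)
```

Resolving both paths before comparing them keeps the check independent of how the output directory was spelled on the command line.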

### Output

The script will:
- Create all necessary parent directories
- Extract all files maintaining their paths
- Print extraction progress for each file
- Display total count of extracted files

**Example output:**
```
Unmixing /path/to/skill.xml...
Output directory: /tmp/extracted-files

Detected format: XML
✓ Extracted: github-ops/SKILL.md
✓ Extracted: github-ops/references/api_reference.md
✓ Extracted: markdown-tools/SKILL.md
...

✅ Successfully extracted 20 files!

Extracted files are in: /tmp/extracted-files
```

## Supported Formats

### XML Format (default)

Repomix XML format structure:
```xml
<file path="relative/path/to/file.ext">
file content here
</file>
```

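The script matches each block with the regular expression `<file path="([^"]+)">\n(.*?)\n</file>`, so content must start on the line after the opening tag and end on the line before the closing tag. Captured content is written as-is; XML entities such as `&lt;` are not unescaped.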
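
As a minimal sketch of that matching step, the snippet below applies the same pattern to an in-memory sample. The sample text and the `html.unescape` pass are illustrative assumptions: the bundled script does not unescape entities, and whether a given pack contains escaped content depends on how it was generated.

```python
import html
import re

# Same pattern the bundled script uses for XML input.
FILE_PATTERN = re.compile(r'<file path="([^"]+)">\n(.*?)\n</file>', re.DOTALL)

sample = (
    '<file path="src/a.py">\n'
    'print("a &lt; b")\n'
    '</file>\n'
    '\n'
    '<file path="docs/readme.md">\n'
    '# Title\n'
    '\n'
    'Body\n'
    '</file>\n'
)

# Map each path to its content exactly as the regex captures it.
blocks = {m.group(1): m.group(2) for m in FILE_PATTERN.finditer(sample)}

# Assumption: only useful if the pack was written with escaped content.
unescaped = {path: html.unescape(text) for path, text in blocks.items()}

for path, text in unescaped.items():
    print(path, repr(text))
```

The captured content excludes the newline before `</file>`, so files extracted this way have no trailing newline.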

### Markdown Format

For markdown-style repomix output with file markers:
````markdown
## File: relative/path/to/file.ext
```
file content
```
````

Refer to `references/repomix-format.md` for detailed format specifications.
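
A minimal sketch of the Markdown matching, using a pattern equivalent to the one in the bundled script, is shown here. The sample text is made up for illustration, and the fence string is built indirectly only so the example nests cleanly inside a Markdown document.

```python
import re

FENCE = "`" * 3  # three backticks, built indirectly for clean nesting

# Equivalent to the pattern the bundled script uses for Markdown input.
FILE_PATTERN = re.compile(
    rf'## File: ([^\n]+)\n{FENCE}[^\n]*\n(.*?)\n{FENCE}', re.DOTALL
)

sample = (
    "## File: src/a.py\n"
    f"{FENCE}python\n"
    "print('a')\n"
    f"{FENCE}\n"
    "\n"
    "## File: notes.txt\n"
    f"{FENCE}\n"
    "line one\n"
    "line two\n"
    f"{FENCE}\n"
)

# Map each path to the text between its opening and closing fence.
blocks = {m.group(1).strip(): m.group(2) for m in FILE_PATTERN.finditer(sample)}

for path, text in blocks.items():
    print(path, repr(text))
```

Two limits follow from the pattern itself: a blank line between the heading and the opening fence prevents a match, and a file whose own content contains a line starting with three backticks is cut off at that line.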

### JSON Format

For JSON-style repomix output:
```json
{
  "files": [
    {
      "path": "relative/path/to/file.ext",
      "content": "file content here"
    }
  ]
}
```
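
The JSON path needs no regex. The sketch below mirrors the script's approach on a made-up payload: it decodes the document, walks the `files` array, and skips entries that have no `path`.

```python
import json

# Illustrative payload; a real pack may carry additional top-level keys.
packed = json.dumps({
    "files": [
        {"path": "src/a.py", "content": "print('a')\n"},
        {"path": "empty.txt", "content": ""},
        {"content": "entry without a path"},
    ]
})

data = json.loads(packed)

# Entries without a "path" are skipped, mirroring the bundled script.
files = {
    entry["path"]: entry.get("content", "")
    for entry in data.get("files", [])
    if entry.get("path")
}

for path, text in files.items():
    print(path, repr(text))
```

Because `json.loads` decodes escape sequences, trailing newlines and empty files survive here, unlike with the regex-based formats.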

## Common Use Cases

### Use Case 1: Unmix Claude Skills

Extract skills that were shared as a repomix file:

```bash
python3 scripts/unmix_repomix.py \
  "/path/to/skills.xml" \
  "/tmp/unmixed-skills"
```

Then review, validate, or install the extracted skills.

### Use Case 2: Extract Repository for Review

Extract a packed repository to review its structure and contents:

```bash
python3 scripts/unmix_repomix.py \
  "/path/to/repo-output.xml" \
  "/tmp/review-repo"

# Review the structure
tree /tmp/review-repo
```

### Use Case 3: Restore Working Files

Restore files from a repomix backup to a working directory:

```bash
python3 scripts/unmix_repomix.py \
  "/path/to/backup.xml" \
  "$HOME/workspace/restored-project"
```

## Validation Workflow

After unmixing, validate that the extracted files are correct:

1. **Check file count**: Verify the number of extracted files matches expectations
2. **Review structure**: Use `tree` or `ls -R` to inspect directory layout
3. **Spot check content**: Read a few key files to verify content integrity
4. **Run validation**: For skills, use the skill-creator validation tools

Refer to `references/validation-workflow.md` for detailed validation procedures, especially for unmixing Claude skills.
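
The file-count check in step 1 can be automated. The sketch below is one possible approach for XML packs, not a bundled tool: `count_xml_blocks` and `count_files_on_disk` are hypothetical helper names, and the demonstration uses a throwaway directory.

```python
import re
import tempfile
from pathlib import Path


def count_xml_blocks(repomix_text: str) -> int:
    """Count opening <file path="..."> tags in an XML pack."""
    return len(re.findall(r'<file path="[^"]+">', repomix_text))


def count_files_on_disk(output_dir: str) -> int:
    """Count regular files under output_dir, recursively."""
    return sum(1 for p in Path(output_dir).rglob("*") if p.is_file())


# Self-contained demonstration using a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "skill" / "scripts").mkdir(parents=True)
    (Path(tmp) / "skill" / "SKILL.md").write_text("# demo\n", encoding="utf-8")
    (Path(tmp) / "skill" / "scripts" / "run.py").write_text("pass\n", encoding="utf-8")
    pack = (
        '<file path="skill/SKILL.md">\n# demo\n</file>\n'
        '<file path="skill/scripts/run.py">\npass\n</file>\n'
    )
    expected, actual = count_xml_blocks(pack), count_files_on_disk(tmp)
    print(f"blocks in pack: {expected}, files on disk: {actual}")
```

A mismatch between the two numbers is a prompt to inspect the pack manually; packed files that themselves quote `<file path="...">` tags would inflate the first count.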

## Important Principles

### Always Specify Output Directory

Always provide an output directory to avoid cluttering the current working directory:

```bash
# Good: Explicit output directory
python3 scripts/unmix_repomix.py \
  "input.xml" "/tmp/output"

# Avoid: omitting it extracts to ./extracted under the current directory
python3 scripts/unmix_repomix.py "input.xml"
```

### Use Temporary Directories for Review

Extract to temporary directories first for review:

```bash
# Extract to /tmp for review
python3 scripts/unmix_repomix.py \
  "skills.xml" "/tmp/review-skills"

# Review the contents
tree /tmp/review-skills

# If satisfied, copy the extracted skill directories to the final destination
cp -r /tmp/review-skills/. ~/.claude/skills/
```

### Verify Before Overwriting

Never extract directly to important directories without review:

```bash
# Bad: Might overwrite existing files
python3 scripts/unmix_repomix.py \
  "repo.xml" "$HOME/workspace/my-project"

# Good: Extract to temp, review, then move
python3 scripts/unmix_repomix.py \
  "repo.xml" "/tmp/extracted"
# Review, then:
mv /tmp/extracted ~/workspace/my-project
```
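
Before moving a reviewed extraction into place, it can help to list which files would collide with existing ones. The sketch below is a hypothetical helper (`find_conflicts` is an assumed name, not part of the bundled script) that compares a staging directory against a target directory.

```python
import tempfile
from pathlib import Path


def find_conflicts(staging_dir: str, target_dir: str) -> list[str]:
    """Return relative paths of staged files that already exist in target_dir."""
    staging, target = Path(staging_dir), Path(target_dir)
    conflicts = []
    for path in sorted(staging.rglob("*")):
        if path.is_file():
            rel = path.relative_to(staging)
            if (target / rel).exists():
                conflicts.append(rel.as_posix())
    return conflicts


# Demonstration with two temporary directories standing in for real ones.
with tempfile.TemporaryDirectory() as staging, tempfile.TemporaryDirectory() as target:
    (Path(staging) / "README.md").write_text("new\n", encoding="utf-8")
    (Path(staging) / "src").mkdir()
    (Path(staging) / "src" / "main.py").write_text("pass\n", encoding="utf-8")
    (Path(target) / "README.md").write_text("old\n", encoding="utf-8")
    demo_conflicts = find_conflicts(staging, target)
    print(demo_conflicts)
```

An empty list means nothing in the target would be replaced by a file of the same relative path.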

## Troubleshooting

### No Files Extracted

**Issue**: Script completes but no files are extracted.

**Possible causes:**
- Wrong file format (not a repomix file)
- Unsupported repomix format version
- File path pattern doesn't match

**Solution:**
1. Verify the input file is a repomix output file
2. Check the format (XML/Markdown/JSON)
3. Examine the file structure manually
4. Refer to `references/repomix-format.md` for format details
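
One way to examine the file is to report which format markers it contains. The sketch below reuses the three checks from the script's `detect_format`; `describe_markers` is a hypothetical helper name and the sample text is illustrative. Because the script tests for XML first, a Markdown or JSON pack whose content merely mentions `<file path=` and `</file>` is treated as XML and can end up with no files extracted.

```python
def describe_markers(content: str) -> dict[str, bool]:
    """Report which format markers are present, using the script's own checks."""
    return {
        "xml": '<file path=' in content and '</file>' in content,
        "json": content.strip().startswith('{') and '"files"' in content,
        "markdown": '## File:' in content,
    }


# A Markdown-style pack whose content happens to mention the XML tags.
sample = (
    "## File: notes.md\n"
    'See <file path="x"> and </file> tags.\n'
)

markers = describe_markers(sample)
print(markers)

# More than one marker present means detection order decides the outcome.
if sum(markers.values()) > 1:
    print("Ambiguous input: the script checks XML, then JSON, then Markdown.")
```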

### Permission Errors

**Issue**: Cannot write to output directory.

**Solution:**
```bash
# Ensure output directory is writable
mkdir -p /tmp/output
chmod 755 /tmp/output

# Or use a directory you own
python3 scripts/unmix_repomix.py \
  "input.xml" "$HOME/extracted"
```

### Encoding Issues

**Issue**: Special characters appear garbled in extracted files.

**Solution:**
The script uses UTF-8 encoding by default. If issues persist:
- Check the original repomix file encoding
- Verify the file was created correctly
- Report the issue with specific character examples
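
To check the original file's encoding, the sketch below locates the first byte that is not valid UTF-8. `first_invalid_utf8` is a hypothetical helper name, not part of the bundled script, which opens its input with `encoding='utf-8'` and would raise `UnicodeDecodeError` on such a file.

```python
from typing import Optional, Tuple


def first_invalid_utf8(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (offset, byte value) of the first invalid UTF-8 byte, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start]
    return None


# Valid UTF-8 passes; the same text encoded as Latin-1 does not.
print(first_invalid_utf8("café".encode("utf-8")))
print(first_invalid_utf8("café".encode("latin-1")))
```

For a real pack, read the file in binary mode (for example with `Path(...).read_bytes()`) and pass the bytes to the helper; the offset gives a concrete location to include when reporting the issue.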

### Path Already Exists

**Issue**: Files exist at extraction path.

**Solution:**
```bash
# Option 1: Use a fresh output directory
python3 scripts/unmix_repomix.py \
  "input.xml" "/tmp/output-$(date +%s)"

# Option 2: Clear the directory first
rm -rf /tmp/output && mkdir /tmp/output
python3 scripts/unmix_repomix.py \
  "input.xml" "/tmp/output"
```

## Best Practices

1. **Extract to temp directories** - Always extract to `/tmp` or similar for initial review
2. **Verify file count** - Check that extracted file count matches expectations
3. **Review structure** - Use `tree` to inspect directory layout before use
4. **Check content** - Spot-check a few files to ensure content is intact
5. **Use validation tools** - For skills, use skill-creator validation after unmixing
6. **Preserve originals** - Keep the original repomix file as backup

## Resources

### scripts/unmix_repomix.py

Main unmixing script that:
- Parses repomix XML/Markdown/JSON formats
- Extracts file paths and content using regex (XML, Markdown) or `json.loads` (JSON)
- Creates directory structures automatically
- Writes files to their original locations
- Reports extraction progress and statistics

The script is self-contained and requires only the Python 3 standard library.

### references/repomix-format.md

Comprehensive documentation of repomix file formats including:
- XML format structure and examples
- Markdown format patterns
- JSON format schema
- File path encoding rules
- Content extraction patterns
- Format version differences

Load this reference when dealing with format-specific issues or supporting new repomix versions.

### references/validation-workflow.md

Detailed validation procedures for extracted content including:
- File count verification steps
- Directory structure validation
- Content integrity checks
- Skill-specific validation using skill-creator tools
- Quality assurance checklists

Load this reference when users need to validate unmixed skills or verify extraction quality.


---

## Referenced Files

> The following files are referenced in this skill and included for context.

### scripts/unmix_repomix.py

```python
#!/usr/bin/env python3
"""Unmix a repomix file to restore original file structure.

Supports XML, Markdown, and JSON repomix output formats.
"""

import re
import os
import sys
import json
from pathlib import Path


def unmix_xml(content, output_dir):
    """Extract files from repomix XML format."""
    # Pattern: <file path="...">content</file>
    file_pattern = r'<file path="([^"]+)">\n(.*?)\n</file>'
    matches = re.finditer(file_pattern, content, re.DOTALL)

    extracted_files = []
    for match in matches:
        file_path = match.group(1)
        file_content = match.group(2)

        # Create full output path
        full_path = Path(output_dir) / file_path
        full_path.parent.mkdir(parents=True, exist_ok=True)

        # Write the file
        with open(full_path, 'w', encoding='utf-8') as f:
            f.write(file_content)

        extracted_files.append(file_path)
        print(f"✓ Extracted: {file_path}")

    return extracted_files


def unmix_markdown(content, output_dir):
    """Extract files from repomix Markdown format."""
    # Pattern: ## File: path\n```\ncontent\n```
    file_pattern = r'## File: ([^\n]+)\n```[^\n]*\n(.*?)\n```'
    matches = re.finditer(file_pattern, content, re.DOTALL)

    extracted_files = []
    for match in matches:
        file_path = match.group(1).strip()
        file_content = match.group(2)

        # Create full output path
        full_path = Path(output_dir) / file_path
        full_path.parent.mkdir(parents=True, exist_ok=True)

        # Write the file
        with open(full_path, 'w', encoding='utf-8') as f:
            f.write(file_content)

        extracted_files.append(file_path)
        print(f"✓ Extracted: {file_path}")

    return extracted_files


def unmix_json(content, output_dir):
    """Extract files from repomix JSON format."""
    try:
        data = json.loads(content)
        files = data.get('files', [])

        extracted_files = []
        for file_entry in files:
            file_path = file_entry.get('path')
            file_content = file_entry.get('content', '')

            if not file_path:
                continue

            # Create full output path
            full_path = Path(output_dir) / file_path
            full_path.parent.mkdir(parents=True, exist_ok=True)

            # Write the file
            with open(full_path, 'w', encoding='utf-8') as f:
                f.write(file_content)

            extracted_files.append(file_path)
            print(f"✓ Extracted: {file_path}")

        return extracted_files
    except json.JSONDecodeError as e:
        print(f"Error: Failed to parse JSON: {e}")
        return []


def detect_format(content):
    """Detect the repomix file format."""
    # Check for XML format
    if '<file path=' in content and '</file>' in content:
        return 'xml'

    # Check for JSON format
    if content.strip().startswith('{') and '"files"' in content:
        return 'json'

    # Check for Markdown format
    if '## File:' in content:
        return 'markdown'

    return None


def unmix_repomix(repomix_file, output_dir):
    """Extract files from a repomix file (auto-detects format)."""

    # Read the repomix file
    with open(repomix_file, 'r', encoding='utf-8') as f:
        content = f.read()

    # Detect format
    format_type = detect_format(content)

    if format_type is None:
        print("Error: Could not detect repomix format")
        print("Expected XML (<file path=...>), Markdown (## File:), or JSON format")
        return []

    print(f"Detected format: {format_type.upper()}")

    # Extract based on format
    if format_type == 'xml':
        return unmix_xml(content, output_dir)
    elif format_type == 'markdown':
        return unmix_markdown(content, output_dir)
    elif format_type == 'json':
        return unmix_json(content, output_dir)

    return []


def main():
    """Main entry point."""
    if len(sys.argv) < 2:
        print("Usage: unmix_repomix.py <repomix_file> [output_directory]")
        print()
        print("Arguments:")
        print("  repomix_file       Path to the repomix output file (XML, Markdown, or JSON)")
        print("  output_directory   Optional: Directory to extract files to (default: ./extracted)")
        print()
        print("Examples:")
        print("  unmix_repomix.py skills.xml /tmp/extracted-skills")
        print("  unmix_repomix.py repo-output.md")
        sys.exit(1)

    repomix_file = sys.argv[1]
    output_dir = sys.argv[2] if len(sys.argv) > 2 else "./extracted"

    # Validate input file exists
    if not os.path.exists(repomix_file):
        print(f"Error: File not found: {repomix_file}")
        sys.exit(1)

    print(f"Unmixing {repomix_file}...")
    print(f"Output directory: {output_dir}\n")

    # Extract files
    extracted = unmix_repomix(repomix_file, output_dir)

    if not extracted:
        print("\n⚠️  No files extracted!")
        print("Check that the input file is a valid repomix output file.")
        sys.exit(1)

    print(f"\n✅ Successfully extracted {len(extracted)} files!")
    print(f"\nExtracted files are in: {output_dir}")


if __name__ == "__main__":
    main()

```

### references/repomix-format.md

```markdown
# Repomix File Format Reference

This document provides comprehensive documentation of repomix output formats for accurate file extraction.

## Overview

Repomix can generate output in three formats:
1. **XML** (default) - Most common, uses XML tags
2. **Markdown** - Human-readable, uses markdown code blocks
3. **JSON** - Structured data format

## XML Format

### Structure

The XML format is the default and most common repomix output:

```xml
<file_summary>
  [Summary and metadata about the packed repository]
</file_summary>

<directory_structure>
  [Text-based directory tree visualization]
</directory_structure>

<files>
  <file path="relative/path/to/file1.ext">
  content of file1
  </file>

  <file path="relative/path/to/file2.ext">
  content of file2
  </file>
</files>
```

### File Block Pattern

Each file is enclosed in a `<file>` tag with a `path` attribute:

```xml
<file path="src/main.py">
#!/usr/bin/env python3

def main():
    print("Hello, world!")

if __name__ == "__main__":
    main()
</file>
```

### Key Characteristics

- File path is in the `path` attribute (relative path)
- Content starts on the line after the opening tag
- Content ends on the line before the closing tag
- No leading/trailing blank lines in content (content is trimmed)

### Extraction Pattern

The unmixing script uses this regex pattern:

```python
r'<file path="([^"]+)">\n(.*?)\n</file>'
```

**Pattern breakdown:**
- `<file path="([^"]+)">` - Captures the file path from the path attribute
- `\n` - Expects a newline after opening tag
- `(.*?)` - Captures file content (non-greedy, allows multiline)
- `\n</file>` - Expects newline before closing tag

## Markdown Format

### Structure

The Markdown format uses code blocks to delimit file content:

````markdown
# Repository Summary

[Summary content]

## Directory Structure

```
directory/
  file1.txt
  file2.txt
```

## Files

## File: relative/path/to/file1.ext
```python
# File content here
def example():
    pass
```

## File: relative/path/to/file2.ext
```javascript
// Another file
console.log("Hello");
```
````

### File Block Pattern

Each file uses a level-2 heading with a "File:" prefix, followed immediately by a code block:

````markdown
## File: src/main.py
```python
#!/usr/bin/env python3

def main():
    print("Hello, world!")
```
````

### Key Characteristics

- File path follows the "## File: " heading, and the code block opens on the very next line
- Content is within a code block (triple backticks)
- Language hint may be included after opening backticks
- Content preserves original formatting

### Extraction Pattern

```python
r'## File: ([^\n]+)\n```[^\n]*\n(.*?)\n```'
```

**Pattern breakdown:**
- `## File: ([^\n]+)` - Captures file path from heading
- `\n` - Newline after heading
- `` ```[^\n]* `` - Opening code fence with optional language hint
- `\n(.*?)\n` - Captures content between the fences
- `` ``` `` - Closing code fence

## JSON Format

### Structure

The JSON format provides structured data:

```json
{
  "metadata": {
    "repository": "owner/repo",
    "timestamp": "2025-10-22T19:00:00Z"
  },
  "directoryStructure": "directory/\n  file1.txt\n  file2.txt\n",
  "files": [
    {
      "path": "relative/path/to/file1.ext",
      "content": "content of file1\n"
    },
    {
      "path": "relative/path/to/file2.ext",
      "content": "content of file2\n"
    }
  ]
}
```

### File Entry Structure

Each file is an object in the `files` array:

```json
{
  "path": "src/main.py",
  "content": "#!/usr/bin/env python3\n\ndef main():\n    print(\"Hello, world!\")\n\nif __name__ == \"__main__\":\n    main()\n"
}
```

### Key Characteristics

- Files are in a `files` array
- Each file has `path` and `content` keys
- Newlines in content are encoded as the JSON escape `\n`, which `json.loads` decodes back to real newlines
- Content is JSON-escaped (quotes, backslashes)

### Extraction Approach

```python
data = json.loads(content)
files = data.get('files', [])
for file_entry in files:
    file_path = file_entry.get('path')
    file_content = file_entry.get('content', '')
```

## Format Detection

### Detection Logic

The unmixing script auto-detects format using these checks:

1. **XML**: Contains `<file path=` and `</file>`
2. **JSON**: Starts with `{` and contains `"files"`
3. **Markdown**: Contains `## File:`

### Detection Priority

1. Check XML markers first (most common)
2. Check JSON structure second
3. Check Markdown markers last
4. Return `None` if no format matches

### Example Detection Code

```python
def detect_format(content):
    if '<file path=' in content and '</file>' in content:
        return 'xml'
    if content.strip().startswith('{') and '"files"' in content:
        return 'json'
    if '## File:' in content:
        return 'markdown'
    return None
```

## File Path Encoding

### Relative Paths

All file paths in repomix output are relative to the repository root:

```
src/components/Header.tsx
docs/README.md
package.json
```

### Special Characters

File paths may contain:
- Spaces: `"My Documents/file.txt"`
- Hyphens: `"some-file.md"`
- Underscores: `"my_script.py"`
- Dots: `"config.local.json"`

Paths are preserved exactly as they appear in the original repository.

### Directory Separators

- Always forward slashes (`/`) regardless of platform
- No leading slash (relative paths)
- No trailing slash for files

## Content Encoding

### Character Encoding

All formats use **UTF-8** encoding for both the container file and extracted content.

### Special Characters

- **XML**: Content may contain XML-escaped characters (`&lt;`, `&gt;`, `&amp;`)
- **Markdown**: Content is plain text within code blocks
- **JSON**: Content uses JSON string escaping (`\"`, `\\`, `\n`)

### Line Endings

- Original line endings are preserved
- May be `\n` (Unix), `\r\n` (Windows), or `\r` (old Mac)
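- The bundled script reads and writes in text mode, so Python's universal-newline handling applies: for XML and Markdown input, `\r\n` and `\r` are read as `\n`, and `\n` is written as the platform's default line ending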

## Edge Cases

### Empty Files

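The XML and Markdown extraction patterns both require a newline before and after the content, so the empty blocks shown here are not matched as empty files: the non-greedy match either fails or runs on into the next block. Only the JSON form extracts as an empty file.

**XML:**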
```xml
<file path="empty.txt">
</file>
```

**Markdown:**
````markdown
## File: empty.txt
```
```
````

**JSON:**
```json
{"path": "empty.txt", "content": ""}
```

### Binary Files

Binary files are typically **not included** in repomix output. The directory structure may list them, but they won't have content blocks.

### Large Files

Some repomix configurations may truncate or exclude large files. Check the file summary section for exclusion notes.

## Version Differences

### Repomix v1.x

- Uses XML format by default
- File blocks have consistent structure
- No automatic format version marker

### Repomix v2.x

- Adds JSON and Markdown format support
- May include version metadata in output
- Maintains backward compatibility with v1 XML

## Validation

### Successful Extraction Indicators

After extraction, verify:
1. **File count** matches expected number
2. **Directory structure** matches the `<directory_structure>` section
3. **Content integrity** - spot-check a few files
4. **No empty directories** unless explicitly included

### Common Format Issues

**Issue**: Files not extracted
- **Cause**: Format pattern mismatch
- **Solution**: Check format manually, verify repomix version

**Issue**: Partial content extraction
- **Cause**: Incorrect regex pattern (too greedy or not greedy enough)
- **Solution**: Check for nested tags or malformed blocks

**Issue**: Encoding errors
- **Cause**: Non-UTF-8 content in repomix file
- **Solution**: Verify source file encoding

## Examples

### Complete XML Example

```xml
<file_summary>
This is a packed repository.
</file_summary>

<directory_structure>
my-skill/
  SKILL.md
  scripts/
    helper.py
</directory_structure>

<files>
<file path="my-skill/SKILL.md">
---
name: my-skill
description: Example skill
---

# My Skill

This is an example.
</file>

<file path="my-skill/scripts/helper.py">
#!/usr/bin/env python3

def help():
    print("Helping!")
</file>
</files>
```

### Complete Markdown Example

````markdown
# Repository: my-skill

## Directory Structure

```
my-skill/
  SKILL.md
  scripts/
    helper.py
```

## Files

## File: my-skill/SKILL.md
```markdown
---
name: my-skill
description: Example skill
---

# My Skill

This is an example.
```

## File: my-skill/scripts/helper.py
```python
#!/usr/bin/env python3

def help():
    print("Helping!")
```
````

### Complete JSON Example

```json
{
  "metadata": {
    "repository": "my-skill"
  },
  "directoryStructure": "my-skill/\n  SKILL.md\n  scripts/\n    helper.py\n",
  "files": [
    {
      "path": "my-skill/SKILL.md",
      "content": "---\nname: my-skill\ndescription: Example skill\n---\n\n# My Skill\n\nThis is an example.\n"
    },
    {
      "path": "my-skill/scripts/helper.py",
      "content": "#!/usr/bin/env python3\n\ndef help():\n    print(\"Helping!\")\n"
    }
  ]
}
```

## References

- Repomix documentation: https://github.com/yamadashy/repomix
- Repomix output examples: Check the repomix repository for sample outputs
- XML specification: https://www.w3.org/XML/
- JSON specification: https://www.json.org/

```

### references/validation-workflow.md

```markdown
# Validation Workflow for Unmixed Content

This guide provides detailed validation procedures for verifying the quality and correctness of unmixed repomix content, with special focus on Claude Code skills.

## Overview

After unmixing a repomix file, validation ensures:
- All files were extracted correctly
- Directory structure is intact
- Content integrity is preserved
- Skills (if applicable) meet Claude Code requirements

## General Validation Workflow

### Step 1: File Count Verification

Compare the extracted file count with the expected count.

**Check extraction output:**
```
✅ Successfully extracted 20 files!
```

**Verify against directory structure:**
```bash
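# Count file entries (lines not ending in "/") in the directory structure section
sed -n '/<directory_structure>/,/<\/directory_structure>/p' repomix-file.xml \
  | grep -v -e 'directory_structure>' -e '/$' | grep -c .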

# Count extracted files
find /tmp/extracted -type f | wc -l
```

**Expected result:** Counts should match (accounting for any excluded binary files).

### Step 2: Directory Structure Validation

Compare the extracted structure with the repomix directory structure section.

**Extract directory structure from repomix file:**
```bash
# For XML format
sed -n '/<directory_structure>/,/<\/directory_structure>/p' repomix-file.xml
```

**Compare with extracted structure:**
```bash
tree /tmp/extracted
# or
ls -R /tmp/extracted
```

**Validation checks:**
- [ ] All directories present
- [ ] Nesting levels match
- [ ] No unexpected directories

### Step 3: Content Integrity Spot Checks

Randomly select 3-5 files to verify content integrity.

**Check file size:**
```bash
# Compare sizes (should be reasonable)
ls -lh /tmp/extracted/path/to/file.txt
```

**Check content:**
```bash
# Read the file and verify it looks correct
cat /tmp/extracted/path/to/file.txt
```

**Validation checks:**
- [ ] Content is readable (UTF-8 encoded)
- [ ] No obvious truncation
- [ ] Code/markup is properly formatted
- [ ] No XML/JSON escape artifacts (e.g., `&lt;` instead of `<`)

### Step 4: File Type Distribution

Verify that expected file types are present.

**Check file types:**
```bash
# List all file extensions
find /tmp/extracted -type f | sed 's/.*\.//' | sort | uniq -c
```

**Expected distributions:**
- Skills: `.md`, `.py`, `.sh`, `.json`, etc.
- Projects: Language-specific extensions
- Documentation: `.md`, `.txt`, `.pdf`, etc.

## Skill-Specific Validation

For Claude Code skills extracted from repomix files, perform additional validation.

### Step 1: Verify Skill Structure

Check that each skill has the required `SKILL.md` file.

**Find all SKILL.md files:**
```bash
find /tmp/extracted -name "SKILL.md"
```

**Expected result:** One `SKILL.md` per skill directory.

### Step 2: Validate YAML Frontmatter

Each `SKILL.md` must have valid YAML frontmatter with `name` and `description`.

**Check frontmatter:**
```bash
head -n 5 /tmp/extracted/skill-name/SKILL.md
```

**Expected format:**
```yaml
---
name: skill-name
description: Clear description with activation triggers
---
```

**Validation checks:**
- [ ] Opening `---` on line 1
- [ ] `name:` field present
- [ ] `description:` field present
- [ ] Closing `---` present
- [ ] Description mentions when to activate

### Step 3: Verify Resource Organization

Check that bundled resources follow the proper structure.

**Check directory structure:**
```bash
tree /tmp/extracted/skill-name
```

**Expected structure:**
```
skill-name/
├── SKILL.md (required)
├── scripts/ (optional)
│   └── *.py, *.sh
├── references/ (optional)
│   └── *.md
└── assets/ (optional)
    └── templates, images, etc.
```

**Validation checks:**
- [ ] `SKILL.md` exists at root
- [ ] Resources organized in proper directories
- [ ] No unexpected directories (e.g., `__pycache__`, `.git`)

### Step 4: Validate with skill-creator

Use the skill-creator validation tools for comprehensive validation.

**Run quick validation:**
```bash
~/.claude/plugins/marketplaces/anthropics-skills/skill-creator/scripts/quick_validate.py \
  /tmp/extracted/skill-name
```

**Expected output:**
```
✅ Skill structure is valid
✅ YAML frontmatter is valid
✅ Description is informative
✅ All resource references are valid
```

**Common validation errors:**
- Missing or malformed YAML frontmatter
- Description too short or missing activation criteria
- References to non-existent files
- Improper directory structure

### Step 5: Content Quality Checks

Verify the content quality of each skill.

**Check SKILL.md length:**
```bash
wc -l /tmp/extracted/skill-name/SKILL.md
```

**Recommended:** 100-500 lines for most skills (lean, with details in references).

**Check for TODOs:**
```bash
grep -i "TODO" /tmp/extracted/skill-name/SKILL.md
```

**Expected result:** No TODOs (unless intentional).

**Check writing style:**
```bash
# Should use imperative/infinitive form
head -n 50 /tmp/extracted/skill-name/SKILL.md
```

**Validation checks:**
- [ ] Uses imperative form ("Extract files from..." not "You extract files...")
- [ ] Clear section headings
- [ ] Code examples properly formatted
- [ ] Resources properly referenced

### Step 6: Bundled Resource Validation

Verify bundled scripts, references, and assets are intact.

**Check scripts are executable:**
```bash
ls -l /tmp/extracted/skill-name/scripts/
```

**Check for shebang in Python/Bash scripts:**
```bash
head -n 1 /tmp/extracted/skill-name/scripts/*.py
head -n 1 /tmp/extracted/skill-name/scripts/*.sh
```

**Expected:** `#!/usr/bin/env python3` or `#!/bin/bash`

**Verify references are markdown:**
```bash
file /tmp/extracted/skill-name/references/*.md
```

**Expected:** All files are text/UTF-8

**Validation checks:**
- [ ] Scripts have proper shebangs
- [ ] Scripts are executable (or will be made executable)
- [ ] References are readable markdown
- [ ] Assets are in expected formats

## Automated Validation Script

For batch validation of multiple skills:

```bash
#!/bin/bash
# validate_all_skills.sh

EXTRACTED_DIR="/tmp/extracted"
SKILL_CREATOR_VALIDATOR="$HOME/.claude/plugins/marketplaces/anthropics-skills/skill-creator/scripts/quick_validate.py"

echo "Validating all skills in $EXTRACTED_DIR..."

for skill_dir in "$EXTRACTED_DIR"/*; do
    if [ -d "$skill_dir" ] && [ -f "$skill_dir/SKILL.md" ]; then
        skill_name=$(basename "$skill_dir")
        echo ""
        echo "=== Validating: $skill_name ==="

        # Run quick validation
        if [ -f "$SKILL_CREATOR_VALIDATOR" ]; then
            python3 "$SKILL_CREATOR_VALIDATOR" "$skill_dir"
        else
            echo "⚠️  Skill creator validator not found, skipping automated validation"
        fi

        # Check for TODOs
        if grep -q "TODO" "$skill_dir/SKILL.md"; then
            echo "⚠️  Warning: Found TODOs in SKILL.md"
        fi

        # Count files
        file_count=$(find "$skill_dir" -type f | wc -l)
        echo "📁 Files: $file_count"
    fi
done

echo ""
echo "✅ Validation complete!"
```

**Usage:**
```bash
bash validate_all_skills.sh
```

## Quality Assurance Checklist

Use this checklist after unmixing:

### General Extraction Quality
- [ ] File count matches expected count
- [ ] Directory structure matches repomix directory listing
- [ ] No extraction errors in console output
- [ ] All files are UTF-8 encoded and readable
- [ ] No binary files incorrectly extracted as text

### Skill Quality (if applicable)
- [ ] Each skill has a valid `SKILL.md`
- [ ] YAML frontmatter is well-formed
- [ ] Description includes activation triggers
- [ ] Writing style is imperative/infinitive
- [ ] Resources are properly organized (scripts/, references/, assets/)
- [ ] No TODOs or placeholder text
- [ ] Scripts have proper shebangs and permissions
- [ ] References are informative markdown
- [ ] skill-creator validation passes

### Content Integrity
- [ ] Random spot-checks show correct content
- [ ] Code examples are properly formatted
- [ ] No XML/JSON escape artifacts
- [ ] File sizes are reasonable
- [ ] No truncated files

### Ready for Use
- [ ] Extracted to appropriate location
- [ ] Scripts made executable (if needed)
- [ ] Skills ready for installation to `~/.claude/skills/`
- [ ] Documentation reviewed and understood

## Common Issues and Solutions

### Issue: File Count Mismatch

**Symptom:** Fewer files extracted than expected.

**Possible causes:**
- Binary files excluded (expected)
- Malformed file blocks in repomix file
- Wrong format detection

**Solution:**
1. Check repomix `<file_summary>` section for exclusion notes
2. Manually inspect repomix file for file blocks
3. Verify format detection was correct

### Issue: Malformed YAML Frontmatter

**Symptom:** skill-creator validation fails on YAML.

**Possible causes:**
- Extraction didn't preserve line breaks correctly
- Content had literal `---` that broke frontmatter

**Solution:**
1. Manually inspect `SKILL.md` frontmatter
2. Ensure opening `---` is on line 1
3. Ensure closing `---` is on its own line
4. Check for stray `---` in description

### Issue: Missing Resource Files

**Symptom:** References to scripts/references not found.

**Possible causes:**
- Resource files excluded from repomix
- Extraction path mismatch

**Solution:**
1. Check repomix file for resource file blocks
2. Verify resource was in original packed content
3. Check extraction console output for errors

### Issue: Permission Errors on Scripts

**Symptom:** Scripts not executable.

**Possible causes:**
- Permissions not preserved during extraction
- Scripts need to be marked executable

**Solution:**
```bash
# Make all scripts executable
find /tmp/extracted -name "*.py" -exec chmod +x {} \;
find /tmp/extracted -name "*.sh" -exec chmod +x {} \;
```

### Issue: Encoding Problems

**Symptom:** Special characters appear garbled.

**Possible causes:**
- Repomix file not UTF-8
- Extraction script encoding mismatch

**Solution:**
1. Verify repomix file encoding: `file -i repomix-file.xml` (on macOS, `file -I`)
2. Re-extract with explicit UTF-8 encoding
3. Check original files for encoding issues

## Post-Validation Actions

### For Valid Skills

**Install to Claude Code:**
```bash
# Copy to skills directory
cp -r /tmp/extracted/skill-name ~/.claude/skills/

# Restart Claude Code to load the skill
```

**Package for distribution:**
```bash
~/.claude/plugins/marketplaces/anthropics-skills/skill-creator/scripts/package_skill.py \
  /tmp/extracted/skill-name
```

### For Invalid Skills

**Document issues:**
- Create an issues list
- Note specific validation failures
- Identify required fixes

**Fix issues:**
- Manually edit extracted files
- Re-validate after fixes
- Document changes made

**Re-package if needed:**
- Once fixed, re-validate
- Package for distribution
- Test in Claude Code

## Best Practices

1. **Always validate before use** - Don't skip validation steps
2. **Extract to temp first** - Review before installing
3. **Use automated tools** - skill-creator validation for skills
4. **Document findings** - Keep notes on any issues
5. **Preserve originals** - Keep the repomix file as backup
6. **Spot-check content** - Don't rely solely on automated checks
7. **Test in isolation** - Install one skill at a time for testing

## References

- Skill creator documentation: `~/.claude/plugins/marketplaces/anthropics-skills/skill-creator/SKILL.md`
- Skill authoring best practices: https://docs.claude.com/en/docs/agents-and-tools/agent-skills/best-practices.md
- Claude Code skills directory: `~/.claude/skills/`

```