Back to skills
SkillHub ClubShip Full StackFull Stack

pytorch-model-cli

Guidance for creating standalone CLI tools that perform neural network inference by extracting PyTorch model weights and reimplementing inference in C/C++. This skill applies when tasks involve converting PyTorch models to standalone executables, extracting model weights to portable formats (JSON), implementing neural network forward passes in C/C++, or creating CLI tools that load images and run inference without Python dependencies.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars
784
Hot score
99
Updated
March 20, 2026
Overall rating
C4.3
Composite score
4.3
Best-practice grade
B81.2

Install command

npx @skill-hub/cli install benchflow-ai-skillsbench-pytorch-model-cli

Repository

benchflow-ai/SkillsBench

Skill path: registry/terminal_bench_2.0/letta_skills_batch/terminal_bench_2_0_pytorch-model-cli/environment/skills/pytorch-model-cli

Guidance for creating standalone CLI tools that perform neural network inference by extracting PyTorch model weights and reimplementing inference in C/C++. This skill applies when tasks involve converting PyTorch models to standalone executables, extracting model weights to portable formats (JSON), implementing neural network forward passes in C/C++, or creating CLI tools that load images and run inference without Python dependencies.

Open repository

Best for

Primary workflow: Ship Full Stack.

Technical facets: Full Stack.

Target audience: everyone.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: benchflow-ai.

This is still a mirrored public skill entry. Review the repository before installing into production workflows.

What it helps with

  • Install pytorch-model-cli into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/benchflow-ai/SkillsBench before adding pytorch-model-cli to shared team environments
  • Use pytorch-model-cli for development workflows

Works across

Claude CodeCodex CLIGemini CLIOpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: pytorch-model-cli
description: Guidance for creating standalone CLI tools that perform neural network inference by extracting PyTorch model weights and reimplementing inference in C/C++. This skill applies when tasks involve converting PyTorch models to standalone executables, extracting model weights to portable formats (JSON), implementing neural network forward passes in C/C++, or creating CLI tools that load images and run inference without Python dependencies.
---

# PyTorch Model to CLI Tool Conversion

This skill provides guidance for tasks that require converting PyTorch models into standalone command-line tools, typically implemented in C/C++ for portability and independence from Python runtime.

## Task Recognition

This skill applies when the task involves:
- Converting a PyTorch model to a standalone executable
- Extracting model weights to a portable format (JSON, binary)
- Implementing neural network inference in C/C++
- Creating CLI tools that perform image classification or prediction
- Building inference tools using libraries like cJSON and lodepng

## Recommended Approach

### Phase 1: Environment Analysis

Before writing any code, thoroughly analyze the available resources:

1. **Identify the model architecture**
   - Read the model definition file (e.g., `model.py`) completely
   - Document all layer types, dimensions, and activation functions
   - Note any default parameters (hidden dimensions, number of classes)

2. **Examine available libraries**
   - Check for image loading libraries (lodepng, stb_image)
   - Check for JSON parsing libraries (cJSON, nlohmann/json)
   - Identify compilation requirements (headers, source files)

3. **Understand input requirements**
   - Determine expected image dimensions (e.g., 28x28 for MNIST)
   - Identify color format (grayscale, RGB, RGBA)
   - Document normalization requirements (divide by 255, mean/std normalization)

4. **Verify preprocessing pipeline**
   - If training code is available, examine data transformations
   - Match inference preprocessing exactly to training preprocessing
   - Common transformations: resize, grayscale conversion, normalization

### Phase 2: Weight Extraction

Extract model weights from PyTorch format to a portable format:

1. **Load the model checkpoint**
   ```python
   import torch
   import json

   # Load state dict
   state_dict = torch.load('model.pth', map_location='cpu')
   ```

2. **Convert tensors to lists**
   ```python
   weights = {}
   for key, tensor in state_dict.items():
       weights[key] = tensor.numpy().tolist()
   ```

3. **Save to JSON**
   ```python
   with open('weights.json', 'w') as f:
       json.dump(weights, f)
   ```

4. **Verify extraction**
   - Check that all expected layer weights are present
   - Verify dimensions match the model architecture
   - For a model with layers fc1, fc2, fc3: expect fc1.weight, fc1.bias, etc.

### Phase 3: Reference Implementation

Before implementing in C/C++, create a reference output:

1. **Run inference in PyTorch**
   ```python
   model.eval()
   with torch.no_grad():
       output = model(input_tensor)
       prediction = output.argmax().item()
   ```

2. **Save reference outputs**
   - Store intermediate layer outputs for debugging
   - Record the final prediction for verification
   - This allows validating the C/C++ implementation

### Phase 4: C/C++ Implementation

Implement the inference logic in C/C++:

1. **Image loading and preprocessing**
   - Load image using the available library (lodepng for PNG)
   - Handle color channel conversion (RGBA to grayscale if needed)
   - Apply normalization (typically divide by 255.0)
   - Flatten to 1D array in correct order (row-major)

2. **Weight loading**
   - Parse JSON file containing weights
   - Store weights in appropriate data structures
   - Verify dimensions during loading

3. **Forward pass implementation**
   - Implement matrix-vector multiplication for linear layers
   - Implement activation functions (ReLU, softmax, etc.)
   - Process layers in correct order

4. **Output handling**
   - Find argmax for classification tasks
   - Write prediction to output file
   - Ensure only prediction goes to stdout (not progress/debug info)

### Phase 5: Compilation and Testing

1. **Compile with appropriate flags**
   ```bash
   g++ -o cli_tool main.cpp lodepng.cpp cJSON.c -std=c++11 -lm
   ```
   - Double-check flag syntax (avoid concatenation errors like `-std=c++11-lm`)

2. **Test against reference**
   - Run the CLI tool on the same input used for reference
   - Compare output to PyTorch reference
   - Debug any discrepancies by checking intermediate values

## Verification Strategies

### Before Implementation
- [ ] Model architecture fully documented
- [ ] All layer dimensions verified
- [ ] Preprocessing requirements identified
- [ ] Reference output generated from PyTorch

### After Weight Extraction
- [ ] All expected keys present in JSON
- [ ] Weight dimensions match architecture
- [ ] Bias terms included for all layers

### After C/C++ Implementation
- [ ] Compilation succeeds without warnings
- [ ] Output matches PyTorch reference exactly
- [ ] CLI tool handles missing files gracefully
- [ ] Only prediction output goes to stdout

### Final Validation
- [ ] All test cases pass
- [ ] Memory properly managed (no leaks)
- [ ] Error messages go to stderr, not stdout

## Common Pitfalls

### Weight Extraction
- **Forgetting to use `map_location='cpu'`** when loading on CPU-only systems
- **Missing bias terms** - ensure both weights and biases are extracted
- **Incorrect tensor ordering** - PyTorch uses different conventions than some C libraries

### Preprocessing Mismatches
- **Wrong normalization** - training might use mean/std normalization, not just /255
- **Color channel issues** - PNG might be RGBA while model expects grayscale
- **Dimension ordering** - ensure row-major vs column-major consistency

### C/C++ Implementation
- **Matrix multiplication order** - verify (input × weights^T) vs (weights × input)
- **Activation function placement** - apply after linear layer, before next layer
- **Integer vs float division** - use 255.0, not 255, for normalization

### Compilation Issues
- **Flag concatenation** - ensure spaces between compiler flags
- **Missing libraries** - include all required source files (lodepng.cpp, cJSON.c)
- **Header dependencies** - verify all headers are in include path

### Output Handling
- **Verbose library output** - suppress or redirect debug/progress output
- **Newline handling** - ensure consistent line endings in output files
- **Buffering issues** - flush stdout before program exit

## Efficiency Guidelines

- Avoid repeatedly checking package managers; identify available tools first
- Create reference outputs early to catch implementation bugs quickly
- Review complete code before compilation attempts
- Minimize status-only updates; batch related operations
- Test with multiple inputs when possible, not just the provided test case