swamp-data
List model data, view version history, delete expired versions, and run garbage collection. Use when listing data, viewing versions, cleaning up old data, or configuring data retention. Triggers on "swamp data", "model data", "data list", "data get", "data versions", "garbage collection", "gc", "data gc", "clean up data", "old data", "data retention", "data lifecycle", "version history", "data cleanup", "prune data", "expire data", "ephemeral data".
Packaged view
This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.
Install command
npx @skill-hub/cli install systeminit-swamp-swamp-data
Repository
Skill path: .claude/skills/swamp-data
Best for
Primary workflow: Analyze Data & AI.
Technical facets: Full Stack, Data / AI.
Target audience: everyone.
License: Unknown.
Original source
Catalog source: SkillHub Club.
Repository owner: systeminit.
This is still a mirrored public skill entry. Review the repository before installing into production workflows.
What it helps with
- Install swamp-data into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/systeminit/swamp before adding swamp-data to shared team environments
- Use swamp-data for development workflows
Original source / Raw SKILL.md
---
name: swamp-data
description: List model data, view version history, delete expired versions, and run garbage collection. Use when listing data, viewing versions, cleaning up old data, or configuring data retention. Triggers on "swamp data", "model data", "data list", "data get", "data versions", "garbage collection", "gc", "data gc", "clean up data", "old data", "data retention", "data lifecycle", "version history", "data cleanup", "prune data", "expire data", "ephemeral data".
---
# Swamp Data Skill
Manage model data lifecycle through the CLI. All commands support `--json` for
machine-readable output.
**Verify CLI syntax:** If unsure about exact flags or subcommands, run
`swamp help data` for the complete, up-to-date CLI schema.
## Quick Reference
| Task | Command |
| ---------------------- | ----------------------------------------------------- |
| Search all data | `swamp data search --json` |
| Search with filters | `swamp data search --type output --since 1d --json` |
| Search by workflow | `swamp data search --workflow my-workflow --json` |
| Search by model | `swamp data search --model my-model --json` |
| Free-text search | `swamp data search vpc --json` |
| List model data | `swamp data list <model> --json` |
| List workflow data | `swamp data list --workflow <name> --json` |
| Get specific data | `swamp data get <model> <name> --json` |
| Get metadata only | `swamp data get <model> <name> --no-content --json` |
| Get data via workflow | `swamp data get --workflow <name> <data_name> --json` |
| View version history | `swamp data versions <model> <name> --json` |
| Run garbage collection | `swamp data gc --json` |
| Preview GC (dry run) | `swamp data gc --dry-run --json` |
| Rename data instance | `swamp data rename <model> <old> <new>` |
## Data Concepts
### What is Model Data?
Models produce data when methods execute. Each data item has:
- **Name**: Unique identifier within the model
- **Version**: Auto-incrementing integer (starts at 1)
- **Lifetime**: How long data persists
- **Content type**: MIME type of the data
- **Tags**: Key-value pairs for categorization (e.g., `type=resource`)
### Data Tags
Standard tags categorize data:
| Tag | Description |
| ----------------- | ----------------------------------------------- |
| `type=resource` | Structured JSON data (validated against schema) |
| `type=file` | Binary/text file artifacts (including logs) |
| `specName=<name>` | Output spec key name (for `data.findBySpec()`) |
### Lifetime Types
Data lifetime controls automatic expiration:
| Lifetime | Behavior |
| ----------- | ----------------------------------------------------- |
| `ephemeral` | Deleted after method invocation or workflow completes |
| `job` | Persists only while the creating job runs |
| `workflow` | Persists only while the creating workflow runs |
| Duration | Expires after time period (e.g., `1h`, `7d`, `1mo`) |
| `infinite` | Never expires (default for resources) |
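
As an illustration of the duration lifetimes in the table above, here is a minimal sketch of how a value such as `1h`, `7d`, or `1mo` could be turned into an expiry check. This is not swamp's implementation: `parseDurationMs`, `isExpiredByDuration`, and the unit lengths (in particular a 30-day month) are assumptions made for this example.

```typescript
// Hypothetical helper names; unit lengths are assumptions for this sketch.
const UNIT_MS: Record<string, number> = {
  h: 60 * 60 * 1000,
  d: 24 * 60 * 60 * 1000,
  w: 7 * 24 * 60 * 60 * 1000,
  mo: 30 * 24 * 60 * 60 * 1000, // assumption: one month is counted as 30 days
};

// Parse a duration such as "1h", "7d", or "1mo" into milliseconds.
// Returns null for values that are not durations (e.g. "infinite").
function parseDurationMs(value: string): number | null {
  const match = /^(\d+)(mo|h|d|w)$/.exec(value);
  if (match === null) return null;
  const count = match[1];
  const unitMs = UNIT_MS[match[2] ?? ""];
  if (count === undefined || unitMs === undefined) return null;
  return Number(count) * unitMs;
}

// True when a duration lifetime has elapsed since createdAt.
// Scope-based lifetimes (ephemeral, job, workflow) depend on run state,
// which this sketch does not model, so they are reported as not expired.
function isExpiredByDuration(lifetime: string, createdAt: Date, now: Date): boolean {
  const ms = parseDurationMs(lifetime);
  if (ms === null) return false;
  return now.getTime() - createdAt.getTime() > ms;
}
```

In practice, `swamp data gc --dry-run --json` is the authoritative way to see what has expired; the sketch only shows the arithmetic behind a duration lifetime.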
### Version Garbage Collection
Each data item can have multiple versions. The GC setting controls version
retention:
| GC Setting | Behavior |
| ----------- | ------------------------------------- |
| Integer (N) | Keep only the latest N versions |
| Duration | Keep versions newer than the duration |
| `infinite` | Keep all versions forever |
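
The three GC settings above can be pictured as a function that picks which version numbers survive. The sketch below is an assumption-laden illustration, not swamp's code: the `GcPolicy` union and `selectVersionsToKeep` are hypothetical names, and whether the real GC always retains the latest version under a duration setting is not stated in this document, so the sketch does not special-case it.

```typescript
// Hypothetical types for illustration only.
interface VersionInfo {
  version: number;
  createdAt: string; // ISO 8601 timestamp
}

type GcPolicy =
  | { kind: "count"; keep: number } // Integer (N): keep the latest N versions
  | { kind: "maxAge"; ms: number } // Duration: keep versions newer than this age
  | { kind: "infinite" }; // keep everything

// Return the version numbers that would be kept, newest first.
function selectVersionsToKeep(
  versions: VersionInfo[],
  policy: GcPolicy,
  now: Date,
): number[] {
  const newestFirst = versions.slice().sort((a, b) => b.version - a.version);
  switch (policy.kind) {
    case "infinite":
      return newestFirst.map((v) => v.version);
    case "count":
      return newestFirst.slice(0, policy.keep).map((v) => v.version);
    case "maxAge":
      return newestFirst
        .filter((v) => now.getTime() - Date.parse(v.createdAt) <= policy.ms)
        .map((v) => v.version);
  }
}
```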
## Search Data
Search across all models with extensive filtering options.
```bash
# All data in the repo
swamp data search --json
# Filter by type tag
swamp data search --type resource --json
# Data from last hour
swamp data search --since 1h --json
# Workflow-produced data
swamp data search --workflow test-data-fetch --json
# Model-specific data
swamp data search --model my-processor --json
# By content type
swamp data search --content-type application/json --json
# By owner type
swamp data search --owner-type workflow-step --json
# Free-text search
swamp data search vpc --json
# Filter by arbitrary tag
swamp data search --tag env=prod --json
# Multiple tags (AND logic)
swamp data search --tag env=prod --tag team=platform --json
# Combined filters (AND logic)
swamp data search --type resource --since 1d --workflow deploy --json
# Tags with other filters
swamp data search --tag env=staging --type resource --since 1d --json
# Limit results
swamp data search --limit 10 --json
```
**Search filters:**
| Filter | Description |
| ---------------- | ------------------------------------------------------- |
| `--type` | Data type tag (log, file, resource, data, output) |
| `--lifetime` | Lifetime (ephemeral, infinite, job, workflow, duration) |
| `--owner-type` | Owner type (model-method, workflow-step, manual) |
| `--workflow` | Workflow name tag |
| `--model` | Model name |
| `--content-type` | MIME content type |
| `--since` | Duration (1h, 1d, 7d, 1w, 1mo) |
| `--output` | Model output ID |
| `--run` | Workflow run ID |
| `--tag` | Arbitrary tag (KEY=VALUE, repeatable, AND logic) |
| `--streaming` | Only streaming data |
| `--limit` | Max results (default: 50) |
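
To make the "repeatable, AND logic" behavior of `--tag` concrete, the following sketch filters a tag map against several `KEY=VALUE` filters. It is a hedged illustration of the matching rule only; `parseTagFilter` and `matchesAllTags` are hypothetical names, and the real CLI may validate input differently.

```typescript
type Tags = Record<string, string>;

// Split a "KEY=VALUE" argument, as passed to a repeatable --tag flag.
function parseTagFilter(arg: string): [string, string] {
  const idx = arg.indexOf("=");
  if (idx <= 0) throw new Error(`expected KEY=VALUE, got "${arg}"`);
  return [arg.slice(0, idx), arg.slice(idx + 1)];
}

// AND logic: every filter must match for the data item to be included.
function matchesAllTags(tags: Tags, filters: string[]): boolean {
  return filters.every((filter) => {
    const [key, value] = parseTagFilter(filter);
    return tags[key] === value;
  });
}
```

With tags `{ env: "prod", team: "platform" }`, the filters `env=prod` and `team=platform` both match, while adding `team=data` excludes the item.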
## List Model Data
View all data items for a model, grouped by tag type.
```bash
swamp data list my-model --json
```
**Output shape:**
```json
{
"modelId": "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
"modelName": "my-model",
"modelType": "my-type",
"groups": [
{
"type": "log",
"items": [
{
"id": "uuid",
"name": "execution-log",
"version": 5,
"contentType": "text/plain",
"type": "log",
"streaming": false,
"size": 1024,
"createdAt": "2025-01-15T10:30:00Z"
}
]
},
{
"type": "resource",
"items": [
{
"id": "uuid",
"name": "state",
"version": 3,
"contentType": "application/json",
"type": "resource",
"streaming": false,
"size": 512,
"createdAt": "2025-01-15T10:30:00Z"
}
]
}
],
"total": 2
}
```
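
When consuming this output from a script, it can help to type it and flatten the groups. The sketch below assumes the JSON has exactly the shape shown above; the interface names and `summarizeList` are made up for this example and are not part of swamp.

```typescript
// Types mirror the output shape shown above; names are hypothetical.
interface DataListItem {
  id: string;
  name: string;
  version: number;
  contentType: string;
  type: string;
  streaming: boolean;
  size: number;
  createdAt: string;
}

interface DataListGroup {
  type: string;
  items: DataListItem[];
}

interface DataListOutput {
  modelId: string;
  modelName: string;
  modelType: string;
  groups: DataListGroup[];
  total: number;
}

// Flatten the grouped output into "type/name@vN" labels.
function summarizeList(rawJson: string): string[] {
  const output = JSON.parse(rawJson) as DataListOutput;
  const labels: string[] = [];
  for (const group of output.groups) {
    for (const item of group.items) {
      labels.push(`${group.type}/${item.name}@v${item.version}`);
    }
  }
  return labels;
}
```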
## Get Specific Data
Retrieve the latest version of a specific data item.
```bash
swamp data get my-model execution-log --json
# Metadata only (no content)
swamp data get my-model execution-log --no-content --json
```
**Output shape:**
```json
{
"id": "uuid",
"name": "execution-log",
"modelId": "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
"modelName": "my-model",
"modelType": "my-type",
"version": 5,
"contentType": "text/plain",
"lifetime": "7d",
"garbageCollection": "infinite",
"streaming": false,
"tags": { "type": "log" },
"ownerDefinition": {
"ownerType": "model-method",
"ownerRef": "my-model:create",
"definitionHash": "abc123..."
},
"createdAt": "2025-01-15T10:30:00Z",
"size": 1024,
"checksum": "sha256:...",
"contentPath": ".swamp/data/my-type/a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d/execution-log/5/raw",
"content": "..."
}
```
## Workflow-Scoped Data Access
List or get data produced by a workflow run instead of specifying a model.
```bash
# List all data from the latest run of a workflow
swamp data list --workflow test-data-fetch --json
# List data from a specific run
swamp data list --workflow test-data-fetch --run <run_id> --json
# Get specific data by name from a workflow run
swamp data get --workflow test-data-fetch output --json
# Get with specific version
swamp data get --workflow test-data-fetch output --version 2 --json
```
## View Version History
See all versions of a specific data item.
```bash
swamp data versions my-model state --json
```
**Output shape:**
```json
{
"dataName": "state",
"modelId": "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
"modelName": "my-model",
"modelType": "my-type",
"versions": [
{
"version": 3,
"createdAt": "2025-01-15T10:30:00Z",
"size": 1024,
"checksum": "sha256:...",
"isLatest": true
},
{
"version": 2,
"createdAt": "2025-01-14T09:00:00Z",
"size": 980,
"checksum": "sha256:...",
"isLatest": false
},
{
"version": 1,
"createdAt": "2025-01-13T08:00:00Z",
"size": 512,
"checksum": "sha256:...",
"isLatest": false
}
],
"total": 3
}
```
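
A common use of this output is choosing a rollback target. The sketch below picks the newest version that is not marked `isLatest`, assuming the shape shown above; `previousVersion` and the interface names are hypothetical.

```typescript
// Types mirror the fields of the versions output used here; names are hypothetical.
interface VersionEntry {
  version: number;
  createdAt: string;
  size: number;
  checksum: string;
  isLatest: boolean;
}

interface VersionsOutput {
  dataName: string;
  versions: VersionEntry[];
  total: number;
}

// Return the newest non-latest version number, or null if there is no history.
function previousVersion(output: VersionsOutput): number | null {
  const older = output.versions
    .filter((entry) => !entry.isLatest)
    .map((entry) => entry.version);
  return older.length === 0 ? null : Math.max(...older);
}
```

The returned number can then be passed as the third argument of `data.version("my-model", "state", N)` in a CEL expression.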
## Rename Data
Data instance names are permanent once created — deleting and recreating under a
new name loses version history and breaks any workflows or expressions that
reference the old name. Use `data rename` to non-destructively rename with
backwards-compatible forwarding. The old name becomes a forward reference that
transparently resolves to the new name.
**When to rename:**
- Refactoring naming conventions (e.g., `web-vpc` → `dev-web-vpc`)
- Reorganizing data after a model's purpose evolves
- Fixing typos in data names without losing history
```bash
# Rename a data instance
swamp data rename my-model old-name new-name
# With explicit repo directory
swamp data rename my-model old-name new-name --repo-dir ./my-repo
```
**What happens:**
1. Latest version of `old-name` is copied to `new-name` (version 1)
2. A tombstone is written on `old-name` with a `renamedTo` forward reference
3. Future lookups of `old-name` transparently resolve to `new-name`
4. Historical versions of `old-name` remain accessible via
`data.version("model", "old-name", N)`
**Forward reference behavior:**
- `data.latest("model", "old-name")` → resolves to `new-name` automatically
- `data.version("model", "old-name", 2)` → returns original version 2 (no
forwarding)
- `model.<name>.resource.<spec>.<old-name>` → resolves to new name in
expressions
**Important:** After renaming, update any workflows or models that produce data
under the old name. If a model re-runs and writes to the old name, it will
overwrite the forward reference.
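
The forwarding behavior described above can be modeled as following `renamedTo` links until a live name is reached. This is a conceptual sketch only: the `NameEntry` shape, `resolveName`, and the in-memory map are assumptions, and the depth limit of 5 is taken from the troubleshooting reference rather than from swamp's source.

```typescript
// Hypothetical in-memory model of names and tombstones.
interface NameEntry {
  renamedTo?: string; // set on a tombstone left behind by a rename
}

// Assumption: chains longer than 5 forwards are rejected.
const MAX_FORWARD_DEPTH = 5;

// Follow forward references from `name` to the name that currently holds data.
// Returns null when the name is unknown or the chain is too deep.
function resolveName(entries: Map<string, NameEntry>, name: string): string | null {
  let current = name;
  for (let hops = 0; hops <= MAX_FORWARD_DEPTH; hops++) {
    const entry = entries.get(current);
    if (entry === undefined) return null;
    if (entry.renamedTo === undefined) return current;
    current = entry.renamedTo;
  }
  return null;
}
```

If a model later writes fresh data under the old name, the tombstone is replaced by a normal entry, and in this model `resolveName` would return the old name again. That is the overwrite hazard the note above warns about.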
## Garbage Collection
Clean up expired data and old versions based on lifecycle settings.
**Preview what would be deleted:**
```bash
swamp data gc --dry-run --json
```
**Output shape:**
```json
{
"expiredDataCount": 2,
"expiredData": [
{
"type": "my-type",
"modelId": "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
"dataName": "cache",
"reason": "lifetime:ephemeral"
},
{
"type": "other-type",
"modelId": "b2c3d4e5-f6a7-4b8c-9d0e-1f2a3b4c5d6e",
"dataName": "log",
"reason": "lifetime:1h"
}
]
}
```
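
Before running GC for real, it can be useful to summarize why items would be deleted. The sketch below counts dry-run entries by their `reason` field, assuming the output shape shown above; `countByReason` and the interface names are hypothetical.

```typescript
// Types mirror the dry-run output shown above; names are hypothetical.
interface ExpiredDataEntry {
  type: string;
  modelId: string;
  dataName: string;
  reason: string;
}

interface GcDryRunOutput {
  expiredDataCount: number;
  expiredData: ExpiredDataEntry[];
}

// Count how many entries would be deleted for each reason.
function countByReason(output: GcDryRunOutput): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const entry of output.expiredData) {
    counts[entry.reason] = (counts[entry.reason] ?? 0) + 1;
  }
  return counts;
}
```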
**Run garbage collection:**
```bash
swamp data gc --json
swamp data gc -f --json # Skip confirmation prompt
```
**Output shape:**
```json
{
"dataEntriesExpired": 2,
"versionsDeleted": 2,
"bytesReclaimed": 15900000,
"dryRun": false,
"expiredEntries": [...]
}
```
## Accessing Data in Expressions
CEL expressions access model data in workflows and model inputs. Functions,
examples, and key rules are in
[references/expressions.md](references/expressions.md).
## Data Ownership
Data is owned by the creating model — see
[references/data-ownership.md](references/data-ownership.md) for owner fields,
validation rules, and viewing ownership.
## Data Storage
Data is stored in the `.swamp/data/` directory:
```
.swamp/data/{normalized-type}/{model-id}/{data-name}/
1/
raw # Actual data content
metadata.yaml # Version metadata
2/
raw
metadata.yaml
latest → 2/ # Symlink to latest version
```
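
The layout above implies a predictable path to each version's content, which matches the `contentPath` field returned by `swamp data get`. The helper below is a small sketch; `contentPathFor` is a hypothetical name, and it expects an already-normalized type because the normalization rule is not described here.

```typescript
// Build the path to the raw content of one version, following the layout above.
// `normalizedType` must already be normalized; this sketch does not guess how.
function contentPathFor(
  normalizedType: string,
  modelId: string,
  dataName: string,
  version: number,
): string {
  return [".swamp", "data", normalizedType, modelId, dataName, String(version), "raw"].join("/");
}
```

Reading files directly is mainly useful for recovery and debugging; `swamp data get` remains the supported way to fetch content.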
## When to Use Other Skills
| Need | Use Skill |
| -------------------- | ------------------------------- |
| Create/run models | `swamp-model` |
| View model outputs | `swamp-model` (output commands) |
| Create/run workflows | `swamp-workflow` |
| Repository structure | `swamp-repo` |
| Manage secrets | `swamp-vault` |
## References
- **Examples**: See [references/examples.md](references/examples.md) for data
query patterns, CEL expressions, and GC scenarios
- **Troubleshooting**: See
[references/troubleshooting.md](references/troubleshooting.md) for common
errors and fixes
- **Data design**: See [design/models.md](design/models.md) for data lifecycle
details
- **Expressions**: See [design/expressions.md](design/expressions.md) for CEL
syntax
---
## Referenced Files
> The following files are referenced in this skill and included for context.
### references/expressions.md
```markdown
# Accessing Data in Expressions
Use CEL expressions to access model data in workflows and model inputs.
**Note:** `model.<name>.resource.<spec>` requires the model to have previously
produced data (a method was run that called `writeResource`). If no data exists
yet, accessing `.resource` will fail with "No such key". Use
`swamp data list <model-name>` to verify data exists.
```yaml
# Access latest resource data via dot notation
value: ${{ model.my-model.resource.output.main.attributes.result }}
# Access specific version
value: ${{ data.version("my-model", "main", 2).attributes.result }}
# Access file metadata
path: ${{ model.my-model.file.content.primary.path }}
size: ${{ model.my-model.file.content.primary.size }}
# Lazy-load file contents
body: ${{ file.contents("my-model", "content") }}
```
## Data Namespace Functions
| Function | Description |
| -------------------------------------------- | ----------------------------------------- |
| `data.version(modelName, dataName, version)` | Get specific version of data |
| `data.latest(modelName, dataName)` | Get latest version of data |
| `data.listVersions(modelName, dataName)` | Get array of available version numbers |
| `data.findByTag(tagKey, tagValue)` | Find all data matching a tag |
| `data.findBySpec(modelName, specName)` | Find all data from a specific output spec |
**DataRecord structure** returned by these functions:
```json
{
"id": "uuid",
"name": "data-name",
"version": 3,
"createdAt": "2025-01-15T10:30:00Z",
"attributes": {/* data content */},
"tags": { "type": "resource" }
}
```
**Example usage:**
```yaml
# Get specific version
oldValue: ${{ data.version("my-model", "state", 2).attributes.value }}
# Get latest
current: ${{ data.latest("my-model", "output").attributes.result }}
# List versions for conditional logic
hasHistory: ${{ size(data.listVersions("my-model", "state")) > 1 }}
# Find all resources across models
allResources: ${{ data.findByTag("type", "resource") }}
# Find data from a specific workflow
workflowData: ${{ data.findByTag("workflow", "my-workflow") }}
# Find all instances from a factory model's output spec
subnets: ${{ data.findBySpec("my-scanner", "subnet") }}
```
**Key rules:**
- `model.<name>.resource.<specName>.<instanceName>` — accesses the latest
version of a resource. Works both within a workflow run (in-memory updates)
and across workflow runs (persisted data).
- `model.<name>.file.<specName>.<instanceName>` — accesses file metadata (path,
size, contentType). Same behavior as resource expressions.
- `data.latest(modelName, dataName)` — reads persisted data snapshot taken at
workflow start.
- Use `data.version()` function for specific versions
- Use `data.findByTag()` to query across models
- See the `swamp-workflow` skill's
[data-chaining reference](../../swamp-workflow/references/data-chaining.md)
for detailed guidance on expression choice in workflows.
```
### references/data-ownership.md
```markdown
# Data Ownership
Data artifacts are owned by the model (definition) that created them. This
ensures data integrity and prevents accidental overwrites.
## Owner Definition
Each data item tracks its owner through the `ownerDefinition` field:
| Field | Description |
| ---------------- | -------------------------------------------- |
| `ownerType` | `model-method`, `workflow-step`, or `manual` |
| `ownerRef` | Reference to the creating entity |
| `definitionHash` | Hash of the definition at creation time |
| `workflowId` | Set when created during workflow execution |
| `workflowRunId` | Specific run that created this data |
## Ownership Validation
When a model method writes data:
1. **New data**: Created with current model as owner
2. **Existing data**: Validates `ownerDefinition.definitionHash` matches
3. **Hash mismatch**: Write fails with ownership error
This prevents scenarios where multiple models accidentally share data names.
## Viewing Ownership
Use `swamp data get` to see ownership information:
```bash
swamp data get my-model state --json
```
```json
{
"name": "state",
"version": 3,
"ownerDefinition": {
"ownerType": "model-method",
"ownerRef": "my-model:create",
"definitionHash": "abc123..."
}
}
```
```
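
The ownership validation rules in `references/data-ownership.md` (new data is created with the current model as owner, and a write to existing data must match `definitionHash`) can be sketched as a single decision function. This is an illustration under assumptions, not swamp's implementation: `checkOwnership` and `WriteDecision` are hypothetical, and only the error text is taken from the troubleshooting reference.

```typescript
// Fields follow the ownerDefinition table; the decision type is hypothetical.
interface OwnerDefinition {
  ownerType: "model-method" | "workflow-step" | "manual";
  ownerRef: string;
  definitionHash: string;
}

type WriteDecision =
  | { ok: true; action: "create" | "update" }
  | { ok: false; error: string };

// Decide whether a writer with `writerHash` may write to a data name.
function checkOwnership(
  existing: OwnerDefinition | undefined,
  writerHash: string,
): WriteDecision {
  if (existing === undefined) {
    return { ok: true, action: "create" }; // new data: writer becomes the owner
  }
  if (existing.definitionHash === writerHash) {
    return { ok: true, action: "update" }; // same owner: new version is allowed
  }
  return {
    ok: false,
    error: "Data ownership validation failed - definition hash mismatch",
  };
}
```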
### references/examples.md
```markdown
# Data Query and Expression Examples
## Table of Contents
- [CEL Expression Quick Reference](#cel-expression-quick-reference)
- [CEL Path Patterns by Model Type](#cel-path-patterns-by-model-type)
- [Cross-Model Data References](#cross-model-data-references)
- [Data Discovery Patterns](#data-discovery-patterns)
- [Version Management](#version-management)
- [Rename Scenarios](#rename-scenarios)
- [Garbage Collection Scenarios](#garbage-collection-scenarios)
- [Workflow Data Access](#workflow-data-access)
## CEL Expression Quick Reference
| Expression Pattern | Description |
| ------------------------------------------------------------ | ------------------------------------------- |
| `model.<name>.resource.<spec>.<instance>.attributes.<field>` | Cross-model resource reference (PREFERRED) |
| `model.<name>.resource.result.result.attributes.stdout` | command/shell stdout |
| `model.<name>.file.<spec>.<instance>.path` | File path reference |
| `self.name` | Current model's name |
| `inputs.<name>` | Workflow or model runtime input |
| `env.<VAR_NAME>` | Environment variable |
| `vault.get("<vault-name>", "<key>")` | Vault secret |
| `data.version("<model>", "<name>", <version>)` | Specific version of data |
| `data.latest("<model>", "<name>")` | Latest version (snapshot at workflow start) |
| `data.findByTag("<key>", "<value>")` | Find data by tag |
| `data.findBySpec("<model>", "<spec>")` | Find all instances from a spec |
## CEL Path Patterns by Model Type
| Model Type | CEL Path | Notes |
| --------------- | -------------------------------------------------------------------- | ---------------------------------- |
| `command/shell` | `model.<name>.resource.result.result.attributes.stdout` | Built-in uses `result` for both |
| `@user/custom` | `model.<name>.resource.<spec>.<instance>.attributes.<field>` | You choose both names |
| Factory models | `model.<name>.resource.<spec>.<dynamic-instance>.attributes.<field>` | Instance varies (e.g., `vpc-1234`) |
## Cross-Model Data References
### Preferred: model.* Expressions
Always prefer `model.*` expressions over `data.latest()` for referencing other
models' data. The `model.*` expression provides:
- In-memory updates during workflow execution
- Type-safe attribute access
- Clear dependency tracking
```yaml
# PREFERRED: Cross-model resource reference
globalArguments:
vpcId: ${{ model.my-vpc.resource.vpc.main.attributes.VpcId }}
subnetId: ${{ model.public-subnet.resource.subnet.primary.attributes.SubnetId }}
# For command/shell models
globalArguments:
imageId: ${{ model.ami-lookup.resource.result.result.attributes.stdout }}
```
### When to Use data.latest()
Use `data.latest()` only when you need:
- A snapshot from workflow start (not live updates)
- Dynamic model name lookup
```yaml
# Snapshot at workflow start
oldValue: ${{ data.latest("my-model", "state").attributes.value }}
# Rollback scenario — get previous version
previousConfig: ${{ data.version("my-model", "config", 1).attributes.setting }}
```
## Data Discovery Patterns
### Find All Resources by Tag
```yaml
# Find all data with type=resource tag
allResources: ${{ data.findByTag("type", "resource") }}
# Find all data from a specific workflow
workflowData: ${{ data.findByTag("workflow", "deploy-workflow") }}
# Find all data with custom tag
prodResources: ${{ data.findByTag("env", "production") }}
```
### Find Factory Model Instances
Factory models produce multiple instances from a single output spec. Use
`data.findBySpec()` to discover all instances:
```yaml
# Get all subnets discovered by a scanner model
subnets: ${{ data.findBySpec("subnet-scanner", "subnet") }}
# Iterate over discovered instances in forEach
jobs:
- name: process-subnets
steps:
- name: tag-subnet-${{ self.subnet.attributes.subnetId }}
forEach:
item: subnet
in: ${{ data.findBySpec("subnet-scanner", "subnet") }}
task:
type: model_method
modelIdOrName: subnet-tagger
methodName: tag
inputs:
subnetId: ${{ self.subnet.attributes.subnetId }}
```
### Access Specific Named Instance
```yaml
# Known instance name from factory model
subnetA: ${{ model.subnet-scanner.resource.subnet.subnet-aaa.attributes.cidr }}
# Single-instance model — use descriptive instance name
vpcId: ${{ model.my-vpc.resource.vpc.main.attributes.VpcId }}
```
## Version Management
### Get Specific Version
```yaml
# Get version 2 of a model's data
previousState: ${{ data.version("my-model", "state", 2).attributes.value }}
# Use in rollback workflow
jobs:
- name: rollback
steps:
- name: restore-config
task:
type: model_method
modelIdOrName: config-restore
methodName: restore
inputs:
config: ${{ data.version("app-config", "config", 1).attributes }}
```
### Check Version History
```bash
# List all versions
swamp data versions my-model state --json
# Check if multiple versions exist (CEL expression for workflow YAML, not a shell command)
# hasHistory: ${{ size(data.listVersions("my-model", "state")) > 1 }}
```
### Version-Aware Conditional Logic
```yaml
# Only rollback if we have a previous version
jobs:
- name: maybe-rollback
steps:
- name: check-versions
condition: ${{ size(data.listVersions("app-config", "config")) > 1 }}
task:
type: model_method
modelIdOrName: app-config
methodName: rollback
```
## Rename Scenarios
### Basic Rename
```bash
# Rename a data instance
swamp data rename my-vpc web-vpc dev-web-vpc
# Output:
# Renamed "web-vpc" -> "dev-web-vpc" for my-vpc (aws/vpc)
# Version 3 copied as v1 under new name
# Old name "web-vpc" now forwards to "dev-web-vpc"
# WARNING: Any workflows or models that produce data under "web-vpc"
# will overwrite the forward reference. Update them to use
# "dev-web-vpc" instead.
```
### Verify Forward Reference Works
```bash
# Old name transparently resolves to new name
swamp data get my-vpc web-vpc --json
# Returns the data under "dev-web-vpc"
# Historical versions still accessible
swamp data versions my-vpc web-vpc --json
# Shows old versions before the rename
```
### After Rename: Update References
After renaming, update workflows and model inputs to use the new name:
```yaml
# Before rename
globalArguments:
vpcId: ${{ model.my-vpc.resource.vpc.web-vpc.attributes.VpcId }}
# After rename — update to new name
globalArguments:
vpcId: ${{ model.my-vpc.resource.vpc.dev-web-vpc.attributes.VpcId }}
```
The old expression still works via forward reference, but updating is
recommended to avoid surprises if the model re-runs and overwrites the forward
reference.
## Garbage Collection Scenarios
### Preview GC Impact
```bash
# Always preview before running GC
swamp data gc --dry-run --json
```
**Output shows:**
- `expiredDataCount` — count of expired data items
- `expiredData` — data items past their lifetime (with type, modelId, dataName,
reason)
### Configure Version Retention
In your model's resource spec:
```typescript
resources: {
"state": {
description: "Application state",
schema: StateSchema,
lifetime: "infinite", // Never expire
garbageCollection: 10, // Keep last 10 versions
},
"log": {
description: "Execution log",
schema: LogSchema,
lifetime: "7d", // Expire after 7 days
garbageCollection: 5, // Keep last 5 versions
},
},
```
### GC Policy Reference
| Lifetime | GC Setting | Behavior |
| ----------- | ---------- | --------------------------------------------- |
| `infinite` | 10 | Keep forever, but only last 10 versions |
| `7d` | 5 | Delete after 7 days, keep last 5 versions |
| `ephemeral` | 1 | Delete after method completes, keep 1 version |
| `workflow` | 3 | Delete when workflow ends, keep 3 versions |
### Run GC After Cleanup Workflows
```yaml
# After running a delete workflow, clean up orphaned data
jobs:
- name: cleanup
steps:
- name: delete-resources
task:
type: workflow
workflowIdOrName: delete-all-resources
- name: run-gc
task:
type: model_method
modelIdOrName: gc-runner
methodName: gc
```
## Workflow Data Access
### Access Data from Latest Workflow Run
```bash
# List data from the latest run of a workflow
swamp data list --workflow deploy-workflow --json
# Get specific data artifact
swamp data get --workflow deploy-workflow deployment-state --json
```
### Cross-Workflow Data References
The parent workflow creates resources and the sub-workflow uses them:
```yaml
# Parent workflow (create-networking)
jobs:
- name: create
steps:
- name: create-vpc
task:
type: model_method
modelIdOrName: networking-vpc
methodName: create
- name: tag
dependsOn:
- job: create
condition:
type: succeeded
steps:
- name: tag-resources
task:
type: workflow
workflowIdOrName: tag-networking
```
```yaml
# Model used by sub-workflow (tag-networking)
name: tag-vpc
globalArguments:
resourceId: ${{ model.networking-vpc.resource.vpc.main.attributes.VpcId }}
tagKey: ManagedBy
tagValue: Swamp
```
### Search Workflow-Produced Data
```bash
# Find all data created by a specific workflow
swamp data search --workflow deploy-workflow --json
# Filter by type within workflow data
swamp data search --workflow deploy-workflow --type resource --json
# Combine with time filter
swamp data search --workflow deploy-workflow --since 1d --json
```
```
### references/troubleshooting.md
```markdown
# Data Troubleshooting
## Table of Contents
- [Common Errors](#common-errors)
- ["No such key: resource" in CEL Expressions](#no-such-key-resource-in-cel-expressions)
- ["No data found for model"](#no-data-found-for-model)
- ["Data ownership validation failed"](#data-ownership-validation-failed)
- ["Version not found"](#version-not-found)
- ["GC deleted data I needed"](#gc-deleted-data-i-needed)
- [Rename Issues](#rename-issues)
- [Expression Debugging](#expression-debugging)
- [Data Recovery](#data-recovery)
- [Data Not Appearing After Method Run](#data-not-appearing-after-method-run)
## Common Errors
### "No such key: resource" in CEL Expressions
**Symptom**: Expression `model.<name>.resource.<spec>` fails with "No such key:
resource"
**Causes**:
1. **Model never executed** — the `resource` key only exists after a method
writes data
2. **Wrong spec or instance name** — the path must match exactly what
`writeResource(specName, instanceName, data)` used
3. **Spec name contains hyphens** — CEL interprets `-` as subtraction
**Solutions**:
```bash
# 1. Verify data exists
swamp data list <model-name> --json
# 2. Check exact spec and instance names
swamp data get <model-name> <data-name> --json
# 3. Run the create method first
swamp model method run <model-name> create --json
```
**CEL path anatomy**:
```
model.my-vpc.resource.vpc.main.attributes.VpcId
      ──────          ─── ────
      model name      |   └── instanceName (from writeResource 2nd arg)
                      └── specName (from writeResource 1st arg)
```
### "No data found for model"
**Symptom**: `swamp data list <model>` returns an empty result or an error
**Causes**:
1. Model exists but no method has been run yet
2. Model was deleted
3. Wrong model name (typo or case mismatch)
**Solutions**:
```bash
# List all models to verify name
swamp model search --json
# Run a method to produce data
swamp model method run <model-name> <method> --json
# Check if model input exists
swamp model get <model-name> --json
```
### "Data ownership validation failed"
**Symptom**:
`Error: Data ownership validation failed - definition hash mismatch`
**Causes**:
1. A different model with the same data name is trying to overwrite data
2. Model definition changed after data was created
3. Manual data manipulation
**Solutions**:
```bash
# View ownership information
swamp data get <model-name> <data-name> --json
# Check the ownerDefinition.definitionHash in output
# Compare with current model's hash
# If intentional: delete the old model that owns the data first
swamp model delete <old-model-name> --json
```
**Prevention**: Use unique, model-specific data names. Avoid generic names like
"output" or "state" across multiple models.
### "Version not found"
**Symptom**: `data.version("model", "name", N)` returns null or error
**Causes**:
1. Version number never existed
2. GC deleted the version
3. Typo in model or data name
**Solutions**:
```bash
# List all versions
swamp data versions <model-name> <data-name> --json
# Check GC settings
swamp data get <model-name> <data-name> --json
# Look at garbageCollection field
```
### "GC deleted data I needed"
**Symptom**: Data that was previously accessible is now gone
**Causes**:
1. Lifetime expired (e.g., `7d`, `ephemeral`)
2. GC setting pruned old versions
3. Manual GC run
**Solutions**:
1. **Prevent future issues** — adjust model's resource spec:
```typescript
resources: {
"state": {
schema: StateSchema,
lifetime: "infinite", // Never auto-expire
garbageCollection: 20, // Keep more versions
},
},
```
2. **Always preview GC** before running:
```bash
swamp data gc --dry-run --json
```
3. **Restore from backup** if available (`.swamp/data/` directory)
## Rename Issues
### "Old name and new name must be different"
**Symptom**: `swamp data rename` fails with this error
**Solution**: Provide a different new name. The old and new names cannot be
identical.
### Forward reference not resolving after rename
**Symptom**: `swamp data get model old-name` returns null after rename
**Causes**:
1. A model re-ran and wrote new data to the old name, overwriting the forward
reference tombstone
2. The rename chain is too deep (more than 5 levels)
**Solutions**:
```bash
# Check if the forward reference still exists
swamp data versions <model> <old-name> --json
# Look for the latest version — it should have lifecycle: "deleted" and renamedTo set
# If overwritten, re-run the rename
swamp data rename <model> <old-name> <new-name>
```
### Data appears twice in list after rename
**Symptom**: `swamp data list` shows the same data under both old and new names
**Solution**: This is a bug — the deduplication logic should prevent this. File
a bug report with `swamp issue`.
### "Model not found" during rename
**Symptom**: `swamp data rename <model> old new` fails with "Model not found"
**Solution**: The first argument must be a valid model ID or name. Check
available models:
```bash
swamp model search --json
```
## Expression Debugging
### Step 1: Verify Data Exists
```bash
# Check if model has any data
swamp data list <model-name> --json
# Get specific data item
swamp data get <model-name> <data-name> --json
```
### Step 2: Verify Path Components
For expression `model.my-vpc.resource.vpc.main.attributes.VpcId`:
| Component | Check Command |
| --------- | ------------------------------------------------------- |
| `my-vpc` | `swamp model get my-vpc --json` |
| `vpc` | `swamp data list my-vpc --json` (check specName in tag) |
| `main` | `swamp data list my-vpc --json` (check item name) |
| `VpcId` | `swamp data get my-vpc main --json` (check attributes) |
### Step 3: Validate Model Definition
```bash
swamp model validate <model-name> --json
```
Look for expression validation errors in the output.
### Step 4: Test Expression in Isolation
Create a test model that uses the expression and run validate:
```yaml
name: test-expression
globalArguments:
testValue: ${{ model.my-vpc.resource.vpc.main.attributes.VpcId }}
```
```bash
swamp model validate test-expression --json
```
## Data Recovery
### From Version History
If data was overwritten but older versions exist:
```bash
# List versions
swamp data versions <model-name> <data-name> --json
# Access a specific version in CEL (goes in workflow or model YAML, not the shell)
# oldValue: ${{ data.version("model-name", "data-name", 1).attributes.field }}
```
### From Workflow Run
If data was produced by a workflow:
```bash
# Find the workflow run
swamp workflow history search --json
# List data from that run
swamp data list --workflow <workflow-name> --run <run-id> --json
```
### From File System
Data is stored in `.swamp/data/`. Structure:
```
.swamp/data/{normalized-type}/{model-id}/{data-name}/
1/raw # Version 1 content
1/metadata.yaml # Version 1 metadata
2/raw # Version 2 content
2/metadata.yaml # Version 2 metadata
latest → 2/ # Symlink to latest
```
## Data Not Appearing After Method Run
**Symptom**: Method succeeded but `swamp data list` doesn't show new data
**Solutions**:
1. **Check method output** — verify dataHandles were returned:
```bash
swamp model output get <model-name> --json
# Look for artifacts in output
```
2. **Check data directory directly**:
```bash
ls -la .swamp/data/<type>/<model-id>/
```
```