long-running-agent

Framework for building AI agents that work effectively across multiple context windows on complex, long-running tasks. Use when building agents for multi-hour/multi-day projects, implementing persistent coding workflows, creating systems that need state management across sessions, or when an agent needs to make incremental progress on large codebases. Provides initializer and coding agent patterns, progress tracking, feature management, and session handoff strategies.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars: 1.

Hot score: 77.

Updated: March 20, 2026.

Overall rating: C (1.2).

Composite score: 1.2.

Best-practice grade: S (96.0).

Install command

npx @skill-hub/cli install refly-ai-skill-to-workflow-long-running-agent

Tags: ai-agents, workflow-automation, state-management, coding-framework

Repository

refly-ai/skill-to-workflow

Skill path: skills-source/zijiebijiben/long-running-agent

Best for

Primary workflow: Analyze Data & AI.

Technical facets: Full Stack, Data / AI.

Target audience: everyone.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: refly-ai.

This is still a mirrored public skill entry. Review the repository before installing into production workflows.

What it helps with

  • Install long-running-agent into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/refly-ai/skill-to-workflow before adding long-running-agent to shared team environments
  • Use long-running-agent for development workflows

Works across

Claude Code, Codex CLI, Gemini CLI, OpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: long-running-agent
description: Framework for building AI agents that work effectively across multiple context windows on complex, long-running tasks. Use when building agents for multi-hour/multi-day projects, implementing persistent coding workflows, creating systems that need state management across sessions, or when an agent needs to make incremental progress on large codebases. Provides initializer and coding agent patterns, progress tracking, feature management, and session handoff strategies.
---

# Long-Running Agent Framework

Framework for enabling AI agents to work effectively across many context windows on complex tasks.

## Core Problem

Long-running agents must work in discrete sessions where each new session begins with no memory of previous work. Without proper scaffolding, agents tend to:

1. **One-shot attempts** - Try to complete everything at once, running out of context mid-implementation
2. **Premature completion** - See partial progress and declare the job done
3. **Undocumented states** - Leave code in broken or undocumented states between sessions

## Two-Agent Solution

### 1. Initializer Agent (First Session Only)

Sets up the environment with all context future agents need:

- Create `init.sh` script for environment setup
- Generate comprehensive `feature_list.json` with all requirements
- Initialize `claude-progress.txt` for session logging
- Make initial git commit

See [references/initializer-prompt.md](references/initializer-prompt.md) for the full prompt template.

### 2. Coding Agent (Every Subsequent Session)

Makes incremental progress while maintaining clean state:

- Read progress files and git logs to get bearings
- Run basic tests to verify working state
- Work on ONE feature at a time
- Test end-to-end before marking complete
- Commit progress with descriptive messages
- Update progress file

See [references/coding-prompt.md](references/coding-prompt.md) for the full prompt template.

## Session Startup Sequence

Every coding agent session should begin with the following steps, in order:

```bash
pwd                        # Confirm the working directory
cat claude-progress.txt    # Read recent progress
cat feature_list.json      # Check feature status
git log --oneline -20      # Review recent commits
./init.sh                  # Start the dev environment
# Run a basic smoke test to verify the app works (project-specific command)
# Then select ONE failing feature from feature_list.json to work on
```

## Key Files

### feature_list.json

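A comprehensive list of all features, each with a pass/fail status. The file is JSON so that agents update a single structured field (`passes`) instead of rewriting free-form text, which discourages inappropriate edits.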

```json
{
  "features": [
    {
      "category": "functional",
      "description": "User can create new chat",
      "steps": ["Navigate to main", "Click New Chat", "Verify creation"],
      "passes": false
    }
  ]
}
```

Template: [assets/feature_list_template.json](assets/feature_list_template.json)
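As an illustration of how a session might read this file, here is a minimal Python sketch. The helper names (`load_features`, `next_failing_feature`, `progress_summary`) are hypothetical and not part of the skill; the sample data mirrors the example above.

```python
import json
from typing import Optional


def load_features(text: str) -> list[dict]:
    """Parse feature_list.json content and return its feature entries."""
    return json.loads(text)["features"]


def next_failing_feature(features: list[dict]) -> Optional[dict]:
    """Return the first feature whose `passes` flag is not true, or None."""
    for feature in features:
        if feature.get("passes") is not True:
            return feature
    return None


def progress_summary(features: list[dict]) -> str:
    """Summarize how many features are currently marked as passing."""
    done = sum(1 for f in features if f.get("passes") is True)
    return f"{done}/{len(features)} features passing"


SAMPLE = """
{
  "features": [
    {
      "category": "functional",
      "description": "User can create new chat",
      "steps": ["Navigate to main", "Click New Chat", "Verify creation"],
      "passes": false
    }
  ]
}
"""

features = load_features(SAMPLE)
print(progress_summary(features))
print(next_failing_feature(features)["description"])
```

Treating anything other than a literal `true` as failing is a deliberate choice in this sketch: a missing or malformed `passes` value then counts as unfinished work rather than as done.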

### claude-progress.txt

Session-by-session log of work completed. Each entry includes:

- Session timestamp
- Features worked on
- Changes made
- Current state
- Next steps

Template: [assets/progress_template.md](assets/progress_template.md)
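The entry fields listed above could be rendered with a small helper like the following sketch. The function names and section headings are assumptions chosen for illustration; the skill's own template uses slightly different headings.

```python
from datetime import datetime
from pathlib import Path


def _bullets(items: list[str]) -> str:
    """Render items as a Markdown bullet list, or a placeholder if empty."""
    return "\n".join(f"- {item}" for item in items) or "- (none)"


def format_session_entry(
    session_number: int,
    timestamp: datetime,
    features_worked_on: list[str],
    changes_made: list[str],
    current_state: list[str],
    next_steps: list[str],
) -> str:
    """Build one log entry: a timestamped heading plus four bullet sections."""
    sections = [
        ("Features Worked On", features_worked_on),
        ("Changes Made", changes_made),
        ("Current State", current_state),
        ("Next Steps", next_steps),
    ]
    lines = [f"## Session {session_number} - {timestamp:%Y-%m-%d %H:%M}", ""]
    for title, items in sections:
        lines += [f"### {title}", _bullets(items), ""]
    return "\n".join(lines)


def append_session_entry(path: Path, entry: str) -> None:
    """Append an entry to the progress file without touching earlier sessions."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write("\n" + entry)


entry = format_session_entry(
    2,
    datetime(2026, 3, 20, 14, 30),
    ["User can create new chat"],
    ["Added the New Chat button and handler"],
    ["App starts and the smoke test passes"],
    [],
)
print(entry)
```

Opening the file in append mode keeps earlier sessions intact, so the log remains a complete history for the next agent.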

### init.sh

Environment setup script that:

- Installs dependencies
- Starts development servers
- Sets up any required services

## Critical Rules

### For Feature List

- Never remove or edit feature descriptions or their test steps
- Change only the `passes` field
- Mark as passing ONLY after end-to-end verification
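One way to enforce these rules mechanically is to compare the feature list before and after a session. The sketch below is an assumption about how such a check could look, not something the skill ships; `illegal_feature_edits` is a hypothetical name.

```python
def illegal_feature_edits(before: list[dict], after: list[dict]) -> list[str]:
    """Report removed features and changes to any field other than `passes`."""
    problems = []
    if len(after) < len(before):
        problems.append(f"{len(before) - len(after)} feature(s) were removed")
    for index, (old, new) in enumerate(zip(before, after)):
        # Compare everything except the one field agents are allowed to change.
        old_rest = {k: v for k, v in old.items() if k != "passes"}
        new_rest = {k: v for k, v in new.items() if k != "passes"}
        if old_rest != new_rest:
            problems.append(f"feature {index}: fields other than 'passes' changed")
    return problems


before = [{"description": "User can create new chat", "passes": False}]
allowed = [{"description": "User can create new chat", "passes": True}]
reworded = [{"description": "User can open a chat", "passes": True}]

print(illegal_feature_edits(before, allowed))
print(illegal_feature_edits(before, reworded))
```

The comparison is positional, so it assumes features are never reordered. A list that carries `id` fields could match on those instead.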

### For Progress Tracking

- Always commit before session end
- Write descriptive commit messages
- Update progress file with summary
- Leave environment in mergeable state
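The "always commit before session end" rule can be checked by looking for uncommitted work. The sketch below parses the output of `git status --porcelain`; the helper names are hypothetical, and the parser handles only the simple two-character status format (renames and quoted paths are not covered).

```python
import subprocess


def uncommitted_paths(porcelain_output: str) -> list[str]:
    """Extract paths from porcelain lines: two status characters, a space, then the path."""
    return [line[3:] for line in porcelain_output.splitlines() if line.strip()]


def working_tree_is_clean(repo_dir: str = ".") -> bool:
    """Return True when git reports no modified, staged, or untracked files."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return not uncommitted_paths(result.stdout)


# Sample output: one modified tracked file and one untracked file.
sample = " M src/app.py\n?? notes.txt\n"
print(uncommitted_paths(sample))
```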

### For Testing

- Use browser automation for web apps (e.g., Puppeteer MCP)
- Test as a human user would
- Verify end-to-end, not just unit tests
- Document any known limitations

## Common Failure Modes & Solutions

| Problem | Solution |
|---------|----------|
| Agent one-shots entire project | Create a detailed feature list and work on one feature at a time |
| Declares victory too early | Check feature_list.json for failing tests |
| Leaves broken state | Run basic test at session start, fix first |
| Marks features done prematurely | Require end-to-end browser testing |
| Wastes time figuring out setup | Read init.sh, use established patterns |

## Adapting to Other Domains

This framework generalizes beyond web development. Key principles:

1. **Comprehensive task decomposition** - Break work into testable units
2. **Progress persistence** - Maintain state across sessions
3. **Incremental verification** - Test after each change
4. **Clean handoffs** - Leave work in resumable state


---

## Referenced Files

> The following files are referenced in this skill and included for context.

### references/initializer-prompt.md

````markdown
# Initializer Agent Prompt Template

Use this prompt for the FIRST session only. It sets up the environment for all subsequent coding sessions.

## Prompt

```
You are an AI agent tasked with setting up a software project for long-term development across multiple sessions.

PROJECT GOAL: {user_goal}

Your job in THIS SESSION is to set up the project foundation. Do NOT implement features yet.

## Required Setup Tasks

### 1. Create init.sh
Write a shell script that:
- Installs all dependencies
- Starts the development server
- Sets up any required services (databases, etc.)
- Can be run by future agents to quickly get the environment ready

### 2. Create feature_list.json
Based on the project goal, create a comprehensive JSON file listing ALL features needed for a complete implementation. Each feature should:
- Have a clear, testable description
- Include step-by-step verification criteria
- Be marked as "passes": false initially
- Be granular enough to complete in one session

Aim for 50-200+ features depending on project complexity. Be thorough - missing features means incomplete implementation.

Example structure:
{
  "project": "{project_name}",
  "features": [
    {
      "id": 1,
      "category": "core",
      "description": "Feature description",
      "steps": ["Step 1", "Step 2", "Step 3"],
      "passes": false,
      "priority": "high"
    }
  ]
}

### 3. Create claude-progress.txt
Initialize the progress tracking file with:
- Project overview
- Session 1 summary (this session)
- Environment setup notes
- Instructions for next agent

### 4. Initialize Git Repository
- git init
- Create .gitignore appropriate for the project
- Make initial commit with all setup files

### 5. Create Basic Project Structure
Set up the initial file/folder structure appropriate for the project type, but do NOT implement actual features.

## Rules
- Focus on setup, not implementation
- Be comprehensive in feature_list.json - this drives all future work
- Test that init.sh actually works
- Leave clear documentation for the next session
```

## Customization Points

Replace `{user_goal}` with the actual project description.

Adjust feature count expectations based on project complexity:
- Simple utility: 20-50 features
- Standard web app: 100-200 features
- Complex application: 200+ features

````
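Filling in `{user_goal}` programmatically needs some care, because the prompt also contains literal JSON braces that `str.format` would treat as replacement fields. The sketch below uses plain string replacement instead. `render_prompt` is a hypothetical helper, and the template here is a shortened stand-in for the full prompt.

```python
# Shortened stand-in for the initializer prompt, including its literal JSON braces.
INITIALIZER_TEMPLATE = """PROJECT GOAL: {user_goal}

Example structure:
{
  "project": "{project_name}",
  "features": []
}
"""


def render_prompt(template: str, values: dict[str, str]) -> str:
    """Replace only the named {placeholders}; all other braces stay untouched."""
    rendered = template
    for name, value in values.items():
        rendered = rendered.replace("{" + name + "}", value)
    return rendered


prompt = render_prompt(INITIALIZER_TEMPLATE, {"user_goal": "Build a chat app"})
print(prompt)
```

Only the placeholders passed in `values` are substituted, so `{project_name}` inside the example structure is left for the agent to fill in.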

### references/coding-prompt.md

````markdown
# Coding Agent Prompt Template

Use this prompt for ALL sessions after the initial setup.

## Prompt

```
You are an AI agent continuing work on an existing software project. The project has been set up with a feature list and progress tracking system.

## Session Startup (ALWAYS do these first)

1. Run `pwd` to confirm working directory
2. Read `claude-progress.txt` to understand recent work
3. Read `feature_list.json` to see current feature status
4. Run `git log --oneline -20` to see recent commits
5. Run `./init.sh` to start the development environment
6. Run a basic smoke test to verify the app is working

## Core Rules

### Work Incrementally
- Choose ONE feature from feature_list.json that has "passes": false
- Prioritize by: dependencies first, then priority field, then id order
- Complete and verify that feature before moving to the next
- Never try to implement multiple features simultaneously

### Test Thoroughly
- Test as a user would, not just with unit tests
- For web apps: use browser automation to verify end-to-end
- Only mark a feature as "passes": true after full verification
- If you cannot fully verify, leave it as false with notes

### Maintain Clean State
- Commit after completing each feature
- Use descriptive commit messages: "feat: implement {feature description}"
- Never leave the codebase in a broken state
- If you break something, fix it before session end

### Update Progress Tracking
At session end, update claude-progress.txt with:
- Session number and timestamp
- Features completed (with IDs)
- Features attempted but not completed (with reasons)
- Current state of the project
- Recommended next steps

### Feature List Rules
- NEVER delete or modify feature descriptions
- NEVER remove features from the list
- ONLY change the "passes" field from false to true
- It is unacceptable to remove or edit tests because this could lead to missing or buggy functionality

## Session End Checklist
Before ending your session:
- [ ] All work is committed to git
- [ ] claude-progress.txt is updated
- [ ] feature_list.json passes fields are accurate
- [ ] App is in a working state (no obvious bugs)
- [ ] Next agent can immediately start working

## Handling Problems

### If app is broken at session start:
1. Check git log for recent changes
2. Try `git diff` to see uncommitted changes
3. Consider `git stash` or `git checkout` to restore working state
4. Fix the issue before implementing new features

### If a feature is too complex:
1. Note this in claude-progress.txt
2. Break it into smaller sub-tasks if possible
3. Implement what you can, document what remains

### If you're unsure about implementation:
1. Check existing code patterns in the project
2. Prioritize consistency with existing code
3. Document your decisions in commit messages
```

## Usage Notes

This prompt assumes the initializer agent has already run. The coding agent should find:
- `init.sh` - Environment setup script
- `feature_list.json` - Complete feature list
- `claude-progress.txt` - Progress tracking file
- Git repository with initial commit

If any of these are missing, the initializer agent needs to run first.

````
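The prioritization rule in the prompt (dependencies first, then priority, then id) can be sketched as a selection function. Note that the feature schema shown in this skill has no dependency field, so `depends_on` below is a hypothetical addition, as are the function name and the priority ranking.

```python
from typing import Optional

# Assumed ranking for the `priority` field; unknown values sort last.
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}


def choose_next_feature(features: list[dict]) -> Optional[dict]:
    """Pick one failing feature whose dependencies all pass, by priority then id."""
    passing_ids = {f["id"] for f in features if f.get("passes") is True}
    candidates = [
        f for f in features
        if f.get("passes") is not True
        and all(dep in passing_ids for dep in f.get("depends_on", []))
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda f: (PRIORITY_RANK.get(f.get("priority"), len(PRIORITY_RANK)), f["id"]),
    )


features = [
    {"id": 1, "priority": "medium", "passes": False},
    {"id": 2, "priority": "high", "passes": False, "depends_on": [1]},
    {"id": 3, "priority": "low", "passes": False},
]
print(choose_next_feature(features)["id"])
```

Feature 2 has the highest priority but is blocked until feature 1 passes, so feature 1 is selected first in this example.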

### assets/feature_list_template.json

```json
{
  "project": "PROJECT_NAME",
  "created": "YYYY-MM-DD",
  "total_features": 0,
  "completed_features": 0,
  "features": [
    {
      "id": 1,
      "category": "core",
      "description": "Example: User can perform basic action",
      "steps": [
        "Navigate to the relevant page",
        "Perform the action",
        "Verify the expected result"
      ],
      "passes": false,
      "priority": "high",
      "notes": ""
    },
    {
      "id": 2,
      "category": "functional",
      "description": "Example: System handles edge case",
      "steps": [
        "Set up the edge case condition",
        "Trigger the functionality",
        "Verify graceful handling"
      ],
      "passes": false,
      "priority": "medium",
      "notes": ""
    }
  ]
}

```
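The template's `total_features` and `completed_features` counters can drift from the `features` array if they are edited by hand. A minimal sketch of recomputing them, assuming the counters are meant to mirror the array (`sync_counters` is a hypothetical helper):

```python
def sync_counters(document: dict) -> dict:
    """Return a copy whose counters are recomputed from the features array."""
    features = document.get("features", [])
    updated = dict(document)
    updated["total_features"] = len(features)
    updated["completed_features"] = sum(
        1 for f in features if f.get("passes") is True
    )
    return updated


document = {
    "project": "PROJECT_NAME",
    "total_features": 0,
    "completed_features": 0,
    "features": [
        {"id": 1, "description": "Example: User can perform basic action", "passes": True},
        {"id": 2, "description": "Example: System handles edge case", "passes": False},
    ],
}
synced = sync_counters(document)
print(synced["total_features"], synced["completed_features"])
```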

### assets/progress_template.md

```markdown
# Project Progress Log

## Project Overview
- **Project**: PROJECT_NAME
- **Goal**: Brief description of project goal
- **Started**: YYYY-MM-DD

---

## Session 1 - YYYY-MM-DD HH:MM

### Work Completed
- Initialized project structure
- Created feature_list.json with X features
- Set up init.sh for environment management
- Made initial git commit

### Current State
- Project is set up and ready for feature implementation
- All features marked as "passes": false
- Development environment verified working

### Next Steps
- Begin with feature #1: [description]
- Priority: [high/medium/low] features first

---

## Session 2 - YYYY-MM-DD HH:MM

### Work Completed
- Feature #X: [description] - COMPLETED
- Feature #Y: [description] - IN PROGRESS (reason)

### Issues Encountered
- [Any problems and how they were resolved]

### Current State
- X/Y features completed
- App is in working state
- [Any known issues]

### Next Steps
- Continue with feature #Z
- [Any specific recommendations]

---

<!-- Template for new sessions:

## Session N - YYYY-MM-DD HH:MM

### Work Completed
- Feature #X: [description] - COMPLETED/IN PROGRESS

### Issues Encountered
- [problems and resolutions]

### Current State
- X/Y features completed
- [app state]

### Next Steps
- [recommendations]

-->

```
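Since each entry in this template starts with a `## Session N` heading, the next session number can be derived from the log itself. The following is a minimal sketch under that assumption; `next_session_number` is a hypothetical helper.

```python
import re

# Matches headings such as "## Session 2 - 2026-03-20 14:30" at the start of a line.
SESSION_HEADING = re.compile(r"^## Session (\d+)\b", re.MULTILINE)


def next_session_number(progress_text: str) -> int:
    """Return one more than the highest session number found, or 1 for an empty log."""
    numbers = [int(n) for n in SESSION_HEADING.findall(progress_text)]
    return max(numbers, default=0) + 1


log = """# Project Progress Log

## Session 1 - 2026-03-20 09:00

## Session 2 - 2026-03-20 14:30

<!-- Template for new sessions:

## Session N - YYYY-MM-DD HH:MM

-->
"""
print(next_session_number(log))
```

The commented-out `## Session N` placeholder is ignored because the pattern requires digits.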
