
incident-response

Triage and manage production incidents. Trigger with "we have an incident", "production is down", "something is broken", "there's an outage", "SEV1", or when the user describes a production issue needing immediate response.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars: 9,954.

Hot score: 99.

Updated: March 20, 2026.

Overall rating: C (4.5).

Composite score: 4.5.

Best-practice grade: B (77.6).

Install command

npx @skill-hub/cli install anthropics-knowledge-work-plugins-incident-response

Repository

anthropics/knowledge-work-plugins

Skill path: engineering/skills/incident-response


Best for

Primary workflow: Ship Full Stack.

Technical facets: Full Stack.

Target audience: everyone.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: anthropics.

This is still a mirrored public skill entry. Review the repository before installing into production workflows.

What it helps with

  • Install incident-response into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/anthropics/knowledge-work-plugins before adding incident-response to shared team environments
  • Use incident-response to triage and manage production incidents, from detection through resolution and postmortem

Works across

Claude Code, Codex CLI, Gemini CLI, OpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: incident-response
description: Triage and manage production incidents. Trigger with "we have an incident", "production is down", "something is broken", "there's an outage", "SEV1", or when the user describes a production issue needing immediate response.
---

# Incident Response

Guide incident response from detection through resolution and postmortem.

## Severity Classification

| Level | Criteria | Response Time |
|-------|----------|---------------|
| SEV1 | Service down, all users affected | Immediate, all-hands |
| SEV2 | Major feature degraded, many users affected | Within 15 min |
| SEV3 | Minor feature issue, some users affected | Within 1 hour |
| SEV4 | Cosmetic or low-impact issue | Next business day |
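The table above can be encoded as a small lookup. The following is a minimal sketch, not part of the original skill: the `classify` helper, its `service_down` flag, and the `impact` labels (`"major"`, `"minor"`, `"cosmetic"`) are assumptions made for illustration. The criteria and response times are copied from the table.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    SEV1 = 1
    SEV2 = 2
    SEV3 = 3
    SEV4 = 4


@dataclass(frozen=True)
class Policy:
    criteria: str
    response_time: str


# Rows copied from the severity table.
POLICIES = {
    Severity.SEV1: Policy("Service down, all users affected", "Immediate, all-hands"),
    Severity.SEV2: Policy("Major feature degraded, many users affected", "Within 15 min"),
    Severity.SEV3: Policy("Minor feature issue, some users affected", "Within 1 hour"),
    Severity.SEV4: Policy("Cosmetic or low-impact issue", "Next business day"),
}


def classify(service_down: bool, impact: str) -> Severity:
    """Map a rough description of the incident to a severity level.

    `impact` is a hypothetical label: "major", "minor", or "cosmetic".
    Anything unrecognized falls through to SEV4.
    """
    if service_down:
        return Severity.SEV1
    if impact == "major":
        return Severity.SEV2
    if impact == "minor":
        return Severity.SEV3
    return Severity.SEV4
```

For example, `POLICIES[classify(False, "major")].response_time` evaluates to `"Within 15 min"`. Real triage usually weighs more signals than two inputs, so treat this as a starting point for discussion rather than a rule engine.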

## Response Framework

1. **Triage**: Classify severity, identify scope, assign incident commander
2. **Communicate**: Status page, internal updates, customer comms if needed
3. **Mitigate**: Stop the bleeding first, root cause later
4. **Resolve**: Implement fix, verify, confirm resolution
5. **Postmortem**: Blameless review, 5 whys, action items
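The five steps above can be sketched as an ordered checklist. This is an illustrative sketch only: the `Phase` enum, the `CHECKLIST` mapping, and `next_phase` are hypothetical names, and treating the steps as strictly sequential is a simplification, since communication typically continues throughout an incident.

```python
from enum import IntEnum
from typing import Optional


class Phase(IntEnum):
    TRIAGE = 1
    COMMUNICATE = 2
    MITIGATE = 3
    RESOLVE = 4
    POSTMORTEM = 5


# Tasks per phase, taken from the response framework list.
CHECKLIST = {
    Phase.TRIAGE: ["Classify severity", "Identify scope", "Assign incident commander"],
    Phase.COMMUNICATE: ["Status page", "Internal updates", "Customer comms if needed"],
    Phase.MITIGATE: ["Stop the bleeding first, root cause later"],
    Phase.RESOLVE: ["Implement fix", "Verify", "Confirm resolution"],
    Phase.POSTMORTEM: ["Blameless review", "5 whys", "Action items"],
}


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase that follows `current`, or None after the postmortem."""
    if current is Phase.POSTMORTEM:
        return None
    return Phase(current + 1)


def print_runbook() -> None:
    """Print every phase in order with its tasks."""
    for phase in Phase:
        print(f"{phase.value}. {phase.name.title()}")
        for task in CHECKLIST[phase]:
            print(f"   - {task}")
```

Calling `print_runbook()` walks the phases in the order the framework lists them, which keeps mitigation ahead of root-cause work.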

## Communication Templates

Provide clear, factual updates at a regular cadence. Each update should state what's happening, who's affected, what we're doing, and when the next update is.
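One way to keep every update complete is to make the four required fields explicit. The sketch below is an assumption about how such a template could look: the `StatusUpdate` class and its field names are hypothetical and do not come from the skill itself.

```python
from dataclasses import dataclass


@dataclass
class StatusUpdate:
    whats_happening: str
    whos_affected: str
    what_were_doing: str
    next_update: str

    def render(self) -> str:
        """Return the update as four labeled lines, one per required field."""
        return "\n".join(
            [
                f"What's happening: {self.whats_happening}",
                f"Who's affected: {self.whos_affected}",
                f"What we're doing: {self.what_were_doing}",
                f"Next update: {self.next_update}",
            ]
        )


# Made-up values, for illustration only.
example = StatusUpdate(
    whats_happening="Checkout requests are failing",
    whos_affected="Customers placing orders",
    what_were_doing="Rolling back the latest deploy",
    next_update="In 30 minutes",
)
```

Because all four fields are required constructor arguments, an update that omits the time of the next update fails at construction instead of going out incomplete.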

## Postmortem Format

Keep the postmortem blameless and focus on systems and processes. Include a timeline, root cause analysis (5 whys), what went well, what went poorly, and action items with owners and due dates.
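The postmortem sections listed above map naturally onto a small record type. This is a hedged sketch, assuming a plain in-memory structure: `Postmortem`, `ActionItem`, and `find_gaps` are hypothetical names, and the check only verifies that each section is filled in and that every action item has an owner and a due date.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ActionItem:
    description: str
    owner: Optional[str] = None
    due: Optional[date] = None


@dataclass
class Postmortem:
    timeline: List[str] = field(default_factory=list)
    five_whys: List[str] = field(default_factory=list)
    went_well: List[str] = field(default_factory=list)
    went_poorly: List[str] = field(default_factory=list)
    action_items: List[ActionItem] = field(default_factory=list)


def find_gaps(pm: Postmortem) -> List[str]:
    """List empty sections and action items missing an owner or due date."""
    gaps = []
    for name in ("timeline", "five_whys", "went_well", "went_poorly", "action_items"):
        if not getattr(pm, name):
            gaps.append(f"{name} is empty")
    for item in pm.action_items:
        if item.owner is None or item.due is None:
            gaps.append(f"action item needs owner and due date: {item.description}")
    return gaps
```

Running `find_gaps` before publishing gives a quick completeness check. It says nothing about the quality of the analysis, which still needs human review.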