
test-desiderata

Analyze and improve test code quality using Kent Beck's Test Desiderata framework. Use when analyzing test files, reviewing test code, identifying test quality issues, suggesting test improvements, or when asked to evaluate tests against best practices. Applies to unit tests, integration tests, and any automated test code.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars: 11
Hot score: 85
Updated: March 20, 2026
Overall rating: C (1.4)
Composite score: 1.4
Best-practice grade: B (81.2)

Install command

npx @skill-hub/cli install eferro-augmentedcode-skills-test-desiderata

Repository

eferro/augmentedcode-skills

Skill path: test-desiderata

Open repository

Best for

Primary workflow: Ship Full Stack.

Technical facets: Full Stack, Testing, Integration.

Target audience: everyone.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: eferro.

This is a mirrored public skill entry. Review the repository before installing it into production workflows.

What it helps with

  • Install test-desiderata into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/eferro/augmentedcode-skills before adding test-desiderata to shared team environments
  • Use test-desiderata for development workflows

Works across

Claude Code, Codex CLI, Gemini CLI, OpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: test-desiderata
description: Analyze and improve test code quality using Kent Beck's Test Desiderata framework. Use when analyzing test files, reviewing test code, identifying test quality issues, suggesting test improvements, or when asked to evaluate tests against best practices. Applies to unit tests, integration tests, and any automated test code.
---

# Test Desiderata

Analyze and improve tests using Kent Beck's Test Desiderata framework - 12 properties that make tests more valuable.

**Attribution:** All Test Desiderata concepts and principles are created by Kent Beck. Original content: https://testdesiderata.com/ and https://medium.com/@kentbeck_7670/test-desiderata-94150638a4b3

## Analysis Workflow

When analyzing tests:

1. **Read the test code** - Understand what's being tested and how
2. **Evaluate against principles** - Assess each relevant Test Desiderata property
3. **Identify tradeoffs** - Note where properties conflict or support each other
4. **Prioritize improvements** - Focus on high-impact issues first
5. **Suggest specific changes** - Provide concrete, actionable recommendations

## The 12 Test Desiderata Properties

These properties make tests more valuable. Some support each other, some interfere, and sometimes properties only seem to interfere (that's where design improvements help).

### 1. Isolated
Tests return the same results regardless of execution order. Tests don't depend on shared state, previous test results, or external ordering.

**Issues to detect:**
- Shared mutable state between tests
- Tests that must run in specific order
- Setup/teardown that affects other tests
- Database state dependencies

### 2. Composable
Test different dimensions of variability separately and combine results. Break complex scenarios into independent, reusable test components.

**Issues to detect:**
- Monolithic tests covering multiple concerns
- Inability to test dimensions independently
- Duplicated test setup across related tests
- Tests that can't be combined or reused

### 3. Deterministic
If nothing changes, test results don't change. No randomness, timing dependencies, or environmental variations.

**Issues to detect:**
- Random data generation
- Time-dependent assertions
- Flaky tests that pass/fail intermittently
- Network or external service dependencies

### 4. Fast
Tests run quickly, enabling frequent execution during development.

**Issues to detect:**
- Slow database operations
- Unnecessary sleep/wait calls
- Heavy file I/O
- External service calls
- Inefficient test data setup

### 5. Writable
Tests are cheap to write relative to the code being tested. Low friction for adding new tests.

**Issues to detect:**
- Excessive boilerplate
- Complex test setup
- Hard-to-understand test frameworks
- Difficult mocking/stubbing

### 6. Readable
Tests are comprehensible and invoke the motivation for writing them. Clear intent and behavior.

**Issues to detect:**
- Unclear test names
- Complex assertions without explanation
- Missing context about "why"
- Obscure test data
- Poor structure (Arrange-Act-Assert)

### 7. Behavioral
Tests are sensitive to behavior changes. If behavior changes, test results change.

**Issues to detect:**
- Tests that pass despite broken functionality
- Assertions that check implementation details only
- Insufficient coverage of edge cases
- Missing assertions on outcomes

### 8. Structure-insensitive
Tests don't change when code structure changes (refactoring doesn't break tests).

**Issues to detect:**
- Tests coupled to internal implementation
- Mocking private methods
- Assertions on internal state
- Tests breaking during refactoring despite unchanged behavior

### 9. Automated
Tests run without human intervention. No manual steps or verification required.

**Issues to detect:**
- Manual verification steps
- Console output requiring human inspection
- Interactive prompts
- Manual data setup

### 10. Specific
When tests fail, the cause is obvious. Failures point directly to the problem.

**Issues to detect:**
- Generic error messages
- Multiple assertions per test
- Tests covering too much functionality
- Unclear failure output

### 11. Predictive
If all tests pass, code is suitable for production. Tests catch issues before deployment.

**Issues to detect:**
- Missing critical scenarios
- Insufficient integration testing
- Gaps in error handling coverage
- Production-only configurations not tested

### 12. Inspiring
Passing tests inspire confidence in the system. Comprehensive coverage of important behaviors.

**Issues to detect:**
- Trivial tests that don't verify meaningful behavior
- Low coverage of critical paths
- Missing tests for known edge cases
- Tests that don't reflect real usage

## Analyzing Tradeoffs

Kent Beck's key insight: properties can support, interfere, or only seem to interfere with each other.

**Supporting properties:**
- Isolated + Deterministic → More reliable tests
- Fast + Automated → More frequent execution
- Readable + Specific → Easier debugging

**Interfering properties:**
- Predictive + Fast → Comprehensive tests are often slower
- Fast + Isolated → Complete isolation may require more setup
- Writable + Predictive → Simple tests may not catch all issues

**Only seeming to interfere (design opportunities):**
- Use Composable to make tests both Fast AND Predictive
- Break monolithic tests into focused ones (Specific + Fast)
- Smart test fixtures enable Writable + Isolated
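A sketch of the first bullet (the `discount` and `total` names are illustrative, not from the original): extracting a pure rule lets fast unit tests cover each dimension of variability exhaustively, while a single composed check keeps the suite Predictive.

```python
# Pure rule: fast, deterministic, cheap to test exhaustively.
def discount(quantity):
    if quantity >= 10:
        return 0.10
    if quantity >= 5:
        return 0.05
    return 0.0

def total(unit_price, quantity):
    return unit_price * quantity * (1 - discount(quantity))

# Fast unit tests cover the discount dimension on its own...
assert discount(1) == 0.0
assert discount(5) == 0.05
assert discount(10) == 0.10

# ...and one composed test verifies the pieces are wired together.
assert total(10.0, 10) == 90.0
```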

## Prioritizing Improvements

Focus improvements on:

1. **Safety issues** - Fix Isolated and Deterministic first (flaky tests erode trust)
2. **Feedback loop** - Improve Fast to enable frequent testing
3. **Maintainability** - Enhance Readable and Structure-insensitive for long-term health
4. **Confidence** - Strengthen Predictive and Inspiring for production readiness

Not all properties need perfect scores. Optimize for the tradeoffs that matter most for the specific codebase and team.

## Providing Recommendations

When suggesting improvements:

1. **Be specific** - Point to exact code locations
2. **Explain the principle** - Reference which Test Desiderata property is violated
3. **Show the impact** - Describe why it matters
4. **Suggest concrete fixes** - Provide actionable code examples
5. **Note tradeoffs** - Acknowledge when improvements conflict with other properties

Example format:
```
Issue: Test "test_user_creation" violates Isolated property
Location: Line 45 - shares database connection across tests
Impact: Test results depend on execution order, causing intermittent failures
Fix: Use fresh database connection per test with proper cleanup
Tradeoff: Slightly slower but much more reliable
```

## Additional Resources

For detailed examples and patterns, see [reference.md](reference.md)

For Kent Beck's original content:
- Website: https://testdesiderata.com/
- Original essay: https://medium.com/@kentbeck_7670/test-desiderata-94150638a4b3
- Video series: Each property has a 5-minute explanatory video on YouTube


---

## Referenced Files

> The following files are referenced in this skill and included for context.

### reference.md

```markdown
# Test Patterns and Examples

This document provides detailed examples of good and bad patterns for each Test Desiderata property.

**Attribution:** All Test Desiderata concepts are created by Kent Beck. This document applies those principles with concrete examples.

## 1. Isolated - Independence in Execution Order

### Bad: Shared State
```python
# Tests mutate a class-level variable shared across test methods
class TestUser:
    user_count = 0

    def test_create_user(self):
        TestUser.user_count += 1
        assert TestUser.user_count == 1  # Fails if run after test_delete_user

    def test_delete_user(self):
        TestUser.user_count -= 1
        assert TestUser.user_count == -1  # Depends on execution order
```

### Good: Independent Setup
```python
class TestUser:
    def setup_method(self, method):
        self.user_count = 0  # Fresh state before each test

    def test_create_user(self):
        self.user_count += 1
        assert self.user_count == 1

    def test_delete_user(self):
        self.user_count -= 1
        assert self.user_count == -1
```

## 2. Composable - Reusable Test Components

### Bad: Monolithic Test
```python
def test_user_workflow(self):
    # Tests too many things at once
    user = create_user(name="Alice", email="[email protected]")
    assert user.name == "Alice"
    
    user.update_email("[email protected]")
    assert user.email == "[email protected]"
    
    user.add_role("admin")
    assert "admin" in user.roles
    
    user.deactivate()
    assert not user.is_active
```

### Good: Composable Components
```python
def create_test_user(name="Alice", email="[email protected]"):
    return create_user(name=name, email=email)

def test_user_creation():
    user = create_test_user()
    assert user.name == "Alice"

def test_email_update():
    user = create_test_user()
    user.update_email("[email protected]")
    assert user.email == "[email protected]"

def test_role_assignment():
    user = create_test_user()
    user.add_role("admin")
    assert "admin" in user.roles
```

## 3. Deterministic - Consistent Results

### Bad: Time-Dependent
```python
def test_expiration():
    session = create_session(duration=60)
    time.sleep(61)  # Flaky - timing issues
    assert session.is_expired()
```

### Good: Controlled Time
```python
def test_expiration(mock_time):
    session = create_session(duration=60)
    mock_time.advance(seconds=61)
    assert session.is_expired()
```
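The `mock_time` fixture above is assumed rather than defined; a minimal hand-rolled fake clock (all names here are illustrative) shows the injection pattern that makes the test deterministic:

```python
class FakeClock:
    """Controllable stand-in for wall-clock time."""
    def __init__(self, start=0.0):
        self._now = float(start)

    def now(self):
        return self._now

    def advance(self, seconds):
        self._now += seconds


class Session:
    def __init__(self, clock, duration):
        self._clock = clock
        self._created_at = clock.now()  # Time is injected, never read from the OS
        self._duration = duration

    def is_expired(self):
        return self._clock.now() - self._created_at > self._duration


def create_session(clock, duration):
    return Session(clock, duration)


clock = FakeClock()
session = create_session(clock, duration=60)
assert not session.is_expired()
clock.advance(61)  # No real sleeping, so the test is fast and deterministic
assert session.is_expired()
```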

### Bad: Random Data
```python
def test_validation():
    email = f"user{random.randint(1, 1000)}@example.com"
    assert is_valid_email(email)  # Different input each run
```

### Good: Fixed Data
```python
def test_validation():
    assert is_valid_email("[email protected]")
    assert not is_valid_email("invalid-email")
```

## 4. Fast - Quick Execution

### Bad: Unnecessary I/O
```python
def test_user_service():
    # Writes to real database
    db = Database(connection_string)
    user = User(name="Alice")
    db.save(user)
    
    retrieved = db.get_user(user.id)
    assert retrieved.name == "Alice"
```

### Good: In-Memory
```python
def test_user_service():
    # Uses in-memory repository
    repo = InMemoryUserRepository()
    user = User(name="Alice")
    repo.save(user)
    
    retrieved = repo.get_user(user.id)
    assert retrieved.name == "Alice"
```

## 5. Writable - Low Friction

### Bad: Excessive Setup
```python
def test_checkout():
    db = setup_database()
    load_fixtures(db)
    user = create_user_with_payment_method(db)
    cart = create_cart(db, user)
    add_items_to_cart(db, cart, [
        {"sku": "ABC", "quantity": 2, "price": 10.99},
        {"sku": "XYZ", "quantity": 1, "price": 5.99}
    ])
    payment_gateway = MockPaymentGateway()
    # Finally test something...
```

### Good: Test Builders
```python
def test_checkout():
    order = (
        OrderBuilder()
        .with_user()
        .with_items([("ABC", 2), ("XYZ", 1)])
        .build()
    )
    
    result = checkout(order)
    assert result.success
```

## 6. Readable - Clear Intent

### Bad: Unclear Purpose
```python
def test_func():
    x = calc(5, 3, True)
    assert x == 8
```

### Good: Expressive
```python
def test_calculator_adds_numbers_when_add_mode_enabled():
    # Arrange
    calculator = Calculator()
    calculator.set_mode(Mode.ADD)
    
    # Act
    result = calculator.calculate(5, 3)
    
    # Assert
    assert result == 8, "Calculator should add 5 + 3 to get 8"
```

## 7. Behavioral - Sensitive to Behavior Changes

### Bad: Tests Implementation
```python
def test_user_repository():
    repo = UserRepository()
    # Tests internal SQL query structure
    assert repo._build_query() == "SELECT * FROM users WHERE id = ?"
```

### Good: Tests Behavior
```python
def test_user_repository():
    repo = UserRepository()
    user = User(id=1, name="Alice")
    repo.save(user)
    
    retrieved = repo.find_by_id(1)
    assert retrieved.name == "Alice"
```

## 8. Structure-insensitive - Resilient to Refactoring

### Bad: Coupled to Structure
```python
def test_order_processor():
    processor = OrderProcessor()
    # Tests private method
    assert processor._validate_items([item1, item2]) == True
    # Tests internal structure
    assert isinstance(processor._payment_gateway, PayPalGateway)
```

### Good: Tests Public Interface
```python
def test_order_processor():
    processor = OrderProcessor()
    order = Order(items=[item1, item2])
    
    result = processor.process(order)
    
    assert result.success
    assert result.payment_confirmed
```

## 9. Automated - No Manual Steps

### Bad: Manual Verification
```python
def test_report_generation():
    report = generate_report()
    print(report)  # Developer must manually check output
    # No assertions
```

### Good: Automated Checks
```python
def test_report_generation():
    report = generate_report()
    
    assert "Summary" in report
    assert report.record_count == 100
    assert report.generated_date == today()
```

## 10. Specific - Clear Failure Diagnosis

### Bad: Multiple Assertions
```python
def test_user_creation():
    user = create_user("Alice", "[email protected]", "admin")
    assert user.name == "Alice" and user.email == "[email protected]" and user.role == "admin"
    # Which assertion failed?
```

### Good: One Assertion per Test
```python
def test_user_has_correct_name():
    user = create_user("Alice", "[email protected]", "admin")
    assert user.name == "Alice"

def test_user_has_correct_email():
    user = create_user("Alice", "[email protected]", "admin")
    assert user.email == "[email protected]"

def test_user_has_correct_role():
    user = create_user("Alice", "[email protected]", "admin")
    assert user.role == "admin"
```
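When repeating the full Arrange step three times feels heavy, a table-driven variant keeps failures specific while staying writable (with pytest, `@pytest.mark.parametrize` achieves the same). The `User` and `create_user` stand-ins below are defined locally so the sketch runs standalone:

```python
import collections

User = collections.namedtuple("User", "name email role")

def create_user(name, email, role):
    # Minimal stand-in for the factory used in the examples above
    return User(name, email, role)

def test_user_fields():
    user = create_user("Alice", "[email protected]", "admin")
    expected = {"name": "Alice", "email": "[email protected]", "role": "admin"}
    # Each mismatch reports exactly which field was wrong
    for field, value in expected.items():
        actual = getattr(user, field)
        assert actual == value, f"wrong {field}: {actual!r}"
```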

## 11. Predictive - Production Readiness

### Bad: Missing Critical Scenarios
```python
def test_payment():
    # Only tests happy path
    payment = process_payment(amount=100, card="4111111111111111")
    assert payment.success
```

### Good: Comprehensive Coverage
```python
def test_payment_success():
    payment = process_payment(amount=100, card="4111111111111111")
    assert payment.success

def test_payment_insufficient_funds():
    payment = process_payment(amount=100, card=insufficient_funds_card)
    assert not payment.success
    assert payment.error == "Insufficient funds"

def test_payment_invalid_card():
    payment = process_payment(amount=100, card="invalid")
    assert not payment.success
    assert payment.error == "Invalid card number"

def test_payment_network_timeout():
    with mock_network_timeout():
        payment = process_payment(amount=100, card="4111111111111111")
        assert not payment.success
        assert payment.error == "Network timeout"
```

## 12. Inspiring - Confidence Building

### Bad: Trivial Test
```python
def test_user_class_exists():
    user = User()
    assert user is not None  # Tests nothing meaningful
```

### Good: Meaningful Verification
```python
def test_user_authentication_workflow():
    user = User(username="alice", password=hash_password("secret123"))
    
    # Successful login
    session = authenticate(username="alice", password="secret123")
    assert session.is_valid()
    assert session.user_id == user.id
    
    # Failed login with wrong password
    with pytest.raises(AuthenticationError):
        authenticate(username="alice", password="wrong")
```

## Common Patterns Across Properties

### Test Data Builders
Help with Writable, Readable, Composable:
```python
class UserBuilder:
    def __init__(self):
        self.name = "Default User"
        self.email = "[email protected]"
        self.role = "user"
    
    def with_name(self, name):
        self.name = name
        return self
    
    def with_admin_role(self):
        self.role = "admin"
        return self
    
    def build(self):
        return User(self.name, self.email, self.role)
```

### Fixture Factories
Help with Fast, Isolated, Writable:
```python
@pytest.fixture
def db_session():
    session = create_in_memory_session()
    yield session
    session.close()

@pytest.fixture
def sample_user(db_session):
    user = User(name="Alice")
    db_session.add(user)
    db_session.commit()
    return user
```
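The `create_in_memory_session` helper above is assumed; as a sketch, a stdlib-only version backed by `sqlite3` gives the same Fast + Isolated guarantees (schema and table names here are illustrative):

```python
import contextlib
import sqlite3

@contextlib.contextmanager
def in_memory_session():
    # A fresh in-memory database per test: no disk I/O, no shared state
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    try:
        yield conn
    finally:
        conn.close()

with in_memory_session() as db:
    db.execute("INSERT INTO users (name) VALUES (?)", ("Alice",))
    rows = db.execute("SELECT name FROM users").fetchall()
    assert rows == [("Alice",)]
```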

### Custom Assertions
Help with Readable, Specific:
```python
def assert_valid_email(email):
    """Assert email format is valid with clear error message"""
    if not re.match(r'^[\w\.-]+@[\w\.-]+\.\w+$', email):
        pytest.fail(f"'{email}' is not a valid email format")

def assert_user_has_role(user, expected_role):
    """Assert user has the expected role"""
    if expected_role not in user.roles:
        pytest.fail(
            f"User {user.name} does not have role '{expected_role}'. "
            f"Current roles: {user.roles}"
        )
```

```



---

## Skill Companion Files

> Additional files collected from the skill directory layout.

### README.md

```markdown
# Test Desiderata Skill

A Claude Code skill for analyzing and improving test quality using Kent Beck's Test Desiderata framework.

## Overview

This skill helps Claude Code evaluate test code across 12 quality dimensions and provide actionable recommendations for improvement. It's designed to work with any testing framework and programming language.

## What is Test Desiderata?

Test Desiderata is a framework created by Kent Beck that identifies 12 properties that make tests more valuable:

1. **Isolated** - Tests don't affect each other
2. **Composable** - Test components can be combined and reused
3. **Deterministic** - Same input always produces same result
4. **Fast** - Tests run quickly
5. **Writable** - Tests are easy to write
6. **Readable** - Tests are easy to understand
7. **Behavioral** - Tests verify behavior, not implementation
8. **Structure-insensitive** - Tests survive refactoring
9. **Automated** - Tests run without human intervention
10. **Specific** - Failures clearly indicate the problem
11. **Predictive** - Passing tests mean production-ready code
12. **Inspiring** - Tests build confidence in the system

## When to Use This Skill

The skill activates automatically when you:
- Analyze test files
- Review test code
- Ask about test quality or best practices
- Request suggestions for test improvements

You can also invoke it explicitly with `/test-desiderata`.

## Usage Examples

### Analyze Existing Tests

```
"Review the tests in tests/user_service_test.py and identify quality issues"
```

Claude will evaluate the tests against all 12 Test Desiderata properties and provide specific feedback.

### Check Specific Properties

```
"Are these tests well isolated?"
"How can I make these tests faster without sacrificing coverage?"
```

### Get Improvement Suggestions

```
"These tests are failing intermittently. What's wrong?"
"Suggest improvements for test readability"
```

### Understand Tradeoffs

```
"I want comprehensive tests but they're too slow. What should I do?"
```

Claude will explain which Test Desiderata properties conflict and suggest design patterns to resolve the tension.

## What You'll Get

When you use this skill, Claude provides:

1. **Specific Issues** - Exact code locations with problems
2. **Property Violations** - Which Test Desiderata principles are violated
3. **Impact Analysis** - Why the issue matters
4. **Concrete Fixes** - Actionable code examples
5. **Tradeoff Discussion** - When improvements conflict with other properties

Example output:
```
Issue: Test "test_user_creation" violates the Isolated property
Location: Line 45 - shares database connection across tests
Impact: Test results depend on execution order, causing intermittent failures
Fix: Use fresh database connection per test with proper cleanup
Tradeoff: Slightly slower but much more reliable
```

## Skill Contents

- **`SKILL.md`** - Main skill prompt with all 12 Test Desiderata properties
- **`reference.md`** - Detailed examples showing good vs. bad patterns for each property
- **`LICENSE.txt`** - Attribution to Kent Beck

## Philosophy

From Kent Beck:

> "Some properties support each other, some interfere with each other, and sometimes properties only SEEM to interfere - that's where good design makes the difference."

This skill helps you find those design opportunities where you can improve multiple properties simultaneously.

## Attribution

**All Test Desiderata concepts and principles are created by Kent Beck.**

Original resources:
- Website: [testdesiderata.com](https://testdesiderata.com/)
- Original essay: [Test Desiderata](https://medium.com/@kentbeck_7670/test-desiderata-94150638a4b3)
- Follow-up: [Programmer Test Principles](https://medium.com/@kentbeck_7670/programmer-test-principles-d01c064d7934)
- Videos: Each property has a 5-minute explanatory video on YouTube

This skill applies Kent Beck's principles to help you write better tests.

## Installation

### Project Installation
```bash
# From your project root
mkdir -p .claude/skills
cp -r path/to/test-desiderata .claude/skills/
```

### Personal Installation
```bash
# Available across all projects
mkdir -p ~/.claude/skills
cp -r path/to/test-desiderata ~/.claude/skills/
```

## License

MIT License with attribution to Kent Beck as creator of the Test Desiderata framework.

See [LICENSE.txt](LICENSE.txt) for full details.

```
