testing-strategies
Strategic guidance for choosing and implementing testing approaches across the test pyramid. Use when building comprehensive test suites that balance unit, integration, E2E, and contract testing for optimal speed and confidence. Covers multi-language patterns (TypeScript, Python, Go, Rust) and modern best practices including property-based testing, test data management, and CI/CD integration.
Packaged view
This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.
Install command
npx @skill-hub/cli install ancoleman-ai-design-components-testing-strategies
Repository
Skill path: skills/testing-strategies
Best for
Primary workflow: Analyze Data & AI.
Technical facets: Full Stack, DevOps, Data / AI, Testing, Integration.
Target audience: everyone.
License: Unknown.
Original source
Catalog source: SkillHub Club.
Repository owner: ancoleman.
This is a mirrored public skill entry; review the repository before installing it into production workflows.
What it helps with
- Install testing-strategies into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/ancoleman/ai-design-components before adding testing-strategies to shared team environments
- Use testing-strategies for development workflows
Works across
Favorites: 0.
Sub-skills: 0.
Aggregator: No.
Original source / Raw SKILL.md
---
name: testing-strategies
description: Strategic guidance for choosing and implementing testing approaches across the test pyramid. Use when building comprehensive test suites that balance unit, integration, E2E, and contract testing for optimal speed and confidence. Covers multi-language patterns (TypeScript, Python, Go, Rust) and modern best practices including property-based testing, test data management, and CI/CD integration.
---
# Testing Strategies
Build comprehensive, effective test suites by strategically selecting and implementing the right testing approaches across unit, integration, E2E, and contract testing levels.
## Purpose
This skill provides strategic frameworks for:
- **Test Type Selection**: Determine when to use unit vs. integration vs. E2E vs. contract testing
- **Test Pyramid Balancing**: Optimize test distribution for fast feedback and reliable coverage
- **Multi-Language Implementation**: Apply consistent testing patterns across TypeScript, Python, Go, and Rust
- **Test Data Management**: Choose appropriate strategies (fixtures, factories, property-based testing)
- **CI/CD Integration**: Integrate tests into automated pipelines with optimal execution patterns
Testing is foundational to reliable software. With microservices architectures and continuous delivery becoming standard in 2025, strategic testing across multiple levels is more critical than ever.
## When to Use This Skill
Invoke this skill when:
- Building a new feature that requires test coverage
- Designing a testing strategy for a new project
- Refactoring existing tests to improve speed or reliability
- Setting up CI/CD pipelines with testing stages
- Choosing between unit, integration, or E2E testing approaches
- Implementing contract testing for microservices
- Managing test data with fixtures, factories, or property-based testing
## The Testing Pyramid Framework
### Core Concept
The testing pyramid guides test distribution for optimal speed and confidence:
```
         /\
        /  \          E2E Tests (10%)
       /----\         - Slow but comprehensive
      /      \        - Full stack validation
     /--------\
    /          \      Integration Tests (20-30%)
   /            \     - Moderate speed
  /--------------\    - Component interactions
 /                \
/------------------\  Unit Tests (60-70%)
                       - Fast feedback
                       - Isolated units
```
**Key Principle**: More unit tests (fast, isolated), fewer E2E tests (slow, comprehensive). Integration tests bridge the gap.
### Modern Adaptations (2025)
**Microservices Adjustment**:
- Add contract testing layer between unit and integration
- Increase integration/contract tests to 30% (validate service boundaries)
- Reduce E2E tests to critical user journeys only
**Cloud-Native Patterns**:
- Use containers for integration tests (ephemeral databases, test services)
- Parallel execution for fast CI/CD feedback
- Risk-based test prioritization (focus on high-impact areas)
For detailed pyramid guidance, see `references/testing-pyramid.md`.
## Universal Testing Decision Tree
### Which Test Type Should I Use?
```
START: Need to test [feature]
Q1: Does this involve multiple systems/services?
├─ YES → Q2
└─ NO → Q3
Q2: Is this a critical user-facing workflow?
├─ YES → E2E Test (complete user journey)
└─ NO → Integration or Contract Test
Q3: Does this interact with external dependencies (DB, API, filesystem)?
├─ YES → Integration Test (real DB, mocked API)
└─ NO → Q4
Q4: Is this pure business logic or a pure function?
├─ YES → Unit Test (fast, isolated)
└─ NO → Component or Integration Test
```
### Test Type Selection Examples
| Feature | Test Type | Rationale |
|---------|-----------|-----------|
| `calculateTotal(items)` | Unit | Pure function, no dependencies |
| `POST /api/users` endpoint | Integration | Tests API + database interaction |
| User registration flow (form → API → redirect) | E2E | Critical user journey, full stack |
| Microservice A → B communication | Contract | Service interface validation |
| `formatCurrency(amount, locale)` | Unit + Property | Pure logic, many edge cases |
| Form validation logic | Unit | Isolated business rules |
| File upload to S3 | Integration | External service interaction |
For comprehensive decision frameworks, see `references/decision-tree.md`.
## Testing Levels in Detail
### Unit Testing (Foundation - 60-70%)
**Purpose**: Validate small, isolated units of code (functions, methods, components)
**Characteristics**:
- Fast (milliseconds per test)
- Isolated (no external dependencies)
- Deterministic (same input = same output)
- Broad coverage (many tests, small scope each)
**When to Use**:
- Pure functions (input → output)
- Business logic and algorithms
- Utility functions
- Component rendering (without integration)
- Validation logic
**Recommended Tools**:
- **TypeScript/JavaScript**: Vitest (primary; typically much faster than Jest), Jest (legacy)
- **Python**: pytest (industry standard)
- **Go**: testing package (stdlib) + testify (assertions)
- **Rust**: cargo test (stdlib)
For detailed patterns, see `references/unit-testing-patterns.md`.
### Integration Testing (Middle Layer - 20-30%)
**Purpose**: Validate interactions between components, modules, or services
**Characteristics**:
- Moderate speed (seconds per test)
- Partial integration (real database, mocked external APIs)
- Focused scope (test component boundaries)
- API and database validation
**When to Use**:
- API endpoints (request → response)
- Database operations (CRUD, queries)
- Service-to-service communication
- Event handlers and message processing
- File I/O operations
**Recommended Tools**:
- **TypeScript/JavaScript**: Vitest + MSW (API mocking), Supertest (HTTP testing)
- **Python**: pytest + pytest-httpserver, pytest-postgresql
- **Go**: testing + httptest, testcontainers
- **Rust**: cargo test + mockito, testcontainers
For detailed patterns, see `references/integration-testing-patterns.md`.
### End-to-End Testing (Top Layer - 10%)
**Purpose**: Validate complete user workflows across the entire application stack
**Characteristics**:
- Slow (minutes per test suite)
- Full integration (real browser, services, database)
- Wide scope (user journeys from start to finish)
- Prone to flakiness (requires careful design)
**When to Use**:
- Critical user journeys (login, checkout, payment)
- Cross-browser compatibility validation
- Real-world scenarios not covered by integration tests
- Regression prevention for core features
**Best Practices**:
- Limit E2E tests to high-value scenarios (not every edge case)
- Use stable selectors (data-testid, not CSS classes)
- Implement retry logic for network flakiness
- Run tests in parallel for speed
**Recommended Tools**:
- **All Languages**: Playwright (cross-browser, fast, Microsoft-backed)
For detailed patterns, see `references/e2e-testing-patterns.md`.
### Contract Testing (Microservices)
**Purpose**: Validate service interfaces and API contracts without full integration
**When to Use**:
- Microservices architecture
- Service-to-service communication
- API contract validation
- Reducing E2E testing overhead
**Recommended Tool**: Pact (pact.io) - supports TypeScript, Python, Go, Rust
For detailed patterns, see `references/contract-testing.md`.
## Test Data Management Strategies
### When to Use Each Approach
**Fixtures (Static Data)**:
- Pros: Deterministic, easy to debug
- Cons: Can become stale, doesn't test variety
- Use When: Testing known scenarios, regression tests
**Factories (Generated Data)**:
- Pros: Flexible, generates variety
- Cons: Less deterministic, harder to debug
- Use When: Need diverse test data, testing edge cases
**Property-Based Testing (Random Data)**:
- Pros: Finds edge cases not anticipated
- Cons: Can be slow, failures harder to reproduce
- Use When: Complex algorithms, parsers, validators
**Recommended Combination**:
- **Unit Tests**: Fixtures (known inputs) + Property-Based (edge cases)
- **Integration Tests**: Factories (flexible data) + Database seeding
- **E2E Tests**: Fixtures (reproducible scenarios)
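A minimal sketch of this combination at the unit level, pairing a fixture-style known case with a fast-check property (fast-check is listed below; `formatCurrency` and its expected output are hypothetical):

```typescript
import { test, expect } from 'vitest'
import fc from 'fast-check'
import { formatCurrency } from './format' // hypothetical function under test

// Fixture: a known input/output pair for regression coverage
test('formats USD with two decimals', () => {
  expect(formatCurrency(1234.5, 'en-US')).toBe('$1,234.50')
})

// Property: generated inputs to probe edge cases you did not anticipate
test('never throws for finite amounts', () => {
  fc.assert(
    fc.property(fc.double({ noNaN: true, noDefaultInfinity: true }), (amount) => {
      expect(() => formatCurrency(amount, 'en-US')).not.toThrow()
    })
  )
})
```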
**Property-Based Testing Tools**:
- **TypeScript/JavaScript**: fast-check
- **Python**: hypothesis (best-in-class)
- **Go**: gopter
- **Rust**: proptest (primary)
For detailed strategies, see `references/test-data-strategies.md`.
## Mocking Decision Matrix
### When to Mock vs. Use Real Dependencies
| Dependency | Unit Test | Integration Test | E2E Test |
|------------|-----------|------------------|----------|
| **Database** | Mock (in-memory) | Real (test DB, Docker) | Real (staging DB) |
| **External API** | Mock (MSW, nock) | Mock (MSW, VCR) | Real (or staging) |
| **Filesystem** | Mock (in-memory FS) | Real (temp directory) | Real |
| **Time/Date** | Mock (fake timers) | Mock (when determinism is needed) | Real (usually) |
| **Environment Variables** | Mock (setEnv) | Mock (test config) | Real (test env) |
| **Internal Services** | Mock (stub) | Real (or container) | Real |
**Guiding Principles**:
1. **Unit Tests**: Mock everything external
2. **Integration Tests**: Use real database (ephemeral), mock external APIs
3. **E2E Tests**: Use real everything (or staging equivalents)
4. **Contract Tests**: Mock nothing (test real interfaces)
For detailed mocking patterns, see `references/mocking-strategies.md`.
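As a small illustration of the Time/Date row above, a minimal Vitest sketch (the `isExpired` helper and its module path are hypothetical):

```typescript
import { test, expect, vi, afterEach } from 'vitest'
import { isExpired } from './subscription' // hypothetical helper that compares a date against "now"

afterEach(() => {
  vi.useRealTimers() // always restore real time after each test
})

test('treats past dates as expired', () => {
  vi.useFakeTimers()
  vi.setSystemTime(new Date('2025-06-01T00:00:00Z')) // freeze "now" to a known instant
  expect(isExpired(new Date('2025-01-01T00:00:00Z'))).toBe(true)
})
```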
## Language-Specific Quick Starts
### TypeScript/JavaScript
**Unit Testing with Vitest**:
```typescript
import { describe, test, expect } from 'vitest'
import { calculateTotal } from './cart'

test('calculates total with tax', () => {
  const items = [{ price: 10, quantity: 2 }]
  expect(calculateTotal(items, 0.1)).toBe(22)
})
```
**Integration Testing with MSW**:
```typescript
import { beforeAll, afterEach, afterAll } from 'vitest'
import { setupServer } from 'msw/node'
import { http, HttpResponse } from 'msw'

const server = setupServer(
  http.get('/api/user/:id', ({ params }) => {
    return HttpResponse.json({ id: params.id, name: 'Test User' })
  })
)

// Start request interception for this test file and clean up afterwards
beforeAll(() => server.listen())
afterEach(() => server.resetHandlers())
afterAll(() => server.close())
```
**E2E Testing with Playwright**:
```typescript
import { test, expect } from '@playwright/test'

test('user can checkout', async ({ page }) => {
  await page.goto('https://example.com')
  await page.getByRole('button', { name: 'Add to Cart' }).click()
  await expect(page.getByText('1 item in cart')).toBeVisible()
})
```
See `examples/typescript/` for complete working examples.
### Python
**Unit Testing with pytest**:
```python
def test_calculate_total_with_tax():
    items = [{"price": 10, "quantity": 2}]
    assert calculate_total(items, tax_rate=0.1) == 22
```
**Property-Based Testing with hypothesis**:
```python
from hypothesis import given, strategies as st

@given(st.lists(st.integers()))
def test_reverse_reverse_is_identity(lst):
    assert reverse(reverse(lst)) == lst
```
See `examples/python/` for complete working examples.
### Go and Rust
See `examples/go/` and `examples/rust/` for complete working examples in these languages.
## Coverage and Quality Metrics
### Meaningful Coverage Targets
**Recommended Targets**:
- **Critical Business Logic**: 90%+ coverage
- **API Endpoints**: 80%+ coverage
- **Utility Functions**: 70%+ coverage
- **UI Components**: 60%+ coverage (focus on logic, not markup)
- **Overall Project**: 70-80% coverage
**Anti-Pattern**: Aiming for 100% coverage leads to testing trivial code and false confidence.
**Coverage Tools**:
- **TypeScript/JavaScript**: Vitest coverage (c8/istanbul)
- **Python**: pytest-cov
- **Go**: go test -cover
- **Rust**: cargo-tarpaulin
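To hold these targets in CI, coverage thresholds can be enforced in the runner config; a minimal Vitest sketch (assuming a recent Vitest version that supports `coverage.thresholds`):

```typescript
// vitest.config.ts
import { defineConfig } from 'vitest/config'

export default defineConfig({
  test: {
    coverage: {
      provider: 'v8',
      // Fail the run when overall coverage drops below the project-wide target
      thresholds: {
        lines: 70,
        branches: 70,
        functions: 70,
        statements: 70
      }
    }
  }
})
```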
### Mutation Testing (Validate Test Quality)
Mutation testing introduces small, deliberate bugs (mutants) into the code and verifies that the existing tests fail, proving the suite can actually catch them.
**Tools**:
- **TypeScript/JavaScript**: Stryker Mutator
- **Python**: mutmut
- **Go**: go-mutesting
- **Rust**: cargo-mutants
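To make the idea concrete, a hand-written sketch of what the tools above automate (`isAdult` is a hypothetical function; Stryker and friends generate and run mutants like this for you):

```typescript
import { test, expect } from 'vitest'

// Original implementation
const isAdult = (age: number) => age >= 18

// A mutant a tool might generate: '>=' replaced by '>'
const isAdultMutant = (age: number) => age > 18

test('boundary case kills the off-by-one mutant', () => {
  expect(isAdult(18)).toBe(true)        // passes against the real code
  expect(isAdultMutant(18)).toBe(false) // the mutant behaves differently here, so a boundary test catches it
})
```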
For detailed coverage strategies, see `references/coverage-strategies.md`.
## CI/CD Integration Patterns
### Fast Feedback Loop Strategy
**Stage 1: Pre-Commit (< 30 seconds)**
- Lint and format checks
- Unit tests (critical paths only)
**Stage 2: On Commit (< 2 minutes)**
- All unit tests
- Static analysis
**Stage 3: Pull Request (< 5 minutes)**
- Integration tests
- Coverage reporting
**Stage 4: Pre-Merge (< 10 minutes)**
- E2E tests (critical paths)
- Cross-browser testing
### Parallel Execution
**Speed Up Test Runs**:
- **Vitest**: `vitest --threads`
- **Playwright**: `playwright test --workers=4`
- **pytest**: `pytest -n auto` (requires pytest-xdist)
- **Go**: `go test -parallel 4`
## Common Anti-Patterns to Avoid
### The Ice Cream Cone (Inverted Pyramid)
**Problem**: Too many E2E tests, too few unit tests
**Result**: Slow test suites, flaky tests, long feedback loops
**Solution**: Rebalance toward more unit tests, fewer E2E tests
### Testing Implementation Details
**Problem**: Tests coupled to internal structure, not behavior
**Solution**: Test behavior (inputs → outputs), not implementation
### Over-Mocking in Integration Tests
**Problem**: Mocking everything defeats purpose of integration tests
**Solution**: Use real databases (ephemeral), mock only external APIs
### Flaky E2E Tests
**Problem**: Tests fail intermittently due to timing issues
**Solutions**:
- Use auto-wait features (Playwright has built-in auto-wait)
- Avoid hardcoded waits (`sleep(1000)`)
- Use stable selectors (data-testid)
- Implement retry logic for network requests
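A brief Playwright sketch of these practices (the route and `data-testid` values are hypothetical):

```typescript
import { test, expect } from '@playwright/test'

test('order confirmation appears without hard-coded waits', async ({ page }) => {
  await page.goto('/checkout')

  // Stable selector instead of a CSS class that may change
  await page.getByTestId('submit-order').click()

  // Web-first assertion: Playwright auto-waits and retries instead of sleep(1000)
  await expect(page.getByTestId('order-confirmation')).toBeVisible()
})
```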
## Integration with Other Skills
**For form testing**: See `building-forms` skill for form-specific validation and submission testing patterns.
**For API testing**: See `api-patterns` skill for REST/GraphQL endpoint testing and contract validation.
**For CI/CD pipelines**: See `building-ci-pipelines` skill for test automation, parallel execution, and coverage reporting.
**For data visualization testing**: See `visualizing-data` skill for snapshot testing chart configurations and visual regression testing.
## Reference Documentation
For deeper exploration of specific topics:
- **`references/testing-pyramid.md`** - Detailed testing pyramid framework and balancing strategies
- **`references/decision-tree.md`** - Comprehensive decision frameworks for test type selection
- **`references/unit-testing-patterns.md`** - Unit testing patterns across languages
- **`references/integration-testing-patterns.md`** - Integration testing with databases, APIs, and containers
- **`references/e2e-testing-patterns.md`** - E2E testing best practices and Playwright patterns
- **`references/contract-testing.md`** - Consumer-driven contract testing with Pact
- **`references/test-data-strategies.md`** - Fixtures, factories, and property-based testing
- **`references/mocking-strategies.md`** - When and how to mock dependencies
- **`references/coverage-strategies.md`** - Meaningful coverage metrics and mutation testing
## Working Examples
Complete, runnable code examples are available in the `examples/` directory:
- **`examples/typescript/`** - Vitest unit/integration, Playwright E2E, MSW mocking, fast-check property tests
- **`examples/python/`** - pytest unit/integration, fixtures, Playwright E2E, hypothesis property tests
- **`examples/go/`** - stdlib testing, testify assertions, httptest integration
- **`examples/rust/`** - cargo test unit tests, proptest property tests, integration patterns
All examples include dependencies, usage instructions, and error handling.
---
## Referenced Files
> The following files are referenced in this skill and included for context.
### references/testing-pyramid.md
```markdown
# Testing Pyramid Framework
## Table of Contents
1. [Core Concept](#core-concept)
2. [Test Distribution Guidelines](#test-distribution-guidelines)
3. [Speed vs. Confidence Trade-offs](#speed-vs-confidence-trade-offs)
4. [Pyramid Variations](#pyramid-variations)
5. [Anti-Patterns](#anti-patterns)
6. [Balancing Strategies](#balancing-strategies)
## Core Concept
The testing pyramid is a visual metaphor for organizing test types to optimize for both speed and confidence:
```
         /\
        /  \          E2E Tests (10%)
       /----\         - Full stack validation
      /      \        - Slow execution (minutes)
     /--------\
    /          \      Integration Tests (20-30%)
   /            \     - Component interactions
  /--------------\    - Moderate speed (seconds)
 /                \
/------------------\  Unit Tests (60-70%)
                       - Isolated units
                       - Fast execution (milliseconds)
```
### Why This Shape?
**Foundation (Unit Tests - Wide Base)**:
- Fast feedback (milliseconds per test)
- Easy to debug (small scope)
- Cheap to maintain (simple dependencies)
- High confidence in individual components
**Middle (Integration Tests - Moderate)**:
- Validate component interactions
- Real database/API interactions
- Moderate speed (seconds per test)
- Confidence in integration points
**Top (E2E Tests - Narrow Peak)**:
- Expensive to run (minutes per suite)
- Prone to flakiness (many moving parts)
- Hard to debug (full stack involved)
- High confidence in complete workflows
## Test Distribution Guidelines
### The 70/20/10 Rule (Traditional Applications)
**70% Unit Tests**:
- Business logic
- Pure functions
- Utilities and helpers
- Component logic (without integration)
**20% Integration Tests**:
- API endpoints
- Database operations
- Service-to-service communication
- Component integration with state/data
**10% E2E Tests**:
- Critical user journeys
- Core business workflows
- Payment/checkout flows
- Authentication flows
### The 60/30/10 Rule (Microservices)
**60% Unit Tests**:
- Still the foundation
- Business logic within each service
**30% Integration/Contract Tests**:
- Service boundaries critical
- Consumer-driven contracts (Pact)
- API integration tests
- Database integration
**10% E2E Tests**:
- User-facing workflows only
- Avoid testing every service integration via E2E
### Measuring Your Distribution
Count tests by type and calculate percentages:
```bash
# TypeScript/JavaScript (Vitest)
vitest --reporter=json > test-results.json
# Python (pytest)
pytest --collect-only -q | wc -l
# Go
go test -v ./... 2>&1 | grep -c "^=== RUN"
# Rust
cargo test -- --list | wc -l
```
Analyze results:
- Too many E2E tests? → Slow CI, flaky tests
- Too many unit tests? → Missing integration validation
- Too many integration tests? → Slow feedback loop
## Speed vs. Confidence Trade-offs
### Speed Characteristics
| Test Type | Avg. Execution | 1000 Tests | Feedback Loop |
|-----------|----------------|------------|---------------|
| **Unit** | 5-10ms | ~5-10 seconds | Immediate |
| **Integration** | 50-500ms | ~1-8 minutes | Fast |
| **E2E** | 5-30 seconds | ~1.5-8 hours | Slow |
### Confidence Characteristics
| Test Type | Scope | Failure Detection | False Positives |
|-----------|-------|-------------------|-----------------|
| **Unit** | Single function | Logic errors | Low |
| **Integration** | Component boundaries | Interface errors | Medium |
| **E2E** | Full application | User-facing errors | High (flaky) |
### Optimal Balance
**Goal**: Maximum confidence with minimum execution time
**Strategy**:
1. **Unit tests** catch logic errors early (fast feedback)
2. **Integration tests** catch interface errors (moderate confidence)
3. **E2E tests** catch workflow errors (high confidence, sparingly)
**Example Calculation**:
```
Project with 1000 total tests:
Option A (Inverted Pyramid - BAD):
- 100 unit (10%) × 5ms = 0.5s
- 200 integration (20%) × 100ms = 20s
- 700 E2E (70%) × 10s = 7000s (116 minutes)
Total: ~117 minutes
Option B (Proper Pyramid - GOOD):
- 700 unit (70%) × 5ms = 3.5s
- 200 integration (20%) × 100ms = 20s
- 100 E2E (10%) × 10s = 1000s (16.6 minutes)
Total: ~17 minutes
Result: Option B is 7x faster with similar confidence
```
## Pyramid Variations
### Testing Trophy (React/Frontend Focus)
Promoted by Kent C. Dodds for frontend applications:
```
       /\
      /  \         E2E (Few)
     /----\
    /      \
   /--------\      Integration (Most - Component + API)
  /----------\
 /------------\
/    Static    \   Static Analysis (Linting, TypeScript)
```
**Key Difference**: Emphasizes integration tests over unit tests for UI components
**Rationale**:
- UI components often have little isolated logic
- Integration tests (render + interactions) provide more value
- Static analysis (TypeScript) catches many bugs that unit tests would otherwise have to cover
**Best For**: React, Vue, Svelte applications with component-based architecture
### Testing Diamond (Microservices)
```
       /\
      /  \         E2E (Few)
     /----\
    /------\       Contract Tests (More)
   /--------\
  /----------\     Integration (More)
 /------------\
/     Unit     \   Unit (Slightly Less)
```
**Key Difference**: Contract testing layer between integration and E2E
**Rationale**:
- Service boundaries are critical
- Contract tests validate interfaces without E2E overhead
- Reduces reliance on slow E2E tests
**Best For**: Microservices, service-oriented architectures
### Testing Honeycomb (GUI-Heavy Applications)
Promoted by Spotify for GUI applications:
```
     /----\
    /      \          E2E (Few, Critical Paths)
   /--------\
  /----------\        Integration (Component + API - MAJORITY)
 /------------\
/     Unit     \      Unit (Less - Logic Only)
```
**Key Difference**: Most tests at integration level (components with real interactions)
**Rationale**:
- GUI logic often trivial in isolation
- Component integration tests catch most bugs
- E2E tests for critical paths only
**Best For**: Desktop applications, complex UI applications
## Anti-Patterns
### The Ice Cream Cone (Inverted Pyramid)
**Description**:
```
/------------\     E2E (Too Many)
  /--------\       Integration
    /----\         Unit (Too Few)
     /  \
```
**Symptoms**:
- CI/CD takes 30+ minutes
- Tests fail intermittently (flaky)
- Hard to identify root cause of failures
- Developers avoid running full test suite locally
**Causes**:
- "E2E tests are more realistic" mindset
- Lack of unit testing discipline
- Poor test architecture planning
**Solution**:
- Refactor E2E tests into integration + unit tests
- Keep E2E tests for critical paths only
- Add unit tests for business logic
### The Testing Hourglass
**Description**:
```
/------------\     E2E (Too Many)
     /--\          Integration (Too Few)
    /----\         Unit (Too Many)
   /      \
```
**Symptoms**:
- Unit tests pass, E2E tests pass, but integration bugs slip through
- Issues only found in production
- Missing validation of component boundaries
**Causes**:
- Over-focus on unit and E2E
- Neglecting integration layer
- Poor component boundary design
**Solution**:
- Add integration tests for API endpoints
- Add integration tests for database operations
- Test service-to-service interactions
### The Testing Rectangle (Flat Distribution)
**Description**:
```
/------------------\ E2E (33%)
/------------------\ Integration (33%)
/------------------\ Unit (33%)
```
**Symptoms**:
- Test suite moderately slow
- Unclear when to write which test type
- Redundant coverage (same scenario tested at all levels)
**Causes**:
- No clear testing strategy
- "Test everything everywhere" approach
- Lack of pyramid awareness
**Solution**:
- Define clear boundaries for each test type
- Remove redundant tests
- Focus each test type on its strengths
## Balancing Strategies
### Risk-Based Test Distribution
**High-Risk Features** (payment, authentication, data integrity):
- 80% unit tests (thorough logic validation)
- 15% integration tests (API/DB validation)
- 5% E2E tests (critical paths)
**Medium-Risk Features** (user profiles, settings):
- 70% unit tests
- 25% integration tests
- 5% E2E tests
**Low-Risk Features** (UI polish, non-critical features):
- 60% unit tests
- 30% integration tests
- 10% E2E tests (or none)
### Speed-Driven Distribution
**Fast Feedback Required** (tight development loops):
- 80% unit tests (instant feedback)
- 15% integration tests (run on commit)
- 5% E2E tests (run before merge)
**Confidence-Driven** (critical production systems):
- 60% unit tests
- 30% integration tests
- 10% E2E tests (comprehensive coverage)
### Team Size Considerations
**Small Team (1-5 developers)**:
- Focus on unit and integration tests
- Minimal E2E tests (manual QA acceptable)
- Fast iteration more important
**Medium Team (5-20 developers)**:
- Balanced pyramid (70/20/10)
- E2E tests for critical paths
- Automated CI/CD essential
**Large Team (20+ developers)**:
- Strict pyramid enforcement (avoid E2E explosion)
- Contract testing for service boundaries
- Parallel test execution critical
### Monitoring and Adjusting
**Measure Test Suite Health**:
```bash
# Test execution time
time npm test
# Flakiness rate
(failed_runs / total_runs) * 100
# Coverage by test type
vitest --coverage --reporter=json
```
**Red Flags**:
- Unit tests taking >5 minutes
- Integration tests taking >10 minutes
- E2E tests taking >30 minutes
- Flakiness rate >5%
- Coverage dropping below thresholds
**Rebalancing Actions**:
1. Profile slow tests (identify outliers)
2. Refactor slow unit tests (remove dependencies)
3. Move slow E2E tests to integration tests
4. Parallelize test execution
5. Remove redundant tests
## Summary
**Key Takeaways**:
1. **The pyramid shape is intentional**: More fast tests, fewer slow tests
2. **Distribution guidelines are starting points**: Adjust based on your context (monolith vs. microservices, frontend vs. backend)
3. **Speed matters**: Fast tests enable continuous testing and faster development
4. **Confidence matters too**: Balance speed with comprehensive coverage
5. **Measure and adjust**: Monitor test suite health and rebalance as needed
**Next Steps**:
- Audit your current test distribution
- Identify anti-patterns (ice cream cone, hourglass)
- Refactor toward pyramid shape
- Establish clear guidelines for test type selection
- Monitor and maintain balance over time
```
### references/decision-tree.md
```markdown
# Test Type Selection Decision Tree
## Table of Contents
1. [Primary Decision Framework](#primary-decision-framework)
2. [Feature-Based Decision Making](#feature-based-decision-making)
3. [Architecture-Driven Decisions](#architecture-driven-decisions)
4. [Edge Case Scenarios](#edge-case-scenarios)
5. [Multi-Level Testing](#multi-level-testing)
## Primary Decision Framework
### The Core Decision Tree
```
START: Need to test [feature/function/component]
Q1: What am I testing?
├─ Pure function/business logic → Go to Q2
├─ API endpoint → Go to Q3
├─ Database operation → Go to Q4
├─ UI component → Go to Q5
├─ User workflow → Go to Q6
└─ Service integration → Go to Q7
Q2: Pure Function/Business Logic
├─ Has simple inputs/outputs?
│ └─ YES → Unit Test (vitest, pytest, testing)
├─ Has many edge cases?
│ └─ YES → Unit Test + Property-Based Test (fast-check, hypothesis, proptest)
└─ Complex algorithm?
└─ YES → Unit Test + Snapshot Test
Q3: API Endpoint
├─ Simple CRUD operation?
│ └─ YES → Integration Test (database + endpoint)
├─ Complex business logic?
│ └─ YES → Unit Test (logic) + Integration Test (endpoint)
├─ External API calls?
│ └─ YES → Integration Test with Mocked External API (MSW, pytest-httpserver)
└─ Critical to user workflow?
└─ YES → Integration Test + E2E Test
Q4: Database Operation
├─ Repository method (CRUD)?
│ └─ YES → Integration Test (real test database)
├─ Complex query?
│ └─ YES → Integration Test (validate query correctness)
└─ Transaction handling?
└─ YES → Integration Test (test rollback, commit)
Q5: UI Component
├─ Stateless presentation?
│ └─ YES → Unit Test (rendering) or Snapshot Test
├─ Interactive with state?
│ └─ YES → Integration Test (component + state management)
├─ Data fetching?
│ └─ YES → Integration Test (component + API mocking)
└─ Complex user interaction?
└─ YES → E2E Test (user workflow)
Q6: User Workflow
├─ Critical path (login, checkout, payment)?
│ └─ YES → E2E Test (full stack)
├─ Multi-step process?
│ └─ YES → E2E Test for happy path + Integration Tests for edge cases
└─ Cross-browser required?
└─ YES → E2E Test with Playwright (multiple browsers)
Q7: Service Integration (Microservices)
├─ Service-to-service communication?
│ └─ YES → Contract Test (Pact - consumer + provider)
├─ Message queue/event bus?
│ └─ YES → Integration Test (publish + consume)
└─ API gateway?
└─ YES → Integration Test (routing + transformation)
```
## Feature-Based Decision Making
### Authentication Features
| Feature | Test Type | Tools | Rationale |
|---------|-----------|-------|-----------|
| Password hashing | Unit | Vitest, pytest | Pure function, deterministic |
| Token generation | Unit + Property-Based | fast-check, hypothesis | Many edge cases |
| Login endpoint | Integration | Supertest, FastAPI TestClient | Validates DB + auth logic |
| Login flow (UI) | E2E | Playwright | Critical user journey |
| OAuth callback | Integration | Mock OAuth provider | External dependency |
| Session validation | Unit | Standard test framework | Logic-only validation |
### E-Commerce Features
| Feature | Test Type | Tools | Rationale |
|---------|-----------|-------|-----------|
| Product price calculation | Unit | Vitest, pytest | Pure business logic |
| Add to cart | Integration | Component + state test | State management + UI |
| Checkout flow | E2E | Playwright | Critical revenue path |
| Payment processing | Integration + E2E | Mock payment API, full flow | Critical + external API |
| Order confirmation | Integration | Email mock, DB validation | Multiple system interaction |
| Inventory update | Integration | Real DB, transaction test | Data integrity critical |
### Data Processing Features
| Feature | Test Type | Tools | Rationale |
|---------|-----------|-------|-----------|
| Data transformation | Unit + Property-Based | fast-check, hypothesis | Complex logic, edge cases |
| File parsing | Unit + Snapshot | Standard test framework | Consistent output |
| Data validation | Unit + Property-Based | Validation-specific | Many input variations |
| Batch processing | Integration | Real DB, test data | DB interaction, performance |
| API data fetch | Integration | MSW, pytest-httpserver | External dependency |
| Report generation | Integration + Snapshot | DB + output capture | Complex query + formatting |
## Architecture-Driven Decisions
### Monolithic Application
**Testing Strategy**:
- **70% Unit Tests**: Business logic, utilities, helpers
- **20% Integration Tests**: API endpoints, database operations
- **10% E2E Tests**: Critical user journeys
**Decision Factors**:
- Single deployment unit → More integration tests acceptable
- Shared database → Integration tests validate data consistency
- Tightly coupled components → E2E tests for critical paths
**Example**:
```
Rails/Django Monolith:
- Unit: Model methods, service objects, helpers
- Integration: Controller/view actions, database queries
- E2E: User signup, checkout, admin workflows
```
### Microservices Architecture
**Testing Strategy**:
- **60% Unit Tests**: Service-specific logic
- **30% Integration/Contract Tests**: Service boundaries, API contracts
- **10% E2E Tests**: User-facing workflows only
**Decision Factors**:
- Distributed services → Contract tests essential
- Service boundaries → Integration tests for each service
- Avoid E2E test explosion → Focus on critical user journeys
**Example**:
```
Microservices E-Commerce:
- Unit: Product service logic, pricing calculations
- Contract: Product service → Cart service interface
- Integration: Each service's API endpoints
- E2E: Checkout flow (crosses all services)
```
### Serverless/Cloud Functions
**Testing Strategy**:
- **70% Unit Tests**: Function logic (handler + business logic)
- **25% Integration Tests**: Cloud service integration (S3, DynamoDB)
- **5% E2E Tests**: Complete workflows (rare, expensive)
**Decision Factors**:
- Stateless functions → Unit test handler logic
- Cloud service dependencies → Integration tests with emulators
- Event-driven → Integration tests for event handling
**Example**:
```
AWS Lambda Functions:
- Unit: Handler logic, data transformation
- Integration: DynamoDB operations (use LocalStack/DynamoDB Local)
- E2E: Complete user action (API Gateway → Lambda → DynamoDB)
```
### Frontend Single-Page Application (SPA)
**Testing Strategy**:
- **50% Unit Tests**: Utility functions, hooks, pure logic
- **40% Integration Tests**: Component + state + data fetching
- **10% E2E Tests**: Critical user flows
**Decision Factors**:
- Component-based → Integration tests more valuable than unit
- State management → Integration tests validate state transitions
- User interactions → E2E for complete workflows
**Example**:
```
React/Vue SPA:
- Unit: Hooks, utilities, formatters
- Integration: Component with Redux/Vuex, API mocking (MSW)
- E2E: Login → Dashboard → User action
```
## Edge Case Scenarios
### Complex Business Rules
**Scenario**: Tax calculation with multiple jurisdictions, exemptions, special rates
**Recommended Approach**:
1. **Unit Tests**: Core calculation logic with fixtures
2. **Property-Based Tests**: Generate random scenarios to find edge cases
3. **Integration Tests**: Full tax calculation API endpoint with database
**Example**:
```typescript
// Unit test: Known scenarios
test('calculates California sales tax', () => {
  expect(calculateTax(100, 'CA')).toBe(107.25)
})

// Property-based: Edge cases
fc.assert(
  fc.property(fc.float(), fc.string(), (amount, state) => {
    const tax = calculateTax(amount, state)
    return tax >= amount // Tax never reduces price
  })
)

// Integration: Full endpoint
test('POST /calculate-tax returns correct tax', async () => {
  const response = await request(app)
    .post('/calculate-tax')
    .send({ amount: 100, state: 'CA', items: [...] })
  expect(response.body.totalTax).toBe(7.25)
})
```
### External API Integration
**Scenario**: Payment processing via Stripe, shipping via FedEx API
**Recommended Approach**:
1. **Unit Tests**: API client logic (request formatting, response parsing)
2. **Integration Tests**: Mock external API (MSW, VCR.py)
3. **E2E Tests**: Sandbox environment (Stripe test mode)
**Example**:
```python
# Unit test: Request formatting
def test_stripe_charge_request_format():
    request = build_stripe_charge_request(amount=1000, currency='usd')
    assert request['amount'] == 1000
    assert request['currency'] == 'usd'

# Integration test: Mock Stripe API
@pytest.fixture
def mock_stripe(httpserver):
    httpserver.expect_request('/v1/charges').respond_with_json({
        'id': 'ch_test123',
        'status': 'succeeded'
    })
    return httpserver

def test_charge_customer(mock_stripe):
    charge = stripe_service.charge_customer(1000, 'usd', 'tok_visa')
    assert charge['status'] == 'succeeded'

# E2E test: Stripe test mode
def test_checkout_flow_with_payment(page):
    page.goto('/checkout')
    page.fill('[data-testid="card-number"]', '4242424242424242')
    page.click('[data-testid="submit-payment"]')
    expect(page.locator('[data-testid="order-confirmation"]')).to_be_visible()
```
### Real-Time Features
**Scenario**: WebSocket chat, live dashboards, collaborative editing
**Recommended Approach**:
1. **Unit Tests**: Message parsing, state updates
2. **Integration Tests**: WebSocket connection, message handling
3. **E2E Tests**: Multi-client interaction (if critical)
**Example**:
```typescript
// Unit test: Message parsing
test('parses chat message', () => {
  const parsed = parseMessage('{"type": "chat", "text": "Hello"}')
  expect(parsed.type).toBe('chat')
  expect(parsed.text).toBe('Hello')
})

// Integration test: WebSocket handling
test('handles incoming message', async () => {
  const ws = new WebSocket('ws://localhost:3000')
  ws.send(JSON.stringify({ type: 'chat', text: 'Hi' }))
  const response = await waitForMessage(ws)
  expect(response.type).toBe('chat')
})

// E2E test: Multi-client chat
test('messages appear across clients', async ({ browser }) => {
  const page1 = await browser.newPage()
  const page2 = await browser.newPage()
  await page1.goto('/chat')
  await page2.goto('/chat')
  await page1.fill('[data-testid="message-input"]', 'Hello from user 1')
  await page1.click('[data-testid="send-button"]')
  await expect(page2.locator('text=Hello from user 1')).toBeVisible()
})
```
### Background Jobs/Workers
**Scenario**: Email sending, report generation, data synchronization
**Recommended Approach**:
1. **Unit Tests**: Job logic (data preparation, validation)
2. **Integration Tests**: Job execution with real queue (Redis, RabbitMQ)
3. **E2E Tests**: Trigger job, verify outcome (if user-visible)
**Example**:
```python
# Unit test: Job logic
def test_prepare_email_data():
    data = prepare_email_data(user_id=123)
    assert data['to'] == '[email protected]'
    assert 'subject' in data

# Integration test: Job execution
def test_send_email_job(redis_queue):
    job = send_email_job.delay(user_id=123)
    job.wait(timeout=5)
    assert job.status == 'SUCCESS'
    assert job.result['sent'] is True

# E2E test: User trigger + outcome
def test_user_receives_confirmation_email(page, email_client):
    page.goto('/signup')
    page.fill('[name="email"]', '[email protected]')
    page.click('[type="submit"]')
    # Wait for background job
    email = email_client.wait_for_email('[email protected]', timeout=10)
    assert 'Welcome' in email.subject
```
## Multi-Level Testing
### When to Test at Multiple Levels
Some features benefit from testing at multiple levels for comprehensive coverage.
**Example: Password Validation**
```typescript
// Level 1: Unit Test (validation logic)
describe('validatePassword', () => {
  test('rejects passwords shorter than 8 characters', () => {
    expect(validatePassword('short')).toBe(false)
  })
  test('requires at least one number', () => {
    expect(validatePassword('noNumbers')).toBe(false)
    expect(validatePassword('hasNumber1')).toBe(true)
  })
})

// Level 2: Integration Test (API endpoint)
describe('POST /signup', () => {
  test('returns 400 for weak password', async () => {
    const response = await request(app)
      .post('/signup')
      .send({ email: '[email protected]', password: 'weak' })
    expect(response.status).toBe(400)
    expect(response.body.error).toContain('password')
  })
})

// Level 3: E2E Test (user workflow)
test('signup form shows password requirements', async ({ page }) => {
  await page.goto('/signup')
  await page.fill('[name="password"]', 'weak')
  await page.click('[type="submit"]')
  await expect(page.locator('.error')).toContainText('at least 8 characters')
})
```
**Rationale**:
- **Unit test**: Validates core logic quickly (many edge cases)
- **Integration test**: Validates API contract and error responses
- **E2E test**: Validates user experience and error messaging
**Trade-off**: More comprehensive coverage vs. test suite complexity. Use multi-level testing for:
- Critical features (authentication, payment)
- Complex business rules (tax calculation, pricing)
- User-facing validation (forms, input requirements)
## Summary
**Key Decision Factors**:
1. **Scope of test**: Single function → Unit, Multiple components → Integration, Full workflow → E2E
2. **Dependencies**: None → Unit, Database/API → Integration, Full stack → E2E
3. **Criticality**: High → Multi-level testing, Medium → Integration + Unit, Low → Unit only
4. **Architecture**: Monolith → More integration, Microservices → More contract, SPA → More component integration
**Quick Reference Table**:
| What | Dependencies | Critical? | Test Type |
|------|--------------|-----------|-----------|
| Pure function | None | Any | Unit |
| API endpoint | Database | Yes | Integration + E2E |
| API endpoint | Database | No | Integration |
| UI component | State management | No | Integration |
| User workflow | Full stack | Yes | E2E |
| Service interface | Other services | Yes | Contract + Integration |
**Next Steps**:
- Use this decision tree when planning tests for new features
- Review existing tests and identify misplaced test types
- Refactor tests to appropriate levels for better speed/confidence balance
```
### references/unit-testing-patterns.md
```markdown
# Unit Testing Patterns
This reference provides detailed unit testing patterns across TypeScript, Python, Go, and Rust.
See `examples/typescript/vitest-unit.test.ts` for complete TypeScript unit testing examples.
See `examples/python/pytest_unit_test.py` for complete Python unit testing examples.
## Key Patterns
**Pure Function Testing**: Test functions with no side effects.
**Parametrized Testing**: Test multiple inputs efficiently.
**Fixture Usage**: Reusable test data and setup.
**Edge Case Testing**: Handle boundary conditions.
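A minimal sketch of parametrized testing with Vitest's `test.each` (the `clamp` function and its module are hypothetical):

```typescript
import { test, expect } from 'vitest'
import { clamp } from './math' // hypothetical pure function under test

// Parametrized test: one table, many cases
test.each([
  { value: 5, min: 0, max: 10, expected: 5 },   // inside the range
  { value: -3, min: 0, max: 10, expected: 0 },  // below the lower bound
  { value: 42, min: 0, max: 10, expected: 10 }, // above the upper bound
])('clamp($value, $min, $max) -> $expected', ({ value, min, max, expected }) => {
  expect(clamp(value, min, max)).toBe(expected)
})
```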
For complete implementation details and working code, refer to the examples directory.
```
### references/integration-testing-patterns.md
```markdown
# Integration Testing Patterns
This reference provides detailed integration testing patterns including API testing, database integration, and service communication.
See `examples/typescript/vitest-integration.test.ts` for complete TypeScript integration testing examples with MSW.
## Key Patterns
**API Integration Testing**: Test endpoints with real database, mocked external APIs.
**Database Integration**: Use ephemeral databases (Docker, in-memory).
**Service Communication**: Test component interactions.
**Request/Response Validation**: Verify API contracts.
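A minimal sketch of API integration testing with Supertest (the Express-style `app`, the `/api/users` route, and its response shape are hypothetical):

```typescript
import { describe, test, expect } from 'vitest'
import request from 'supertest'
import { app } from './app' // hypothetical app backed by an ephemeral test database

describe('POST /api/users', () => {
  test('creates a user and returns 201', async () => {
    const response = await request(app)
      .post('/api/users')
      .send({ name: 'Test User' })

    expect(response.status).toBe(201)
    expect(response.body.id).toBeDefined()
  })
})
```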
For complete implementation details and working code, refer to the examples directory and `mocking-strategies.md`.
```
### references/e2e-testing-patterns.md
```markdown
# End-to-End Testing Patterns
This reference provides detailed E2E testing patterns with Playwright across multiple languages.
## Key Patterns
**User Workflow Testing**: Test complete user journeys.
**Cross-Browser Testing**: Validate across Chromium, Firefox, WebKit.
**Stable Selectors**: Use data-testid attributes.
**Retry Logic**: Handle network flakiness.
**Fixtures for Authentication**: Reuse authenticated state.
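A minimal sketch of the authentication-fixture pattern with Playwright (the login selectors, credentials, and `auth.json` path are hypothetical):

```typescript
import { test as setup, expect } from '@playwright/test'

// Run once, then reuse the saved state via `storageState: 'auth.json'` in the Playwright config
setup('authenticate', async ({ page }) => {
  await page.goto('/login')
  await page.getByLabel('Email').fill('[email protected]')
  await page.getByLabel('Password').fill('test-password')
  await page.getByRole('button', { name: 'Sign in' }).click()
  await expect(page.getByTestId('dashboard')).toBeVisible()

  // Persist cookies and local storage for later tests to reuse
  await page.context().storageState({ path: 'auth.json' })
})
```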
## Recommended Tool: Playwright
Playwright is the recommended E2E testing tool for 2025:
- Cross-browser support (Chrome, Firefox, Safari)
- Fast and reliable with auto-wait
- Multi-language support (TypeScript/JavaScript, Python, Java, .NET; community bindings exist for Go)
- Built-in tooling (codegen, trace viewer)
For complete Playwright examples, see the examples directory.
```
### references/contract-testing.md
```markdown
# Contract Testing for Microservices
## Table of Contents
1. [Overview](#overview)
2. [Consumer-Driven Contracts](#consumer-driven-contracts)
3. [Pact Framework](#pact-framework)
4. [Implementation Examples](#implementation-examples)
## Overview
Contract testing validates service interfaces without requiring full end-to-end integration. This is critical for microservices architectures where services evolve independently.
### Why Contract Testing?
**Traditional Integration Testing Problems**:
- Requires all services running
- Slow feedback loops
- Brittle tests (multiple failure points)
- Hard to isolate issues
**Contract Testing Benefits**:
- Fast feedback (no full integration required)
- Independent service testing
- Clear interface contracts
- Prevents breaking changes
## Consumer-Driven Contracts
**Concept**: Consumers define expected contracts, providers verify they meet them.
**Workflow**:
1. Consumer writes contract test (expected request/response)
2. Contract published to Pact Broker
3. Provider verifies it can fulfill contract
4. Both services deploy independently
## Pact Framework
Pact is the industry-standard tool for consumer-driven contract testing.
**Supported Languages**: TypeScript, Python, Go, Rust, Java, .NET, Ruby
### TypeScript Consumer Example
```typescript
import { PactV3, MatchersV3 } from '@pact-foundation/pact'

const provider = new PactV3({
  consumer: 'UserServiceClient',
  provider: 'UserService'
})

test('fetch user by ID', async () => {
  provider
    .given('user 123 exists')
    .uponReceiving('a request for user 123')
    .withRequest({
      method: 'GET',
      path: '/users/123'
    })
    .willRespondWith({
      status: 200,
      body: {
        id: '123',
        name: MatchersV3.string(),
        email: MatchersV3.string()
      }
    })

  await provider.executeTest(async (mockServer) => {
    const user = await fetchUser('123', mockServer.url)
    expect(user.name).toBeDefined()
  })
})
```
### Python Provider Verification
```python
from pact import Verifier

verifier = Verifier(
    provider='UserService',
    provider_base_url='http://localhost:8000'
)

success = verifier.verify_pacts(
    './pacts/UserServiceClient-UserService.json',
    provider_states_setup_url='http://localhost:8000/_pact/provider_states'
)
assert success == 0
```
## Implementation Examples
For complete working examples, refer to:
- Pact official documentation: https://docs.pact.io
- Language-specific guides in the Pact repository
## Summary
Contract testing is essential for microservices:
- Faster than E2E tests
- Independent service development
- Clear interface contracts
- Prevents breaking changes
Use Pact for consumer-driven contract testing across all major languages.
```
### references/test-data-strategies.md
```markdown
# Test Data Management Strategies
## Table of Contents
1. [Overview](#overview)
2. [Fixtures (Static Data)](#fixtures-static-data)
3. [Factories (Generated Data)](#factories-generated-data)
4. [Property-Based Testing](#property-based-testing)
5. [Database Seeding](#database-seeding)
6. [Snapshot Testing](#snapshot-testing)
7. [Strategy Selection Guide](#strategy-selection-guide)
## Overview
Test data management is critical for reliable, maintainable tests. Choose the right strategy based on your testing needs.
### The Four Primary Strategies
| Strategy | Deterministic | Variety | Speed | Best For |
|----------|---------------|---------|-------|----------|
| **Fixtures** | ✅ High | ❌ Low | ⚡ Fast | Known scenarios, regression tests |
| **Factories** | ⚠️ Medium | ✅ High | ⚡ Fast | Flexible data, integration tests |
| **Property-Based** | ❌ Low | ✅ Very High | 🐌 Slow | Edge cases, complex algorithms |
| **Snapshots** | ✅ High | ❌ Low | ⚡ Fast | Output validation, UI components |
## Fixtures (Static Data)
### What Are Fixtures?
Predefined, static test data loaded before tests run.
**Characteristics**:
- Same data every test run
- Easy to understand and debug
- Version-controlled alongside tests
- Can become stale or outdated
### When to Use Fixtures
✅ **Use fixtures when**:
- Testing known scenarios (happy path, specific edge cases)
- Regression testing (ensure bug stays fixed)
- Need deterministic, reproducible tests
- Debugging failing tests (same data every run)
❌ **Avoid fixtures when**:
- Need variety (testing many scenarios)
- Data structure changes frequently
- Testing edge cases you haven't thought of
### TypeScript/JavaScript Example (Vitest)
**Inline fixtures**:
```typescript
import { describe, test, expect } from 'vitest'
import { calculateTotal } from './cart'

describe('calculateTotal', () => {
  test('calculates total for known items', () => {
    const items = [
      { id: 1, name: 'Widget', price: 10, quantity: 2 },
      { id: 2, name: 'Gadget', price: 5, quantity: 1 }
    ]
    expect(calculateTotal(items)).toBe(25)
  })
})
```
```
**External fixture files**:
```typescript
// fixtures/products.ts
export const sampleProducts = [
  { id: 1, name: 'Widget', price: 10, stock: 100 },
  { id: 2, name: 'Gadget', price: 5, stock: 50 }
]

// products.test.ts
import { sampleProducts } from './fixtures/products'

test('filters products by price', () => {
  const filtered = filterByPrice(sampleProducts, { max: 7 })
  expect(filtered).toHaveLength(1)
  expect(filtered[0].name).toBe('Gadget')
})
```
**JSON fixtures**:
```typescript
// fixtures/api-response.json
{
  "users": [
    { "id": 1, "name": "Alice", "email": "[email protected]" },
    { "id": 2, "name": "Bob", "email": "[email protected]" }
  ]
}

// api.test.ts
import apiResponse from './fixtures/api-response.json'

test('parses API response', () => {
  const users = parseUsers(apiResponse)
  expect(users).toHaveLength(2)
})
```
### Python Example (pytest)
**pytest fixtures** (function-scoped):
```python
import pytest

@pytest.fixture
def sample_user():
    """Provide a sample user for testing"""
    return {
        'id': 1,
        'name': 'Alice',
        'email': '[email protected]',
        'age': 30
    }

def test_user_validation(sample_user):
    assert validate_user(sample_user) is True

def test_user_serialization(sample_user):
    serialized = serialize_user(sample_user)
    assert serialized['name'] == 'Alice'
```
**Fixture scopes**:
```python
@pytest.fixture(scope='session')   # Once per test session
def database_connection():
    conn = create_connection()
    yield conn
    conn.close()

@pytest.fixture(scope='module')    # Once per test module
def sample_data():
    return load_sample_data()

@pytest.fixture(scope='function')  # Default: once per test function
def temp_file():
    file = create_temp_file()
    yield file
    file.delete()
```
**External fixture files**:
```python
# conftest.py (shared across tests)
import pytest
import json

@pytest.fixture
def api_response():
    with open('fixtures/api-response.json') as f:
        return json.load(f)

# test_api.py
def test_parse_response(api_response):
    users = parse_users(api_response)
    assert len(users) == 2
```
### Go Example
```go
package products_test

import (
    "testing"

    "github.com/stretchr/testify/assert"
)

// Fixture: Sample products
var sampleProducts = []Product{
    {ID: 1, Name: "Widget", Price: 10.0, Stock: 100},
    {ID: 2, Name: "Gadget", Price: 5.0, Stock: 50},
}

func TestFilterByPrice(t *testing.T) {
    filtered := FilterByPrice(sampleProducts, PriceFilter{Max: 7.0})
    assert.Len(t, filtered, 1)
    assert.Equal(t, "Gadget", filtered[0].Name)
}
```
### Rust Example
```rust
#[cfg(test)]
mod tests {
    use super::*;

    // Fixture: Sample products
    fn sample_products() -> Vec<Product> {
        vec![
            Product { id: 1, name: "Widget".to_string(), price: 10.0, stock: 100 },
            Product { id: 2, name: "Gadget".to_string(), price: 5.0, stock: 50 },
        ]
    }

    #[test]
    fn test_filter_by_price() {
        let products = sample_products();
        let filtered = filter_by_price(&products, PriceFilter { max: Some(7.0) });
        assert_eq!(filtered.len(), 1);
        assert_eq!(filtered[0].name, "Gadget");
    }
}
```
```
## Factories (Generated Data)
### What Are Factories?
Functions that generate test data with sensible defaults, allowing customization.
**Characteristics**:
- Generate fresh data for each test
- Customizable (override defaults)
- More flexible than fixtures
- Slightly less deterministic
### When to Use Factories
✅ **Use factories when**:
- Need variety in test data
- Testing multiple scenarios with similar structure
- Want to avoid data coupling between tests
- Need realistic but flexible data
❌ **Avoid factories when**:
- Need exact, reproducible data
- Debugging specific scenarios
- Simple tests with few data variations
### TypeScript/JavaScript Example
**Simple factory**:
```typescript
// factories/user.ts
export function createUser(overrides = {}) {
  return {
    id: Math.random().toString(),
    name: 'Test User',
    email: '[email protected]',
    createdAt: new Date(),
    ...overrides
  }
}

// user.test.ts
test('creates user with valid email', () => {
  const user = createUser({ email: '[email protected]' })
  expect(validateUser(user)).toBe(true)
})

test('rejects user with invalid email', () => {
  const user = createUser({ email: 'invalid' })
  expect(validateUser(user)).toBe(false)
})
```
```
**Factory with builder pattern**:
```typescript
class UserFactory {
  private data = {
    id: '',
    name: 'Test User',
    email: '[email protected]',
    role: 'user' as 'user' | 'admin'
  }

  withId(id: string) {
    this.data.id = id
    return this
  }

  withEmail(email: string) {
    this.data.email = email
    return this
  }

  asAdmin() {
    this.data.role = 'admin'
    return this
  }

  build() {
    return { ...this.data, id: this.data.id || Math.random().toString() }
  }
}

// Usage
test('admin user has elevated permissions', () => {
  const admin = new UserFactory().asAdmin().build()
  expect(hasPermission(admin, 'delete-users')).toBe(true)
})
```
```
### Python Example
**Factory function**:
```python
# factories.py
import uuid
from datetime import datetime

def create_user(**kwargs):
    """Factory for creating test users"""
    defaults = {
        'id': uuid.uuid4(),
        'name': 'Test User',
        'email': '[email protected]',
        'created_at': datetime.now()
    }
    return {**defaults, **kwargs}

# test_users.py
def test_user_validation():
    user = create_user(email='[email protected]')
    assert validate_user(user) is True

def test_invalid_email():
    user = create_user(email='invalid')
    assert validate_user(user) is False
```
**Factory Boy (advanced library)**:
```python
import factory
from models import User

class UserFactory(factory.Factory):
    class Meta:
        model = User

    id = factory.Sequence(lambda n: n)
    name = factory.Faker('name')
    email = factory.Faker('email')
    created_at = factory.Faker('date_time_this_year')

# Usage
def test_user_creation():
    user = UserFactory.create()
    assert user.email is not None

def test_admin_user():
    admin = UserFactory.create(role='admin')
    assert admin.role == 'admin'

def test_multiple_users():
    users = UserFactory.create_batch(10)
    assert len(users) == 10
```
### Go Example
```go
// Factory function
func NewTestUser(opts ...func(*User)) *User {
    user := &User{
        ID:    uuid.New().String(),
        Name:  "Test User",
        Email: "[email protected]",
        Role:  "user",
    }
    for _, opt := range opts {
        opt(user)
    }
    return user
}

// Option functions
func WithEmail(email string) func(*User) {
    return func(u *User) {
        u.Email = email
    }
}

func AsAdmin() func(*User) {
    return func(u *User) {
        u.Role = "admin"
    }
}

// Usage
func TestAdminPermissions(t *testing.T) {
    admin := NewTestUser(AsAdmin(), WithEmail("[email protected]"))
    assert.Equal(t, "admin", admin.Role)
}
```
```
## Property-Based Testing
### What Is Property-Based Testing?
Generate hundreds/thousands of random inputs to find edge cases you didn't think of.
**Characteristics**:
- Discovers unexpected edge cases
- Tests invariants (properties that should always hold)
- Slower than fixture/factory tests
- Failures can be hard to debug (but shrinking helps)
### When to Use Property-Based Testing
✅ **Use property-based testing when**:
- Testing complex algorithms (sorting, parsing, compression)
- Testing mathematical properties (commutativity, associativity)
- Validating parsers/serializers (round-trip properties)
- Finding edge cases in validation logic
❌ **Avoid property-based testing when**:
- Simple, straightforward functions
- Performance-critical test suites (property tests are slow)
- Need deterministic test data
### TypeScript/JavaScript Example (fast-check)
**Basic property test**:
```typescript
import fc from 'fast-check'

test('reversing twice returns original', () => {
  fc.assert(
    fc.property(fc.array(fc.integer()), (arr) => {
      expect(reverse(reverse(arr))).toEqual(arr)
    })
  )
})

test('sorting is idempotent', () => {
  fc.assert(
    fc.property(fc.array(fc.integer()), (arr) => {
      const sorted = sort(arr)
      expect(sort(sorted)).toEqual(sorted)
    })
  )
})
```
```
**Round-trip testing**:
```typescript
test('JSON serialization round-trip', () => {
  fc.assert(
    fc.property(
      fc.record({
        id: fc.integer(),
        name: fc.string(),
        active: fc.boolean()
      }),
      (obj) => {
        const serialized = JSON.stringify(obj)
        const deserialized = JSON.parse(serialized)
        expect(deserialized).toEqual(obj)
      }
    )
  )
})
```
```
### Python Example (hypothesis)
**Basic property test**:
```python
from hypothesis import given, strategies as st

@given(st.lists(st.integers()))
def test_reverse_reverse_is_identity(lst):
    assert reverse(reverse(lst)) == lst

@given(st.lists(st.integers()))
def test_sorting_is_idempotent(lst):
    sorted_list = sorted(lst)
    assert sorted(sorted_list) == sorted_list
```
**Complex strategies**:
```python
from hypothesis import given, strategies as st

# Generate users with constraints
user_strategy = st.builds(
    dict,
    id=st.integers(min_value=1),
    name=st.text(min_size=1, max_size=100),
    email=st.emails(),
    age=st.integers(min_value=18, max_value=120)
)

@given(user_strategy)
def test_user_validation(user):
    assert validate_user(user) is True
    assert user['age'] >= 18
```
**Shrinking example**:
```python
# hypothesis automatically shrinks failing cases to a minimal example
@given(st.lists(st.integers()))
def test_no_duplicates_after_dedup(lst):
    deduped = remove_duplicates(lst)
    assert len(deduped) == len(set(deduped))
# If this fails on [1, 2, 2, 3], hypothesis will shrink to [0, 0]
```
### Rust Example (proptest)
```rust
use proptest::prelude::*;
proptest! {
#[test]
    fn test_reverse_twice(v in prop::collection::vec(any::<i32>(), 0..100)) {
        // Reverse a clone twice and check we get the original back
        let mut reversed_twice = v.clone();
        reversed_twice.reverse();
        reversed_twice.reverse();
        prop_assert_eq!(reversed_twice, v);
    }
    #[test]
    fn test_add_is_commutative(a in any::<i32>(), b in any::<i32>()) {
        // Skip inputs that would overflow in debug builds
        prop_assume!(a.checked_add(b).is_some());
        prop_assert_eq!(a + b, b + a);
    }
}
```
## Database Seeding
### When to Use Database Seeding
Use database seeding for integration tests that require a specific, known database state.
**Characteristics**:
- Sets up known database state before tests
- Cleans up after tests (transaction rollback or truncate)
- Realistic data for integration testing
- Slower than in-memory tests
### TypeScript/JavaScript Example (Vitest + Postgres)
```typescript
import { beforeEach, afterEach, test, expect } from 'vitest'
import { db } from './database'
beforeEach(async () => {
// Seed database
await db.query('TRUNCATE users, products RESTART IDENTITY CASCADE')
await db.query(`
INSERT INTO users (name, email) VALUES
('Alice', '[email protected]'),
('Bob', '[email protected]')
`)
})
afterEach(async () => {
// Cleanup
await db.query('TRUNCATE users, products RESTART IDENTITY CASCADE')
})
test('fetches all users', async () => {
const users = await fetchUsers()
expect(users).toHaveLength(2)
})
```
### Python Example (pytest + SQLAlchemy)
```python
import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from models import Base, User
@pytest.fixture(scope='function')
def db_session():
"""Provide a clean database session for each test"""
engine = create_engine('postgresql://localhost/test_db')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()
# Seed data
session.add(User(name='Alice', email='[email protected]'))
session.add(User(name='Bob', email='[email protected]'))
session.commit()
yield session
session.close()
Base.metadata.drop_all(engine)
def test_fetch_users(db_session):
users = db_session.query(User).all()
assert len(users) == 2
```
## Snapshot Testing
### What Is Snapshot Testing?
Capture output and compare against saved snapshots to detect unintended changes.
**When to Use**:
- UI component rendering
- API response structures
- Generated code output
- Configuration files
**Caution**: Snapshot tests are easy to overuse. Review snapshot diffs and update them deliberately rather than regenerating them blindly.
### TypeScript/JavaScript Example (Vitest)
```typescript
import { test, expect } from 'vitest'
import { render } from '@testing-library/react'
import { UserCard } from './UserCard'
test('renders user card correctly', () => {
const user = { name: 'Alice', email: '[email protected]' }
const { container } = render(<UserCard user={user} />)
expect(container.firstChild).toMatchSnapshot()
})
test('generates correct config', () => {
const config = generateConfig({ apiUrl: 'https://api.example.com' })
expect(config).toMatchSnapshot()
})
```
### Python Example (pytest-snapshot)
```python
import json

def test_api_response_structure(snapshot):
    response = fetch_user_data(user_id=1)
    # pytest-snapshot compares string/bytes values against stored snapshot files
    snapshot.assert_match(json.dumps(response, indent=2), 'user_response.json')

def test_generated_html(snapshot):
    html = generate_email_template(user='Alice', subject='Welcome')
    snapshot.assert_match(html, 'welcome_email.html')
```
## Strategy Selection Guide
### Decision Matrix
| Scenario | Recommended Strategy | Rationale |
|----------|---------------------|-----------|
| Known happy path | Fixtures | Deterministic, easy to debug |
| Multiple similar scenarios | Factories | Flexible, avoid duplication |
| Complex algorithm edge cases | Property-Based | Discovers unexpected cases |
| Integration test setup | Database Seeding + Factories | Realistic data, flexible |
| UI component rendering | Snapshot | Detect unintended changes |
| Round-trip validation | Property-Based | Tests invariants |
| Regression test | Fixtures | Ensure specific bug stays fixed |
| Performance testing | Factories | Generate large datasets |
### Combining Strategies
**Example: User authentication testing**
```typescript
// Unit test: Use fixtures for known scenarios
test('validates correct password', () => {
  const user = { username: 'alice', passwordHash: 'hash-of-password123' }
  expect(validatePassword(user, 'password123')).toBe(true)
})
// Unit test: Use property-based testing for edge cases
test('never validates an empty password', () => {
  fc.assert(
    fc.property(fc.string(), (username) => {
      const user = createUser({ username, password: '' })
      // Property: an empty password must never validate
      expect(validatePassword(user, '')).toBe(false)
    })
  )
})
// Integration test: Use factories + database seeding
test('login endpoint returns token', async () => {
const user = await UserFactory.create({ password: 'password123' })
const response = await request(app)
.post('/login')
.send({ username: user.username, password: 'password123' })
expect(response.body.token).toBeDefined()
})
// E2E test: Use fixtures for reproducible scenarios
test('user can login and access dashboard', async ({ page }) => {
// Seed known user
await seedUser({ username: 'alice', password: 'password123' })
await page.goto('/login')
await page.fill('[name="username"]', 'alice')
await page.fill('[name="password"]', 'password123')
await page.click('[type="submit"]')
await expect(page.locator('h1')).toContainText('Dashboard')
})
```
## Summary
**Key Takeaways**:
1. **Fixtures**: Best for known scenarios, regression tests, debugging
2. **Factories**: Best for flexible data, integration tests, avoiding duplication
3. **Property-Based**: Best for complex algorithms, finding edge cases, validating invariants
4. **Database Seeding**: Best for integration tests requiring realistic data
5. **Snapshots**: Best for UI components, configuration files, detecting unintended changes
**Recommendations**:
- Start with fixtures for simple tests
- Use factories as tests grow and need flexibility
- Add property-based tests for critical logic
- Combine strategies for comprehensive coverage
- Update snapshots mindfully (not automatically)
```
### references/mocking-strategies.md
```markdown
# Mocking Strategies
## Table of Contents
1. [Overview](#overview)
2. [When to Mock vs. Use Real Dependencies](#when-to-mock-vs-use-real-dependencies)
3. [Mocking External APIs](#mocking-external-apis)
4. [Mocking Databases](#mocking-databases)
5. [Mocking Time and Randomness](#mocking-time-and-randomness)
6. [Test Doubles Taxonomy](#test-doubles-taxonomy)
7. [Anti-Patterns](#anti-patterns)
## Overview
Mocking is the practice of replacing real dependencies with controlled substitutes during testing. Effective mocking enables fast, isolated tests while maintaining confidence in system behavior.
### The Mocking Decision Matrix
| Dependency | Unit Test | Integration Test | E2E Test |
|------------|-----------|------------------|----------|
| **Database** | Mock (in-memory) | Real (test DB, Docker) | Real (staging DB) |
| **External API** | Mock (MSW, nock) | Mock (MSW, VCR) | Real (or staging) |
| **Filesystem** | Mock (in-memory FS) | Real (temp directory) | Real |
| **Time/Date** | Mock (freezeTime) | Mock (if deterministic) | Real (usually) |
| **Environment Variables** | Mock (setEnv) | Mock (test config) | Real (test env) |
| **Internal Services** | Mock (stub) | Real (or container) | Real |
| **Random Number Generator** | Mock (seeded) | Mock (deterministic) | Real |
| **External Queue** | Mock (in-memory) | Real (test queue) | Real |
## When to Mock vs. Use Real Dependencies
### Use Mock When
✅ **External API calls**:
- Third-party services (payment, shipping, email)
- Avoid rate limits and costs
- Ensure deterministic responses
- Test error scenarios (API down, timeout)
✅ **Slow operations**:
- File I/O (unless testing file handling specifically)
- Network requests
- Heavy computations (not core to test)
✅ **Non-deterministic behavior**:
- Current time (Date.now(), datetime.now())
- Random number generation
- UUIDs and unique IDs
✅ **External dependencies outside your control**:
- Third-party libraries with side effects
- System-level operations (process.exit, os.shutdown)
### Use Real Dependency When
✅ **Database operations** (integration tests):
- Use ephemeral database (Docker, in-memory SQLite)
- Test actual SQL queries
- Validate transaction behavior
- Ensure data integrity
✅ **Internal services** (integration tests):
- Test component interactions
- Validate interface contracts
- Ensure data flow correctness
✅ **File operations** (integration tests):
- Use temporary directories
- Test file parsing, generation
- Validate file handling edge cases
✅ **Core business logic**:
- Never mock the system under test
- Never mock what you're trying to validate (see the sketch below)
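To make the boundary concrete, here is a minimal sketch (hypothetical `PaymentGateway` and `CheckoutService` names) of a unit test that mocks only the external dependency while exercising the real business logic:
```typescript
import { test, expect, vi } from 'vitest'

// External boundary: a third-party payment API we do not control
interface PaymentGateway {
  charge(amountCents: number): Promise<{ id: string }>
}

// System under test: stays real, with the boundary injected
class CheckoutService {
  constructor(private gateway: PaymentGateway) {}

  async checkout(items: { priceCents: number }[]) {
    const total = items.reduce((sum, item) => sum + item.priceCents, 0)
    const receipt = await this.gateway.charge(total)
    return { total, receiptId: receipt.id }
  }
}

test('charges the summed total through the gateway', async () => {
  // Mock only the external dependency, never CheckoutService itself
  const gateway: PaymentGateway = { charge: vi.fn().mockResolvedValue({ id: 'r-1' }) }
  const service = new CheckoutService(gateway)
  const result = await service.checkout([{ priceCents: 500 }, { priceCents: 250 }])
  expect(result).toEqual({ total: 750, receiptId: 'r-1' })
  expect(gateway.charge).toHaveBeenCalledWith(750)
})
```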
## Mocking External APIs
### TypeScript/JavaScript: Mock Service Worker (MSW)
**Why MSW?**
- Intercepts requests at network level (realistic)
- Same handlers for tests and development
- Framework-agnostic (works with any test library)
- TypeScript support
**Basic Setup**:
```typescript
// mocks/handlers.ts
import { http, HttpResponse } from 'msw'
export const handlers = [
http.get('/api/users/:id', ({ params }) => {
return HttpResponse.json({
id: params.id,
name: 'Test User',
email: '[email protected]'
})
}),
http.post('/api/users', async ({ request }) => {
const body = await request.json()
return HttpResponse.json(
{ id: '123', ...body },
{ status: 201 }
)
})
]
// mocks/server.ts
import { setupServer } from 'msw/node'
import { handlers } from './handlers'
export const server = setupServer(...handlers)
// vitest.setup.ts
import { beforeAll, afterEach, afterAll } from 'vitest'
import { server } from './mocks/server'
beforeAll(() => server.listen())
afterEach(() => server.resetHandlers())
afterAll(() => server.close())
```
**Usage in Tests**:
```typescript
import { test, expect } from 'vitest'
import { server } from './mocks/server'
import { http, HttpResponse } from 'msw'
import { fetchUser, createUser } from './api'
test('fetches user data', async () => {
const user = await fetchUser('123')
expect(user.name).toBe('Test User')
})
test('handles API errors', async () => {
server.use(
http.get('/api/users/:id', () => {
return HttpResponse.json(
{ error: 'User not found' },
{ status: 404 }
)
})
)
await expect(fetchUser('999')).rejects.toThrow('User not found')
})
test('creates user', async () => {
const newUser = await createUser({ name: 'Alice', email: '[email protected]' })
expect(newUser.id).toBe('123')
expect(newUser.name).toBe('Alice')
})
```
**Advanced: Request Matching and Validation**:
```typescript
test('sends correct request body', async () => {
let capturedRequest: any
server.use(
http.post('/api/users', async ({ request }) => {
capturedRequest = await request.json()
return HttpResponse.json({ id: '123', ...capturedRequest })
})
)
await createUser({ name: 'Bob', email: '[email protected]' })
expect(capturedRequest).toEqual({
name: 'Bob',
email: '[email protected]'
})
})
```
### Python: pytest-httpserver
**Setup**:
```python
import pytest
from pytest_httpserver import HTTPServer
@pytest.fixture
def mock_api(httpserver: HTTPServer):
httpserver.expect_request('/api/users/123').respond_with_json({
'id': '123',
'name': 'Test User',
'email': '[email protected]'
})
return httpserver
def test_fetch_user(mock_api):
response = fetch_user('123', base_url=mock_api.url_for(''))
assert response['name'] == 'Test User'
```
**Error Simulation**:
```python
def test_api_error_handling(httpserver):
httpserver.expect_request('/api/users/999').respond_with_json(
{'error': 'User not found'},
status=404
)
with pytest.raises(UserNotFoundError):
fetch_user('999', base_url=httpserver.url_for(''))
```
### Python: VCR.py (Record/Replay)
**Record real API responses, replay in tests**:
```python
import vcr
@vcr.use_cassette('fixtures/vcr_cassettes/user_123.yaml')
def test_fetch_user():
# First run: Makes real API call, records response
# Subsequent runs: Replays recorded response
user = fetch_user('123')
assert user['name'] == 'Test User'
```
**Benefits**:
- Real API responses (realistic data)
- No mocking code needed
- Offline testing (no network required)
**Drawbacks**:
- Cassettes can become stale
- Doesn't test error scenarios easily
- Large responses bloat repository
### Go: httptest
**Standard library mocking**:
```go
func TestFetchUser(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"id": "123", "name": "Test User"}`))
}))
defer server.Close()
user, err := fetchUser("123", server.URL)
assert.NoError(t, err)
assert.Equal(t, "Test User", user.Name)
}
```
### Rust: mockito
```rust
use mockito::{mock, server_url};
#[test]
fn test_fetch_user() {
let _m = mock("GET", "/api/users/123")
.with_status(200)
.with_header("content-type", "application/json")
.with_body(r#"{"id": "123", "name": "Test User"}"#)
.create();
let user = fetch_user("123", &server_url()).unwrap();
assert_eq!(user.name, "Test User");
}
```
## Mocking Databases
### In-Memory SQLite (Fast Unit Tests)
**TypeScript/JavaScript (better-sqlite3)**:
```typescript
import Database from 'better-sqlite3'
import { beforeEach, test, expect } from 'vitest'
let db: Database.Database
beforeEach(() => {
db = new Database(':memory:')
db.exec(`
CREATE TABLE users (
id INTEGER PRIMARY KEY,
name TEXT,
email TEXT
)
`)
})
test('inserts user', () => {
const stmt = db.prepare('INSERT INTO users (name, email) VALUES (?, ?)')
const result = stmt.run('Alice', '[email protected]')
expect(result.changes).toBe(1)
})
```
**Python (SQLite)**:
```python
import sqlite3
import pytest
@pytest.fixture
def db():
conn = sqlite3.connect(':memory:')
conn.execute('''
CREATE TABLE users (
id INTEGER PRIMARY KEY,
name TEXT,
email TEXT
)
''')
yield conn
conn.close()
def test_insert_user(db):
cursor = db.execute('INSERT INTO users (name, email) VALUES (?, ?)',
('Alice', '[email protected]'))
assert cursor.rowcount == 1
```
### Docker Test Containers (Real Database Integration)
**TypeScript/JavaScript (testcontainers)**:
```typescript
import { PostgreSqlContainer, StartedPostgreSqlContainer } from '@testcontainers/postgresql'
import { beforeAll, afterAll, test, expect } from 'vitest'
import pg from 'pg'
let container: StartedPostgreSqlContainer
let client: pg.Client
beforeAll(async () => {
container = await new PostgreSqlContainer().start()
client = new pg.Client({ connectionString: container.getConnectionUri() })
await client.connect()
await client.query(`
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(255),
email VARCHAR(255)
)
`)
}, 30000)
afterAll(async () => {
await client.end()
await container.stop()
})
test('creates user in real postgres', async () => {
const result = await client.query(
'INSERT INTO users (name, email) VALUES ($1, $2) RETURNING id',
['Alice', '[email protected]']
)
expect(result.rows[0].id).toBeDefined()
})
```
**Python (testcontainers-python)**:
```python
from testcontainers.postgres import PostgresContainer
import psycopg2
import pytest
@pytest.fixture(scope='module')
def postgres():
with PostgresContainer("postgres:16") as container:
conn = psycopg2.connect(container.get_connection_url())
cursor = conn.cursor()
cursor.execute('''
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(255),
email VARCHAR(255)
)
''')
conn.commit()
yield conn
conn.close()
def test_create_user(postgres):
cursor = postgres.cursor()
cursor.execute('INSERT INTO users (name, email) VALUES (%s, %s) RETURNING id',
('Alice', '[email protected]'))
user_id = cursor.fetchone()[0]
assert user_id is not None
```
### Database Mocking Libraries
**TypeScript: mock-knex**:
```typescript
import knex from 'knex'
import mockKnex from 'mock-knex'
const db = knex({ client: 'pg' })
mockKnex.mock(db)
test('mocks database query', async () => {
const tracker = mockKnex.getTracker()
tracker.install()
tracker.on('query', (query) => {
query.response([{ id: 1, name: 'Alice' }])
})
const users = await db('users').select()
expect(users[0].name).toBe('Alice')
tracker.uninstall()
})
```
## Mocking Time and Randomness
### Mocking Time (Deterministic Tests)
**TypeScript/JavaScript: Vitest useFakeTimers**:
```typescript
import { test, expect, vi, beforeEach, afterEach } from 'vitest'
beforeEach(() => {
vi.useFakeTimers()
})
afterEach(() => {
vi.useRealTimers()
})
test('schedules task for future', () => {
const callback = vi.fn()
setTimeout(callback, 1000)
// Time hasn't passed yet
expect(callback).not.toHaveBeenCalled()
// Fast-forward 1 second
vi.advanceTimersByTime(1000)
expect(callback).toHaveBeenCalledOnce()
})
test('freezes time at a specific date', () => {
  const mockDate = new Date('2025-01-01T00:00:00Z')
  vi.setSystemTime(mockDate)
  expect(new Date()).toEqual(mockDate)
  // The mocked clock only moves when the fake timers are advanced explicitly
  vi.advanceTimersByTime(5000)
  expect(new Date().getTime()).toBe(mockDate.getTime() + 5000)
})
```
**Python: freezegun**:
```python
from freezegun import freeze_time
from datetime import datetime
@freeze_time("2025-01-01 00:00:00")
def test_time_frozen():
assert datetime.now().year == 2025
assert datetime.now().month == 1
@freeze_time("2025-01-01")
def test_date_calculation():
# Time is frozen, calculations are deterministic
expiry = calculate_expiry_date(days=30)
assert expiry.day == 31
```
**Python: unittest.mock (patch time)**:
```python
from unittest.mock import patch
from datetime import datetime
@patch('mymodule.datetime')
def test_current_time(mock_datetime):
mock_datetime.now.return_value = datetime(2025, 1, 1, 12, 0, 0)
result = get_current_timestamp()
assert result == '2025-01-01 12:00:00'
```
### Mocking Random Number Generation
**TypeScript/JavaScript**:
```typescript
import { test, expect, vi } from 'vitest'
test('generates predictable random numbers', () => {
const mockRandom = vi.spyOn(Math, 'random')
mockRandom.mockReturnValueOnce(0.5)
const result = generateId() // Uses Math.random()
expect(result).toBe('expected-id-for-0.5')
mockRandom.mockRestore()
})
```
**Python**:
```python
from unittest.mock import patch
@patch('random.randint')
def test_random_generation(mock_randint):
mock_randint.return_value = 42
result = generate_random_id()
assert result == 'id-42'
```
## Test Doubles Taxonomy
### Types of Test Doubles
**Dummy**: Passed but never used
```typescript
test('creates user without password validation', () => {
const dummyValidator = null // Never called
const user = createUser({ name: 'Alice' }, dummyValidator)
expect(user.name).toBe('Alice')
})
```
**Stub**: Returns predetermined values
```typescript
const stubDatabase = {
findUser: () => ({ id: '123', name: 'Alice' })
}
test('fetches user from stubbed database', () => {
const user = userService.getUser('123', stubDatabase)
expect(user.name).toBe('Alice')
})
```
**Spy**: Records calls and arguments
```typescript
test('calls email service with correct arguments', () => {
const emailSpy = vi.fn()
sendWelcomeEmail('[email protected]', emailSpy)
expect(emailSpy).toHaveBeenCalledWith({
to: '[email protected]',
subject: 'Welcome',
body: expect.any(String)
})
})
```
**Mock**: Stub + Spy (preset responses, verify calls)
```typescript
test('authenticates user and logs event', async () => {
const authMock = vi.fn().mockResolvedValue({ authenticated: true })
const logMock = vi.fn()
await loginUser('alice', 'password', authMock, logMock)
expect(authMock).toHaveBeenCalledWith('alice', 'password')
expect(logMock).toHaveBeenCalledWith('login', { user: 'alice' })
})
```
**Fake**: Working implementation (simplified)
```typescript
class FakeDatabase {
private data = new Map()
async save(id: string, value: any) {
this.data.set(id, value)
}
async find(id: string) {
return this.data.get(id)
}
}
test('saves and retrieves from fake database', async () => {
const db = new FakeDatabase()
await db.save('123', { name: 'Alice' })
const user = await db.find('123')
expect(user.name).toBe('Alice')
})
```
## Anti-Patterns
### Over-Mocking (Testing Implementation, Not Behavior)
**❌ Bad: Mocking everything**:
```typescript
test('bad: over-mocked', () => {
const mockAdd = vi.fn((a, b) => a + b)
const mockMultiply = vi.fn((a, b) => a * b)
const result = calculate(5, 3, mockAdd, mockMultiply)
expect(mockAdd).toHaveBeenCalledWith(5, 3)
expect(mockMultiply).toHaveBeenCalled()
// Not testing actual behavior, just mocks!
})
```
**✅ Good: Test behavior, mock external dependencies only**:
```typescript
test('good: test behavior', () => {
const result = calculate(5, 3)
expect(result).toBe(23) // Test actual result
})
```
### Mocking What You're Testing
**❌ Bad: Mocking the system under test**:
```typescript
test('bad: mocks the function being tested', () => {
const mockCalculate = vi.fn(() => 42)
expect(mockCalculate()).toBe(42) // Meaningless!
})
```
**✅ Good: Mock dependencies, test real implementation**:
```typescript
test('good: test real function with mocked dependencies', async () => {
const mockFetch = vi.fn().mockResolvedValue({ data: [1, 2, 3] })
const result = await processData(mockFetch)
expect(result.length).toBe(3)
})
```
### Brittle Mocks (Coupled to Implementation)
**❌ Bad: Mocks too specific to implementation**:
```typescript
test('bad: brittle mock', () => {
const mock = vi.fn()
processUser(mock)
expect(mock).toHaveBeenCalledWith('step1')
expect(mock).toHaveBeenCalledWith('step2')
expect(mock).toHaveBeenCalledWith('step3')
// Breaks if internal steps change
})
```
**✅ Good: Test outcomes, not internal steps**:
```typescript
test('good: test outcomes', () => {
const result = processUser()
expect(result.status).toBe('completed')
expect(result.errors).toHaveLength(0)
})
```
### Mock Leakage (Not Resetting Mocks)
**❌ Bad: Mocks persist across tests**:
```typescript
const mockFetch = vi.fn()
test('test 1', () => {
mockFetch.mockResolvedValue({ data: 'test1' })
// ...
})
test('test 2', () => {
// mockFetch still has behavior from test 1! 🐛
})
```
**✅ Good: Reset mocks between tests**:
```typescript
import { beforeEach, vi } from 'vitest'
const mockFetch = vi.fn()
beforeEach(() => {
  mockFetch.mockReset()
})
// Or enable resetting globally in the Vitest config, as sketched below
```
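Alternatively, resetting can be enabled once for the whole suite. A minimal sketch, assuming Vitest's `mockReset` config option:
```typescript
// vitest.config.ts
import { defineConfig } from 'vitest/config'

export default defineConfig({
  test: {
    // Reset mock implementations and recorded calls before every test,
    // so state from one test cannot leak into the next
    mockReset: true
  }
})
```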
## Summary
**Key Takeaways**:
1. **Mock external dependencies**: APIs, time, randomness
2. **Use real dependencies for integration tests**: Databases (ephemeral), internal services
3. **Never mock the system under test**: Test real behavior, not mocks
4. **Choose the right test double**: Dummy, stub, spy, mock, fake
5. **Avoid over-mocking**: Only mock what's necessary
6. **Reset mocks between tests**: Prevent test pollution
**Tools by Language**:
- **TypeScript/JavaScript**: MSW (APIs), Vitest mocks/spies, useFakeTimers
- **Python**: pytest-httpserver, VCR.py, freezegun, unittest.mock
- **Go**: httptest (stdlib), testify/mock
- **Rust**: mockito, mockall
**Next Steps**:
- Identify slow/flaky tests caused by external dependencies
- Replace with appropriate mocks (MSW for APIs, Docker for databases)
- Verify mocks are reset between tests
- Refactor over-mocked tests to test behavior, not implementation
```
### references/coverage-strategies.md
```markdown
# Coverage and Quality Metrics
## Table of Contents
1. [Meaningful Coverage Targets](#meaningful-coverage-targets)
2. [Coverage Tools](#coverage-tools)
3. [Mutation Testing](#mutation-testing)
4. [Beyond Line Coverage](#beyond-line-coverage)
5. [CI/CD Integration](#cicd-integration)
## Meaningful Coverage Targets
### Anti-Pattern: 100% Coverage Goal
**Problem**: Chasing 100% coverage leads to:
- Testing trivial code (getters/setters)
- Testing framework code
- False sense of security
- Wasted effort
**Better Approach**: Risk-based coverage targets
### Recommended Targets
| Code Type | Target | Rationale |
|-----------|--------|-----------|
| **Critical Business Logic** | 90%+ | Payment, auth, data integrity |
| **API Endpoints** | 80%+ | Public interfaces |
| **Utility Functions** | 70%+ | Commonly reused code |
| **UI Components** | 60%+ | Focus on logic, not markup |
| **Overall Project** | 70-80% | Balanced coverage |
### What NOT to Test
- Simple getters/setters
- Framework-generated code
- Third-party libraries
- Configuration files
- Trivial pass-through functions (contrasted in the sketch below)
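As a concrete contrast (hypothetical functions), the first sketch below is trivial pass-through code that adds little value under test, while the second contains branching business logic worth covering thoroughly:
```typescript
// Not worth a dedicated test: a pass-through with no logic of its own
export function getUserName(user: { name: string }): string {
  return user.name
}

// Worth thorough tests: branching business logic with real failure modes
export function applyDiscount(total: number, tier: 'standard' | 'gold'): number {
  if (total < 0) throw new Error('total must be non-negative')
  const rate = tier === 'gold' ? 0.2 : 0.05
  return Math.round(total * (1 - rate) * 100) / 100
}
```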
## Coverage Tools
### TypeScript/JavaScript
**Vitest Coverage** (recommended):
```bash
npm install -D @vitest/coverage-v8
vitest --coverage
```
**Configuration**:
```typescript
// vitest.config.ts
import { defineConfig } from 'vitest/config'

export default defineConfig({
  test: {
    coverage: {
      provider: 'v8',
      // 'json-summary' emits coverage-summary.json, used by the CI check below
      reporter: ['text', 'html', 'json-summary'],
      include: ['src/**/*.ts'],
      exclude: ['**/*.test.ts', '**/*.spec.ts'],
      thresholds: {
        lines: 70,
        functions: 70,
        branches: 70,
        statements: 70
      }
    }
  }
})
```
### Python
**pytest-cov**:
```bash
pip install pytest-cov
pytest --cov=src --cov-report=html --cov-report=term
```
**Configuration (.coveragerc)**:
```ini
[run]
omit = */tests/*, */migrations/*
[report]
exclude_lines =
pragma: no cover
def __repr__
raise AssertionError
raise NotImplementedError
```
### Go
**Built-in coverage**:
```bash
go test -cover ./...
go test -coverprofile=coverage.out ./...
go tool cover -html=coverage.out
```
### Rust
**cargo-tarpaulin**:
```bash
cargo install cargo-tarpaulin
cargo tarpaulin --out Html
```
## Mutation Testing
### What Is Mutation Testing?
Mutation testing validates test quality by introducing bugs (mutations) and checking if tests catch them.
**Example**:
```typescript
// Original code
function add(a, b) {
return a + b
}
// Mutation 1: Change + to -
function add(a, b) {
return a - b // If tests still pass, they're weak!
}
// Mutation 2: Change return value
function add(a, b) {
return 0 // If tests still pass, they're weak!
}
```
### Mutation Testing Tools
**TypeScript/JavaScript: Stryker Mutator**:
```bash
npm install -D @stryker-mutator/core @stryker-mutator/vitest-runner
npx stryker run
```
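A minimal configuration sketch for Stryker is shown below; the field names (`testRunner`, `mutate`, `reporters`, `coverageAnalysis`) follow Stryker's documented options, but verify them against the version you install. Limiting `mutate` to critical paths keeps run times manageable:
```javascript
// stryker.config.mjs (hypothetical source paths)
export default {
  testRunner: 'vitest',
  // Mutate only high-value business logic to keep runs fast
  mutate: ['src/payments/**/*.ts', 'src/auth/**/*.ts'],
  reporters: ['html', 'clear-text', 'progress'],
  coverageAnalysis: 'perTest',
}
```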
**Python: mutmut**:
```bash
pip install mutmut
mutmut run
```
**Rust: cargo-mutants**:
```bash
cargo install cargo-mutants
cargo mutants
```
### When to Use Mutation Testing
✅ **Use mutation testing for**:
- High-criticality code (payment, auth)
- Core business logic
- Validating test suite quality
❌ **Skip mutation testing for**:
- Simple CRUD operations
- UI components (slow, low value)
- Integration tests (too slow)
## Beyond Line Coverage
### Branch Coverage
Ensure all conditional branches are tested:
```typescript
function getUserStatus(age: number) {
if (age < 18) {
return 'minor'
} else {
return 'adult'
}
}
// Need tests for BOTH branches:
test('returns minor for age < 18', () => {
expect(getUserStatus(15)).toBe('minor')
})
test('returns adult for age >= 18', () => {
expect(getUserStatus(25)).toBe('adult')
})
```
### Function Coverage
Ensure all functions are called at least once.
### Statement Coverage
Ensure all statements are executed.
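These metrics catch different gaps. In the sketch below (hypothetical names), the line that defines the default `onError` callback executes on every call, so statement coverage counts it, yet function coverage reports the callback body as never run until a test makes the fetcher fail:
```typescript
import { test, expect } from 'vitest'

export async function fetchWithFallback(
  fetcher: () => Promise<string>,
  // The defining line below always executes, but the arrow body only
  // runs when fetcher actually throws
  onError: (e: unknown) => string = (e) => `error: ${String(e)}`
): Promise<string> {
  try {
    return await fetcher()
  } catch (e) {
    return onError(e)
  }
}

// Exercises only the happy path: function coverage flags the default onError
// arrow as uncovered (and branch coverage flags the untested catch path)
test('returns the fetched value', async () => {
  expect(await fetchWithFallback(async () => 'ok')).toBe('ok')
})
```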
## CI/CD Integration
### Enforce Coverage Thresholds
**GitHub Actions**:
```yaml
- name: Run tests with coverage
run: npm run test:coverage
- name: Check coverage threshold
run: |
COVERAGE=$(cat coverage/coverage-summary.json | jq '.total.lines.pct')
if (( $(echo "$COVERAGE < 70" | bc -l) )); then
echo "Coverage $COVERAGE% is below 70%"
exit 1
fi
```
### Track Coverage Trends
Use tools like Codecov or Coveralls to track coverage over time:
- Prevent coverage regressions
- Visualize coverage trends
- Comment coverage changes on PRs
## Summary
**Key Takeaways**:
1. **Don't chase 100% coverage**: Focus on meaningful coverage
2. **Risk-based targets**: Higher coverage for critical code
3. **Use mutation testing**: Validate test quality, not just coverage
4. **Enforce in CI/CD**: Prevent coverage regressions
5. **Track trends**: Monitor coverage over time
**Next Steps**:
- Set realistic coverage targets for your project
- Configure coverage tools in your test framework
- Consider mutation testing for critical paths
- Integrate coverage reporting into CI/CD
```