readiness-report
Evaluate how well a codebase supports autonomous AI development. Analyzes repositories across nine technical pillars (Style & Validation, Build System, Testing, Documentation, Dev Environment, Debugging & Observability, Security, Task Discovery, Product & Analytics) and five maturity levels. Use when users request `/readiness-report` or want to assess agent readiness, codebase maturity, or identify gaps preventing effective AI-assisted development.
Packaged view
This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.
Install command
npx @skill-hub/cli install aojdevstudio-finance-guru-readiness-report
Repository
Skill path: .agents/skills/readiness-report
Best for
Primary workflow: Research & Ops.
Technical facets: Full Stack, Data / AI, Security, Testing.
Target audience: everyone.
License: Unknown.
Original source
Catalog source: SkillHub Club.
Repository owner: AojdevStudio.
This is still a mirrored public skill entry. Review the repository before installing into production workflows.
What it helps with
- Install readiness-report into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/AojdevStudio/Finance-Guru before adding readiness-report to shared team environments
- Use readiness-report for development workflows
Works across
Favorites: 0.
Sub-skills: 0.
Aggregator: No.
Original source / Raw SKILL.md
---
name: readiness-report
description: Evaluate how well a codebase supports autonomous AI development. Analyzes repositories across nine technical pillars (Style & Validation, Build System, Testing, Documentation, Dev Environment, Debugging & Observability, Security, Task Discovery, Product & Analytics) and five maturity levels. Use when users request `/readiness-report` or want to assess agent readiness, codebase maturity, or identify gaps preventing effective AI-assisted development.
triggers:
- /readiness-report
---
# Agent Readiness Report
Evaluate how well a repository supports autonomous AI development by analyzing it across nine technical pillars and five maturity levels.
## Overview
Agent Readiness measures how prepared a codebase is for AI-assisted development. Poor feedback loops, missing documentation, or lack of tooling cause agents to waste cycles on preventable errors. This skill identifies those gaps and prioritizes fixes.
## Quick Start
The user will run `/readiness-report` to evaluate the current repository. The agent will then:
1. Scan the repository structure, CI configs, and tooling
2. Evaluate 81 criteria across 9 technical pillars
3. Determine the maturity level (L1-L5) using an 80% pass threshold per level
4. Provide prioritized recommendations
## Workflow
### Step 1: Run Repository Analysis
Execute the analysis script to gather signals from the repository:
```bash
python scripts/analyze_repo.py --repo-path .
```
This script checks for:
- Configuration files (.eslintrc, pyproject.toml, etc.)
- CI/CD workflows (.github/workflows/, .gitlab-ci.yml)
- Documentation (README, AGENTS.md, CONTRIBUTING.md)
- Test infrastructure (test directories, coverage configs)
- Security configurations (CODEOWNERS, .gitignore, secrets management)
### Step 2: Generate Report
After analysis, generate the formatted report:
```bash
python scripts/generate_report.py --analysis-file /tmp/readiness_analysis.json
```
### Step 3: Present Results
The report includes:
1. **Overall Score**: Pass rate percentage and maturity level achieved
2. **Level Progress**: Bar showing L1-L5 completion percentages
3. **Strengths**: Top-performing pillars with passing criteria
4. **Opportunities**: Prioritized list of improvements to implement
5. **Detailed Criteria**: Full breakdown by pillar showing each criterion status
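The level-progress bar (item 2) can be rendered as plain text. A minimal sketch, not the skill's actual report code; the function name and bar width are illustrative:

```python
def level_bar(level_scores: dict[int, float], width: int = 10) -> str:
    """Render per-level completion percentages as text bars."""
    lines = []
    for level in sorted(level_scores):
        pct = level_scores[level]
        filled = int(pct / 100 * width)  # number of '#' cells for this level
        lines.append(f"L{level} [{'#' * filled}{'-' * (width - filled)}] {pct:.0f}%")
    return "\n".join(lines)

# Hypothetical per-level scores, rendered as three bars
print(level_bar({1: 100.0, 2: 80.0, 3: 40.0}))
```

The real generator in `scripts/generate_report.py` may format this differently; the sketch only shows the idea of one bar per level.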
## Nine Technical Pillars
Each pillar addresses specific failure modes in AI-assisted development:
| Pillar | Purpose | Key Signals |
|--------|---------|-------------|
| **Style & Validation** | Catch bugs instantly | Linters, formatters, type checkers |
| **Build System** | Fast, reliable builds | Build docs, CI speed, automation |
| **Testing** | Verify correctness | Unit/integration tests, coverage |
| **Documentation** | Guide the agent | AGENTS.md, README, architecture docs |
| **Dev Environment** | Reproducible setup | Devcontainer, env templates |
| **Debugging & Observability** | Diagnose issues | Logging, tracing, metrics |
| **Security** | Protect the codebase | CODEOWNERS, secrets management |
| **Task Discovery** | Find work to do | Issue templates, PR templates |
| **Product & Analytics** | Error-to-insight loop | Error tracking, product analytics |
See `references/criteria.md` for the complete list of 81 criteria per pillar.
## Five Maturity Levels
| Level | Name | Description | Agent Capability |
|-------|------|-------------|------------------|
| L1 | Initial | Basic version control | Manual assistance only |
| L2 | Managed | Basic CI/CD and testing | Simple, well-defined tasks |
| L3 | Standardized | Production-ready for agents | Routine maintenance |
| L4 | Measured | Comprehensive automation | Complex features |
| L5 | Optimized | Full autonomous capability | End-to-end development |
**Level Progression**: To unlock a level, pass ≥80% of the criteria at that level and meet the same ≥80% bar at every previous level.
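The progression rule can be sketched in Python (a minimal illustration, not the skill's actual scoring code; the `level_scores` values are hypothetical):

```python
def achieved_level(level_scores: dict[int, float], threshold: float = 80.0) -> int:
    """Highest level whose criteria, and every prior level's, pass at >= threshold."""
    achieved = 0
    for level in sorted(level_scores):
        if level_scores[level] >= threshold:
            achieved = level
        else:
            break  # a failed level blocks everything above it
    return max(achieved, 1)  # every repository is at least L1 (Initial)

# L1-L3 clear the 80% bar, L4 does not, so L5's score is irrelevant
# even though it exceeds the threshold on its own.
scores = {1: 100.0, 2: 92.0, 3: 85.0, 4: 60.0, 5: 90.0}
print(achieved_level(scores))  # → 3
```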
See `references/maturity-levels.md` for detailed level requirements.
## Interpreting Results
### Pass vs Fail vs Skip
- ✓ **Pass**: Criterion met (contributes to score)
- ✗ **Fail**: Criterion not met (opportunity for improvement)
- — **Skip**: Not applicable to this repository type (excluded from score)
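Because skips are excluded, the score's denominator counts only applicable criteria. A minimal sketch of that arithmetic (status strings are illustrative, mirroring the pass/fail/skip legend above):

```python
def pass_rate(statuses: list[str]) -> float:
    """Percentage of applicable (non-skipped) criteria that pass."""
    applicable = [s for s in statuses if s != "skip"]
    if not applicable:
        return 0.0  # nothing applicable to score
    return round(100 * applicable.count("pass") / len(applicable), 1)

# 4 pass, 1 fail, 1 skip → 4 of 5 applicable criteria pass
print(pass_rate(["pass", "pass", "fail", "pass", "skip", "pass"]))  # → 80.0
```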
### Priority Order
Fix gaps in this order:
1. **L1-L2 failures**: Foundation issues blocking basic agent operation
2. **L3 failures**: Production readiness gaps
3. **High-impact L4+ failures**: Optimization opportunities
### Common Quick Wins
1. **Add AGENTS.md**: Document commands, architecture, and workflows for AI agents
2. **Configure pre-commit hooks**: Catch style issues before CI
3. **Add PR/issue templates**: Structure task discovery
4. **Document single-command setup**: Enable fast environment provisioning
## Resources
- `scripts/analyze_repo.py` - Repository analysis script
- `scripts/generate_report.py` - Report generation and formatting
- `references/criteria.md` - Complete criteria definitions by pillar
- `references/maturity-levels.md` - Detailed level requirements
## Automated Remediation
After reviewing the report, common fixes can be automated:
- Generate AGENTS.md from repository structure
- Add missing issue/PR templates
- Configure standard linters and formatters
- Set up pre-commit hooks
Ask to "fix readiness gaps" to begin automated remediation of failing criteria.
---
## Referenced Files
> The following files are referenced in this skill and included for context.
### scripts/analyze_repo.py
```python
#!/usr/bin/env python3
"""Repository Readiness Analyzer.
Analyzes a repository across nine technical pillars to determine
agent readiness. Outputs a JSON file with detailed criteria evaluation.
Usage:
python analyze_repo.py --repo-path /path/to/repo
python analyze_repo.py --repo-path . --output /tmp/analysis.json
"""
import argparse
import json
import os
import re
import subprocess
from dataclasses import asdict, dataclass, field
from enum import StrEnum
from pathlib import Path
class CriterionStatus(StrEnum):
PASS = "pass"
FAIL = "fail"
SKIP = "skip"
@dataclass
class CriterionResult:
id: str
pillar: str
level: int
status: CriterionStatus
score: str # "1/1", "0/1", or "—/—"
reason: str
@dataclass
class PillarResult:
name: str
passed: int
total: int
criteria: list[CriterionResult] = field(default_factory=list)
@property
def percentage(self) -> int:
if self.total == 0:
return 100
return int((self.passed / self.total) * 100)
@dataclass
class AnalysisResult:
repo_path: str
repo_name: str
pillars: dict[str, PillarResult] = field(default_factory=dict)
level_scores: dict[int, float] = field(default_factory=dict)
achieved_level: int = 1
pass_rate: float = 0.0
total_passed: int = 0
total_criteria: int = 0
repo_type: str = "application" # library, cli, database, monorepo, application
languages: list[str] = field(default_factory=list)
class RepoAnalyzer:
"""Analyzes repository for agent readiness criteria."""
def __init__(self, repo_path: str):
self.repo_path = Path(repo_path).resolve()
self.result = AnalysisResult(
repo_path=str(self.repo_path), repo_name=self.repo_path.name
)
self._file_cache: dict[str, bool] = {}
self._content_cache: dict[str, str] = {}
def analyze(self) -> AnalysisResult:
"""Run full analysis and return results."""
self._detect_repo_type()
self._detect_languages()
self._evaluate_all_pillars()
self._calculate_levels()
return self.result
def _file_exists(self, *patterns: str) -> bool:
"""Check if any of the given file patterns exist."""
for pattern in patterns:
cache_key = f"exists:{pattern}"
if cache_key in self._file_cache:
if self._file_cache[cache_key]:
return True
continue
# Handle glob patterns
if "*" in pattern:
matches = list(self.repo_path.glob(pattern))
exists = len(matches) > 0
else:
exists = (self.repo_path / pattern).exists()
self._file_cache[cache_key] = exists
if exists:
return True
return False
def _read_file(self, path: str) -> str | None:
"""Read file content, with caching."""
if path in self._content_cache:
return self._content_cache[path]
full_path = self.repo_path / path
if not full_path.exists():
return None
try:
content = full_path.read_text(errors="ignore")
self._content_cache[path] = content
return content
except Exception:
return None
def _search_files(self, pattern: str, content_pattern: str | None = None) -> bool:
"""Search for files matching pattern, optionally with content."""
matches = list(self.repo_path.glob(pattern))
if not matches:
return False
if content_pattern is None:
return True
regex = re.compile(content_pattern, re.IGNORECASE)
for match in matches[:10]: # Limit for performance
try:
content = match.read_text(errors="ignore")
if regex.search(content):
return True
except Exception:
continue
return False
def _run_command(self, cmd: list[str], timeout: int = 10) -> tuple[int, str]:
"""Run a command and return (exit_code, output)."""
try:
result = subprocess.run(
cmd, cwd=self.repo_path, capture_output=True, text=True, timeout=timeout
)
return result.returncode, result.stdout + result.stderr
except Exception as e:
return -1, str(e)
def _detect_repo_type(self):
"""Detect repository type for criterion skipping."""
# Check for library indicators
if (
self._file_exists("setup.py", "setup.cfg")
and not self._file_exists("Dockerfile")
and "library" in str(self._read_file("setup.py") or "").lower()
):
self.result.repo_type = "library"
return
if self._file_exists("pyproject.toml"):
content = self._read_file("pyproject.toml") or ""
if "[project]" in content and "Dockerfile" not in os.listdir(
self.repo_path
):
# Likely a library
readme = self._read_file("README.md") or ""
if "pip install" in readme.lower() and "docker" not in readme.lower():
self.result.repo_type = "library"
return
# Check for CLI tool
if self._file_exists("**/cli.py", "**/main.py", "**/cmd/**"):
readme = self._read_file("README.md") or ""
if any(x in readme.lower() for x in ["command line", "cli", "usage:"]):
self.result.repo_type = "cli"
return
# Check for database project
if (
"database" in self.result.repo_name.lower()
or "db" in self.result.repo_name.lower()
):
self.result.repo_type = "database"
return
# Check for monorepo
if self._file_exists(
"packages/*", "apps/*", "lerna.json", "pnpm-workspace.yaml"
):
self.result.repo_type = "monorepo"
return
self.result.repo_type = "application"
def _detect_languages(self):
"""Detect primary programming languages."""
languages = []
if self._file_exists("*.py", "**/*.py", "pyproject.toml", "setup.py"):
languages.append("Python")
if self._file_exists("*.ts", "**/*.ts", "tsconfig.json"):
languages.append("TypeScript")
if (
self._file_exists("*.js", "**/*.js", "package.json")
and "TypeScript" not in languages
):
languages.append("JavaScript")
if self._file_exists("*.go", "**/*.go", "go.mod"):
languages.append("Go")
if self._file_exists("*.rs", "**/*.rs", "Cargo.toml"):
languages.append("Rust")
if self._file_exists("*.java", "**/*.java", "pom.xml", "build.gradle"):
languages.append("Java")
if self._file_exists("*.rb", "**/*.rb", "Gemfile"):
languages.append("Ruby")
if self._file_exists("*.cpp", "*.cc", "**/*.cpp", "CMakeLists.txt"):
languages.append("C++")
self.result.languages = languages if languages else ["Unknown"]
def _should_skip(self, criterion_id: str) -> tuple[bool, str]:
"""Determine if a criterion should be skipped based on repo type."""
repo_type = self.result.repo_type
skip_rules = {
"library": [
("health_checks", "Library, not a deployed service"),
("progressive_rollout", "Not applicable for a library"),
("rollback_automation", "Not applicable for a library"),
("dast_scanning", "Library, not a web service"),
("alerting_configured", "Library without runtime"),
("deployment_observability", "Library without deployments"),
("metrics_collection", "Library without runtime"),
("profiling_instrumentation", "Library where profiling not meaningful"),
("circuit_breakers", "Library without external dependencies"),
("distributed_tracing", "Library without runtime"),
("local_services_setup", "Library without external dependencies"),
("database_schema", "Library without database"),
("n_plus_one_detection", "Library without database/ORM"),
("privacy_compliance", "Library without user data"),
("pii_handling", "Library without user data"),
],
"database": [
("n_plus_one_detection", "Database project IS the database layer"),
("dast_scanning", "Database server, not web application"),
],
"cli": [
("dast_scanning", "CLI tool, not web application"),
("health_checks", "CLI tool, not a service"),
("progressive_rollout", "CLI tool without deployments"),
],
}
for rule_type, rules in skip_rules.items():
if repo_type == rule_type:
for crit_id, reason in rules:
if criterion_id == crit_id:
return True, reason
# Skip monorepo criteria for non-monorepos
if repo_type != "monorepo" and criterion_id in [
"monorepo_tooling",
"version_drift_detection",
]:
return True, "Single-application repository, not a monorepo"
# Skip prerequisites
if criterion_id == "devcontainer_runnable" and not self._file_exists(
".devcontainer/devcontainer.json"
):
return True, "No devcontainer to test (prerequisite failed)"
if criterion_id == "agents_md_validation" and not self._file_exists(
"AGENTS.md", "CLAUDE.md"
):
return True, "No AGENTS.md exists (prerequisite failed)"
if (
criterion_id == "dead_feature_flag_detection"
and not self._check_feature_flags()
):
return True, "No feature flag infrastructure (prerequisite failed)"
return False, ""
def _check_feature_flags(self) -> bool:
"""Check if feature flag infrastructure exists."""
# Check for common feature flag services
patterns = [
"launchdarkly",
"statsig",
"unleash",
"growthbook",
"feature.flag",
"featureflag",
"feature_flag",
]
for pattern in ["package.json", "requirements.txt", "go.mod", "Gemfile"]:
content = self._read_file(pattern)
if content:
for flag_pattern in patterns:
if flag_pattern in content.lower():
return True
return False
def _evaluate_all_pillars(self):
"""Evaluate all criteria across all pillars."""
pillars = {
"Style & Validation": self._evaluate_style_validation,
"Build System": self._evaluate_build_system,
"Testing": self._evaluate_testing,
"Documentation": self._evaluate_documentation,
"Dev Environment": self._evaluate_dev_environment,
"Debugging & Observability": self._evaluate_observability,
"Security": self._evaluate_security,
"Task Discovery": self._evaluate_task_discovery,
"Product & Analytics": self._evaluate_product_analytics,
}
for pillar_name, evaluate_func in pillars.items():
criteria = evaluate_func()
passed = sum(1 for c in criteria if c.status == CriterionStatus.PASS)
total = sum(1 for c in criteria if c.status != CriterionStatus.SKIP)
self.result.pillars[pillar_name] = PillarResult(
name=pillar_name, passed=passed, total=total, criteria=criteria
)
self.result.total_passed += passed
self.result.total_criteria += total
if self.result.total_criteria > 0:
self.result.pass_rate = round(
(self.result.total_passed / self.result.total_criteria) * 100, 1
)
def _make_result(
self, criterion_id: str, pillar: str, level: int, passed: bool, reason: str
) -> CriterionResult:
"""Create a criterion result, handling skips."""
should_skip, skip_reason = self._should_skip(criterion_id)
if should_skip:
return CriterionResult(
id=criterion_id,
pillar=pillar,
level=level,
status=CriterionStatus.SKIP,
score="—/—",
reason=skip_reason,
)
return CriterionResult(
id=criterion_id,
pillar=pillar,
level=level,
status=CriterionStatus.PASS if passed else CriterionStatus.FAIL,
score="1/1" if passed else "0/1",
reason=reason,
)
def _evaluate_style_validation(self) -> list[CriterionResult]: # noqa: C901
"""Evaluate Style & Validation pillar."""
pillar = "Style & Validation"
results = []
# L1: formatter
formatter_found = self._file_exists(
".prettierrc",
".prettierrc.json",
".prettierrc.js",
"prettier.config.js",
"pyproject.toml",
".black.toml", # Black/Ruff
".gofmt", # Go uses gofmt by default
"rustfmt.toml",
".rustfmt.toml",
)
if not formatter_found:
# Check pyproject.toml for ruff format
pyproject = self._read_file("pyproject.toml") or ""
formatter_found = (
"ruff" in pyproject.lower() or "black" in pyproject.lower()
)
results.append(
self._make_result(
"formatter",
pillar,
1,
formatter_found,
"Formatter configured"
if formatter_found
else "No formatter config found",
)
)
# L1: lint_config
lint_found = self._file_exists(
".eslintrc",
".eslintrc.js",
".eslintrc.json",
".eslintrc.yaml",
"eslint.config.js",
"eslint.config.mjs",
".pylintrc",
"pylintrc",
"golangci.yml",
".golangci.yml",
".golangci.yaml",
)
if not lint_found:
pyproject = self._read_file("pyproject.toml") or ""
lint_found = "ruff" in pyproject.lower() or "pylint" in pyproject.lower()
results.append(
self._make_result(
"lint_config",
pillar,
1,
lint_found,
"Linter configured" if lint_found else "No linter config found",
)
)
# L1: type_check
type_check = False
if "Go" in self.result.languages or "Rust" in self.result.languages:
type_check = True # Statically typed by default
elif self._file_exists("tsconfig.json"):
type_check = True
elif self._file_exists("pyproject.toml"):
content = self._read_file("pyproject.toml") or ""
type_check = "mypy" in content.lower()
results.append(
self._make_result(
"type_check",
pillar,
1,
type_check,
"Type checking configured" if type_check else "No type checking found",
)
)
# L2: strict_typing
strict_typing = False
if "Go" in self.result.languages or "Rust" in self.result.languages:
strict_typing = True
elif self._file_exists("tsconfig.json"):
content = self._read_file("tsconfig.json") or ""
strict_typing = '"strict": true' in content or '"strict":true' in content
elif self._file_exists("pyproject.toml"):
content = self._read_file("pyproject.toml") or ""
strict_typing = "strict = true" in content or "strict=true" in content
results.append(
self._make_result(
"strict_typing",
pillar,
2,
strict_typing,
"Strict typing enabled"
if strict_typing
else "Strict typing not enabled",
)
)
# L2: pre_commit_hooks
pre_commit = self._file_exists(
".pre-commit-config.yaml", ".pre-commit-config.yml", ".husky", ".husky/*"
)
results.append(
self._make_result(
"pre_commit_hooks",
pillar,
2,
pre_commit,
"Pre-commit hooks configured"
if pre_commit
else "No pre-commit hooks found",
)
)
# L2: naming_consistency
naming = False
eslint = self._read_file(".eslintrc.json") or self._read_file(".eslintrc") or ""
if "naming" in eslint.lower() or "@typescript-eslint/naming" in eslint:
naming = True
agents_md = self._read_file("AGENTS.md") or self._read_file("CLAUDE.md") or ""
if "naming" in agents_md.lower() or "convention" in agents_md.lower():
naming = True
# Go uses stdlib naming by default
if "Go" in self.result.languages:
naming = True
results.append(
self._make_result(
"naming_consistency",
pillar,
2,
naming,
"Naming conventions enforced"
if naming
else "No naming convention enforcement",
)
)
# L2: large_file_detection
large_file = self._file_exists(".gitattributes", ".lfsconfig")
if not large_file:
pre_commit_cfg = self._read_file(".pre-commit-config.yaml") or ""
large_file = "check-added-large-files" in pre_commit_cfg
results.append(
self._make_result(
"large_file_detection",
pillar,
2,
large_file,
"Large file detection configured"
if large_file
else "No large file detection",
)
)
# L3: code_modularization
modular = self._file_exists(".importlinter", "nx.json", "BUILD.bazel", "BUILD")
results.append(
self._make_result(
"code_modularization",
pillar,
3,
modular,
"Module boundaries enforced"
if modular
else "No module boundary enforcement",
)
)
# L3: cyclomatic_complexity
complexity = False
for config in [".golangci.yml", ".golangci.yaml", "pyproject.toml"]:
content = self._read_file(config) or ""
if any(
x in content.lower()
for x in ["gocyclo", "mccabe", "complexity", "radon"]
):
complexity = True
break
results.append(
self._make_result(
"cyclomatic_complexity",
pillar,
3,
complexity,
"Complexity analysis configured"
if complexity
else "No complexity analysis",
)
)
# L3: dead_code_detection
dead_code = False
workflows = list(self.repo_path.glob(".github/workflows/*.yml")) + list(
self.repo_path.glob(".github/workflows/*.yaml")
)
for wf in workflows[:5]:
content = wf.read_text(errors="ignore")
if any(x in content.lower() for x in ["vulture", "knip", "deadcode"]):
dead_code = True
break
results.append(
self._make_result(
"dead_code_detection",
pillar,
3,
dead_code,
"Dead code detection enabled"
if dead_code
else "No dead code detection",
)
)
# L3: duplicate_code_detection
duplicate = False
for wf in workflows[:5]:
content = wf.read_text(errors="ignore")
if any(x in content.lower() for x in ["jscpd", "pmd cpd", "sonarqube"]):
duplicate = True
break
results.append(
self._make_result(
"duplicate_code_detection",
pillar,
3,
duplicate,
"Duplicate detection enabled"
if duplicate
else "No duplicate detection",
)
)
# L4: tech_debt_tracking
tech_debt = False
for wf in workflows[:5]:
content = wf.read_text(errors="ignore")
if any(x in content.lower() for x in ["todo", "fixme", "sonar"]):
tech_debt = True
break
results.append(
self._make_result(
"tech_debt_tracking",
pillar,
4,
tech_debt,
"Tech debt tracking enabled" if tech_debt else "No tech debt tracking",
)
)
# L4: n_plus_one_detection
n_plus_one = False
deps = (
(self._read_file("requirements.txt") or "")
+ (self._read_file("Gemfile") or "")
+ (self._read_file("package.json") or "")
)
if any(x in deps.lower() for x in ["nplusone", "bullet", "query-analyzer"]):
n_plus_one = True
results.append(
self._make_result(
"n_plus_one_detection",
pillar,
4,
n_plus_one,
"N+1 detection enabled" if n_plus_one else "No N+1 query detection",
)
)
return results
def _evaluate_build_system(self) -> list[CriterionResult]:
"""Evaluate Build System pillar."""
pillar = "Build System"
results = []
# L1: build_cmd_doc
readme = self._read_file("README.md") or ""
agents_md = self._read_file("AGENTS.md") or self._read_file("CLAUDE.md") or ""
build_doc = any(
x in (readme + agents_md).lower()
for x in [
"npm run",
"yarn",
"pnpm",
"make",
"cargo build",
"go build",
"pip install",
"python setup.py",
"gradle",
"mvn",
]
)
results.append(
self._make_result(
"build_cmd_doc",
pillar,
1,
build_doc,
"Build commands documented"
if build_doc
else "Build commands not documented",
)
)
# L1: deps_pinned
lockfile = self._file_exists(
"package-lock.json",
"yarn.lock",
"pnpm-lock.yaml",
"uv.lock",
"poetry.lock",
"Pipfile.lock",
"go.sum",
"Cargo.lock",
"Gemfile.lock",
)
results.append(
self._make_result(
"deps_pinned",
pillar,
1,
lockfile,
"Dependencies pinned with lockfile"
if lockfile
else "No lockfile found",
)
)
# L1: vcs_cli_tools
code, output = self._run_command(["gh", "auth", "status"])
vcs_cli = code == 0
if not vcs_cli:
code, output = self._run_command(["glab", "auth", "status"])
vcs_cli = code == 0
results.append(
self._make_result(
"vcs_cli_tools",
pillar,
1,
vcs_cli,
"VCS CLI authenticated" if vcs_cli else "VCS CLI not authenticated",
)
)
# L2: fast_ci_feedback
# Check for CI config existence as proxy
ci_exists = self._file_exists(
".github/workflows/*.yml",
".github/workflows/*.yaml",
".gitlab-ci.yml",
".circleci/config.yml",
"Jenkinsfile",
".travis.yml",
)
results.append(
self._make_result(
"fast_ci_feedback",
pillar,
2,
ci_exists,
"CI workflow configured" if ci_exists else "No CI configuration found",
)
)
# L2: single_command_setup
single_cmd = any(
x in (readme + agents_md).lower()
for x in [
"make install",
"npm install",
"yarn install",
"pip install -e",
"docker-compose up",
"./dev",
"make setup",
"just",
]
)
results.append(
self._make_result(
"single_command_setup",
pillar,
2,
single_cmd,
"Single command setup documented"
if single_cmd
else "No single command setup",
)
)
# L2: release_automation
release_auto = self._search_files(
".github/workflows/*.yml", r"(release|publish|deploy)"
) or self._search_files(".github/workflows/*.yaml", r"(release|publish|deploy)")
results.append(
self._make_result(
"release_automation",
pillar,
2,
release_auto,
"Release automation configured"
if release_auto
else "No release automation",
)
)
# L2: deployment_frequency (check for recent releases)
release_auto_exists = release_auto # Use same check as proxy
results.append(
self._make_result(
"deployment_frequency",
pillar,
2,
release_auto_exists,
"Regular deployments"
if release_auto_exists
else "Deployment frequency unclear",
)
)
# L3: release_notes_automation
release_notes = self._search_files(
".github/workflows/*.yml", r"(changelog|release.notes|latest.changes)"
)
results.append(
self._make_result(
"release_notes_automation",
pillar,
3,
release_notes,
"Release notes automated"
if release_notes
else "No release notes automation",
)
)
# L3: agentic_development
code, output = self._run_command(["git", "log", "--oneline", "-50"])
agentic = any(
x in output.lower()
for x in ["co-authored-by", "droid", "copilot", "claude", "gpt", "ai agent"]
)
results.append(
self._make_result(
"agentic_development",
pillar,
3,
agentic,
"AI agent commits found" if agentic else "No AI agent commits detected",
)
)
# L3: automated_pr_review
pr_review = self._file_exists("danger.js", "dangerfile.js", "dangerfile.ts")
if not pr_review:
workflows = list(self.repo_path.glob(".github/workflows/*.yml"))
for wf in workflows[:5]:
content = wf.read_text(errors="ignore")
if any(x in content.lower() for x in ["review", "danger", "lint-pr"]):
pr_review = True
break
results.append(
self._make_result(
"automated_pr_review",
pillar,
3,
pr_review,
"Automated PR review configured"
if pr_review
else "No automated PR review",
)
)
# L3: feature_flag_infrastructure
feature_flags = self._check_feature_flags()
results.append(
self._make_result(
"feature_flag_infrastructure",
pillar,
3,
feature_flags,
"Feature flags configured"
if feature_flags
else "No feature flag system",
)
)
# L4: build_performance_tracking
build_perf = False
if self._file_exists("turbo.json", "nx.json"):
build_perf = True
results.append(
self._make_result(
"build_performance_tracking",
pillar,
4,
build_perf,
"Build caching configured"
if build_perf
else "No build performance tracking",
)
)
# L4: heavy_dependency_detection (for JS bundles)
heavy_deps = False
pkg_json = self._read_file("package.json") or ""
if any(
x in pkg_json.lower()
for x in ["webpack-bundle-analyzer", "bundlesize", "size-limit"]
):
heavy_deps = True
results.append(
self._make_result(
"heavy_dependency_detection",
pillar,
4,
heavy_deps,
"Bundle size tracking configured"
if heavy_deps
else "No bundle size tracking",
)
)
# L4: unused_dependencies_detection
unused_deps = False
workflows = list(self.repo_path.glob(".github/workflows/*.yml"))
for wf in workflows[:5]:
content = wf.read_text(errors="ignore")
if any(x in content.lower() for x in ["depcheck", "deptry", "go mod tidy"]):
unused_deps = True
break
results.append(
self._make_result(
"unused_dependencies_detection",
pillar,
4,
unused_deps,
"Unused deps detection enabled"
if unused_deps
else "No unused deps detection",
)
)
# L4: dead_feature_flag_detection
dead_flags = False # Requires feature flag infra first
results.append(
self._make_result(
"dead_feature_flag_detection",
pillar,
4,
dead_flags,
"Dead flag detection enabled"
if dead_flags
else "No dead flag detection",
)
)
# L4: monorepo_tooling
monorepo_tools = self._file_exists(
"lerna.json", "nx.json", "turbo.json", "pnpm-workspace.yaml"
)
results.append(
self._make_result(
"monorepo_tooling",
pillar,
4,
monorepo_tools,
"Monorepo tooling configured"
if monorepo_tools
else "No monorepo tooling",
)
)
# L4: version_drift_detection
version_drift = False # Complex to detect
results.append(
self._make_result(
"version_drift_detection",
pillar,
4,
version_drift,
"Version drift detection enabled"
if version_drift
else "No version drift detection",
)
)
# L5: progressive_rollout
progressive = False
for pattern in ["*.yml", "*.yaml"]:
if self._search_files(
f".github/workflows/{pattern}", r"canary|gradual|rollout"
):
progressive = True
break
results.append(
self._make_result(
"progressive_rollout",
pillar,
5,
progressive,
"Progressive rollout configured"
if progressive
else "No progressive rollout",
)
)
# L5: rollback_automation
rollback = False
results.append(
self._make_result(
"rollback_automation",
pillar,
5,
rollback,
"Rollback automation configured"
if rollback
else "No rollback automation",
)
)
return results
def _evaluate_testing(self) -> list[CriterionResult]:
"""Evaluate Testing pillar."""
pillar = "Testing"
results = []
# L1: unit_tests_exist
tests_exist = self._file_exists(
"tests/**/*.py",
"test/**/*.py",
"*_test.py",
"*_test.go",
"**/*.spec.ts",
"**/*.spec.js",
"**/*.test.ts",
"**/*.test.js",
"spec/**/*.rb",
"tests/**/*.rs",
)
results.append(
self._make_result(
"unit_tests_exist",
pillar,
1,
tests_exist,
"Unit tests found" if tests_exist else "No unit tests found",
)
)
# L1: unit_tests_runnable
readme = self._read_file("README.md") or ""
agents_md = self._read_file("AGENTS.md") or self._read_file("CLAUDE.md") or ""
test_cmd = any(
x in (readme + agents_md).lower()
for x in [
"pytest",
"npm test",
"yarn test",
"go test",
"cargo test",
"make test",
"rake test",
"rspec",
"jest",
]
)
results.append(
self._make_result(
"unit_tests_runnable",
pillar,
1,
test_cmd,
"Test commands documented"
if test_cmd
else "Test commands not documented",
)
)
# L2: test_naming_conventions
naming = False
if self._file_exists("pyproject.toml"):
content = self._read_file("pyproject.toml") or ""
naming = "pytest" in content.lower()
if self._file_exists("jest.config.js", "jest.config.ts"):
naming = True
if "Go" in self.result.languages:
naming = True # Go has standard _test.go convention
results.append(
self._make_result(
"test_naming_conventions",
pillar,
2,
naming,
"Test naming conventions enforced"
if naming
else "No test naming conventions",
)
)
# L2: test_isolation
isolation = False
if self._file_exists("pyproject.toml"):
content = self._read_file("pyproject.toml") or ""
isolation = "pytest-xdist" in content or "-n auto" in content
workflows = list(self.repo_path.glob(".github/workflows/*.yml"))
for wf in workflows[:5]:
content = wf.read_text(errors="ignore")
if "matrix" in content.lower():
isolation = True
break
if "Go" in self.result.languages:
isolation = True # Go tests run in parallel by default
results.append(
self._make_result(
"test_isolation",
pillar,
2,
isolation,
"Tests support isolation/parallelism"
if isolation
else "No test isolation",
)
)
# L3: integration_tests_exist
integration = self._file_exists(
"tests/integration/**",
"integration/**",
"e2e/**",
"tests/e2e/**",
"cypress/**",
"playwright.config.*",
)
results.append(
self._make_result(
"integration_tests_exist",
pillar,
3,
integration,
"Integration tests found"
if integration
else "No integration tests found",
)
)
# L3: test_coverage_thresholds
coverage = False
for wf in workflows[:5]:
content = wf.read_text(errors="ignore")
if any(x in content.lower() for x in ["coverage", "codecov", "coveralls"]):
coverage = True
break
if self._file_exists(".coveragerc", "coverage.xml", "codecov.yml"):
coverage = True
results.append(
self._make_result(
"test_coverage_thresholds",
pillar,
3,
coverage,
"Coverage thresholds enforced"
if coverage
else "No coverage thresholds",
)
)
# L4: flaky_test_detection
flaky = False
for wf in workflows[:5]:
content = wf.read_text(errors="ignore")
if any(
x in content.lower() for x in ["retry", "flaky", "quarantine", "rerun"]
):
flaky = True
break
results.append(
self._make_result(
"flaky_test_detection",
pillar,
4,
flaky,
"Flaky test handling configured"
if flaky
else "No flaky test detection",
)
)
# L4: test_performance_tracking
test_perf = False
for wf in workflows[:5]:
content = wf.read_text(errors="ignore")
if any(x in content.lower() for x in ["durations", "timing", "benchmark"]):
test_perf = True
break
results.append(
self._make_result(
"test_performance_tracking",
pillar,
4,
test_perf,
"Test performance tracked"
if test_perf
else "No test performance tracking",
)
)
return results
def _evaluate_documentation(self) -> list[CriterionResult]:
"""Evaluate Documentation pillar."""
pillar = "Documentation"
results = []
# L1: readme
readme = self._file_exists("README.md", "README.rst", "README.txt", "README")
results.append(
self._make_result(
"readme",
pillar,
1,
readme,
"README exists" if readme else "No README found",
)
)
# L2: agents_md
agents_md = self._file_exists("AGENTS.md", "CLAUDE.md")
results.append(
self._make_result(
"agents_md",
pillar,
2,
agents_md,
"AGENTS.md exists" if agents_md else "No AGENTS.md found",
)
)
        # L2: documentation_freshness (README committed within ~180 days)
        import time  # local import used only for this check

        freshness = False
        code, output = self._run_command(
            ["git", "log", "-1", "--format=%ct", "--", "README.md"]
        )
        if code == 0 and output.strip().isdigit():
            age_days = (time.time() - int(output.strip())) / 86400
            freshness = age_days <= 180
results.append(
self._make_result(
"documentation_freshness",
pillar,
2,
freshness,
"Documentation recently updated"
if freshness
else "Documentation may be stale",
)
)
# L3: api_schema_docs
api_docs = self._file_exists(
"openapi.yaml",
"openapi.json",
"swagger.yaml",
"swagger.json",
"schema.graphql",
"*.graphql",
"docs/api/**",
"api-docs/**",
)
results.append(
self._make_result(
"api_schema_docs",
pillar,
3,
api_docs,
"API documentation exists"
if api_docs
else "No API documentation found",
)
)
# L3: automated_doc_generation
doc_gen = self._search_files(
".github/workflows/*.yml", r"(docs|documentation|mkdocs|sphinx|typedoc)"
)
results.append(
self._make_result(
"automated_doc_generation",
pillar,
3,
doc_gen,
"Doc generation automated"
if doc_gen
else "No automated doc generation",
)
)
# L3: service_flow_documented
diagrams = self._file_exists(
"**/*.mermaid", "**/*.puml", "docs/architecture*", "docs/**/*.md"
)
results.append(
self._make_result(
"service_flow_documented",
pillar,
3,
diagrams,
"Architecture documented"
if diagrams
else "No architecture documentation",
)
)
# L3: skills
skills = self._file_exists(
".claude/skills/**", ".factory/skills/**", ".skills/**"
)
results.append(
self._make_result(
"skills",
pillar,
3,
skills,
"Skills directory exists" if skills else "No skills directory",
)
)
# L4: agents_md_validation
agents_validation = False
workflows = list(self.repo_path.glob(".github/workflows/*.yml"))
for wf in workflows[:5]:
content = wf.read_text(errors="ignore")
if any(x in content.lower() for x in ["agents.md", "claude.md"]):
agents_validation = True
break
results.append(
self._make_result(
"agents_md_validation",
pillar,
4,
agents_validation,
"AGENTS.md validation in CI"
if agents_validation
else "No AGENTS.md validation",
)
)
return results
def _evaluate_dev_environment(self) -> list[CriterionResult]:
"""Evaluate Dev Environment pillar."""
pillar = "Dev Environment"
results = []
# L2: env_template
env_template = self._file_exists(".env.example", ".env.template", ".env.sample")
if not env_template:
readme = self._read_file("README.md") or ""
agents_md = self._read_file("AGENTS.md") or ""
env_template = "environment variable" in (readme + agents_md).lower()
results.append(
self._make_result(
"env_template",
pillar,
2,
env_template,
"Environment template exists"
if env_template
else "No environment template",
)
)
# L3: devcontainer
devcontainer = self._file_exists(".devcontainer/devcontainer.json")
results.append(
self._make_result(
"devcontainer",
pillar,
3,
devcontainer,
"Devcontainer configured" if devcontainer else "No devcontainer found",
)
)
# L3: devcontainer_runnable
devcontainer_valid = False
if devcontainer:
content = self._read_file(".devcontainer/devcontainer.json")
if content and "image" in content.lower():
devcontainer_valid = True
results.append(
self._make_result(
"devcontainer_runnable",
pillar,
3,
devcontainer_valid,
"Devcontainer appears valid"
if devcontainer_valid
else "Devcontainer not runnable",
)
)
# L3: database_schema
db_schema = self._file_exists(
"migrations/**",
"db/migrations/**",
"alembic/**",
"prisma/schema.prisma",
"schema.sql",
"db/schema.rb",
)
results.append(
self._make_result(
"database_schema",
pillar,
3,
db_schema,
"Database schema managed"
if db_schema
else "No database schema management",
)
)
# L3: local_services_setup
local_services = self._file_exists(
"docker-compose.yml", "docker-compose.yaml", "compose.yml", "compose.yaml"
)
results.append(
self._make_result(
"local_services_setup",
pillar,
3,
local_services,
"Local services configured"
if local_services
else "No local services setup",
)
)
return results
def _evaluate_observability(self) -> list[CriterionResult]:
"""Evaluate Debugging & Observability pillar."""
pillar = "Debugging & Observability"
results = []
# L2: structured_logging
logging_found = False
deps = (
(self._read_file("package.json") or "")
+ (self._read_file("requirements.txt") or "")
+ (self._read_file("go.mod") or "")
)
if any(
x in deps.lower()
for x in [
"pino",
"winston",
"bunyan",
"structlog",
"loguru",
"zerolog",
"zap",
"slog",
]
):
logging_found = True
if "Python" in self.result.languages and self._search_files(
"**/*.py", r"import logging"
):
logging_found = True
results.append(
self._make_result(
"structured_logging",
pillar,
2,
logging_found,
"Structured logging configured"
if logging_found
else "No structured logging",
)
)
# L2: code_quality_metrics
quality_metrics = self._search_files(
".github/workflows/*.yml", r"(coverage|sonar|quality)"
)
results.append(
self._make_result(
"code_quality_metrics",
pillar,
2,
quality_metrics,
"Code quality metrics tracked"
if quality_metrics
else "No quality metrics",
)
)
# L3: error_tracking_contextualized
error_tracking = any(
x in deps.lower() for x in ["sentry", "bugsnag", "rollbar", "honeybadger"]
)
results.append(
self._make_result(
"error_tracking_contextualized",
pillar,
3,
error_tracking,
"Error tracking configured" if error_tracking else "No error tracking",
)
)
# L3: distributed_tracing
tracing = any(
x in deps.lower()
for x in ["opentelemetry", "jaeger", "zipkin", "datadog", "x-request-id"]
)
results.append(
self._make_result(
"distributed_tracing",
pillar,
3,
tracing,
"Distributed tracing configured"
if tracing
else "No distributed tracing",
)
)
# L3: metrics_collection
metrics = any(
x in deps.lower()
for x in ["prometheus", "datadog", "newrelic", "statsd", "cloudwatch"]
)
results.append(
self._make_result(
"metrics_collection",
pillar,
3,
metrics,
"Metrics collection configured" if metrics else "No metrics collection",
)
)
# L3: health_checks
health = (
self._search_files("**/*.py", r"health|ready|alive")
or self._search_files("**/*.ts", r"health|ready|alive")
or self._search_files("**/*.go", r"health|ready|alive")
)
results.append(
self._make_result(
"health_checks",
pillar,
3,
health,
"Health checks implemented" if health else "No health checks found",
)
)
# L4: profiling_instrumentation
profiling = any(
x in deps.lower() for x in ["pyinstrument", "py-spy", "pprof", "clinic"]
)
results.append(
self._make_result(
"profiling_instrumentation",
pillar,
4,
profiling,
"Profiling configured" if profiling else "No profiling instrumentation",
)
)
# L4: alerting_configured
alerting = self._file_exists(
"**/alerts*.yml", "**/alertmanager*", "monitoring/**"
)
results.append(
self._make_result(
"alerting_configured",
pillar,
4,
alerting,
"Alerting configured" if alerting else "No alerting configuration",
)
)
# L4: deployment_observability
deploy_obs = self._search_files(
".github/workflows/*.yml", r"(datadog|grafana|newrelic|deploy.*notify)"
)
results.append(
self._make_result(
"deployment_observability",
pillar,
4,
deploy_obs,
"Deployment observability configured"
if deploy_obs
else "No deployment observability",
)
)
# L4: runbooks_documented
runbooks = self._file_exists("runbooks/**", "docs/runbooks/**", "ops/**")
results.append(
self._make_result(
"runbooks_documented",
pillar,
4,
runbooks,
"Runbooks documented" if runbooks else "No runbooks found",
)
)
# L5: circuit_breakers
circuit = any(
x in deps.lower()
for x in ["opossum", "resilience4j", "hystrix", "cockatiel"]
)
results.append(
self._make_result(
"circuit_breakers",
pillar,
5,
circuit,
"Circuit breakers configured" if circuit else "No circuit breakers",
)
)
return results
def _evaluate_security(self) -> list[CriterionResult]:
"""Evaluate Security pillar."""
pillar = "Security"
results = []
# L1: gitignore_comprehensive
gitignore = self._file_exists(".gitignore")
comprehensive = False
if gitignore:
content = self._read_file(".gitignore") or ""
comprehensive = any(
x in content.lower()
for x in [".env", "node_modules", "__pycache__", ".idea", ".vscode"]
)
results.append(
self._make_result(
"gitignore_comprehensive",
pillar,
1,
comprehensive,
"Comprehensive .gitignore"
if comprehensive
else "Incomplete .gitignore",
)
)
# L2: secrets_management
secrets_mgmt = False
workflows = list(self.repo_path.glob(".github/workflows/*.yml"))
for wf in workflows[:5]:
content = wf.read_text(errors="ignore")
if "secrets." in content:
secrets_mgmt = True
break
results.append(
self._make_result(
"secrets_management",
pillar,
2,
secrets_mgmt,
"Secrets properly managed" if secrets_mgmt else "No secrets management",
)
)
# L2: codeowners
codeowners = self._file_exists("CODEOWNERS", ".github/CODEOWNERS")
results.append(
self._make_result(
"codeowners",
pillar,
2,
codeowners,
"CODEOWNERS configured" if codeowners else "No CODEOWNERS file",
)
)
# L2: branch_protection
branch_rules = self._file_exists(
".github/branch-protection.yml", ".github/rulesets/**"
)
results.append(
self._make_result(
"branch_protection",
pillar,
2,
branch_rules,
"Branch protection configured"
if branch_rules
else "Branch protection unclear",
)
)
# L3: dependency_update_automation
dep_updates = self._file_exists(
".github/dependabot.yml", "renovate.json", ".renovaterc"
)
results.append(
self._make_result(
"dependency_update_automation",
pillar,
3,
dep_updates,
"Dependency updates automated"
if dep_updates
else "No dependency automation",
)
)
# L3: log_scrubbing
log_scrub = False
deps = (self._read_file("package.json") or "") + (
self._read_file("requirements.txt") or ""
)
if any(x in deps.lower() for x in ["pino", "redact", "scrub"]):
log_scrub = True
results.append(
self._make_result(
"log_scrubbing",
pillar,
3,
log_scrub,
"Log scrubbing configured" if log_scrub else "No log scrubbing",
)
)
# L3: pii_handling
pii = self._search_files(
"**/*.py", r"(redact|sanitize|mask|pii)"
) or self._search_files("**/*.ts", r"(redact|sanitize|mask|pii)")
results.append(
self._make_result(
"pii_handling",
pillar,
3,
pii,
"PII handling implemented" if pii else "No PII handling found",
)
)
# L4: automated_security_review
security_scan = self._search_files(
".github/workflows/*.yml", r"(codeql|snyk|sonar|security)"
)
results.append(
self._make_result(
"automated_security_review",
pillar,
4,
security_scan,
"Security scanning enabled"
if security_scan
else "No security scanning",
)
)
# L4: secret_scanning
secret_scan = self._search_files(
".github/workflows/*.yml", r"(gitleaks|trufflehog|secret)"
)
results.append(
self._make_result(
"secret_scanning",
pillar,
4,
secret_scan,
"Secret scanning enabled" if secret_scan else "No secret scanning",
)
)
# L5: dast_scanning
dast = self._search_files(".github/workflows/*.yml", r"(zap|dast|owasp|burp)")
results.append(
self._make_result(
"dast_scanning",
pillar,
5,
dast,
"DAST scanning enabled" if dast else "No DAST scanning",
)
)
# L5: privacy_compliance
privacy = self._file_exists("PRIVACY.md", "docs/privacy/**", "gdpr/**")
results.append(
self._make_result(
"privacy_compliance",
pillar,
5,
privacy,
"Privacy compliance documented"
if privacy
else "No privacy documentation",
)
)
return results
def _evaluate_task_discovery(self) -> list[CriterionResult]:
"""Evaluate Task Discovery pillar."""
pillar = "Task Discovery"
results = []
# L2: issue_templates
issue_templates = self._file_exists(
".github/ISSUE_TEMPLATE/**", ".github/ISSUE_TEMPLATE.md"
)
results.append(
self._make_result(
"issue_templates",
pillar,
2,
issue_templates,
"Issue templates configured"
if issue_templates
else "No issue templates",
)
)
# L2: issue_labeling_system
labels = False
if issue_templates:
templates = list(self.repo_path.glob(".github/ISSUE_TEMPLATE/*.md"))
for t in templates[:5]:
content = t.read_text(errors="ignore")
if "labels:" in content.lower():
labels = True
break
results.append(
self._make_result(
"issue_labeling_system",
pillar,
2,
labels,
"Issue labels configured" if labels else "No issue labeling system",
)
)
# L2: pr_templates
pr_template = self._file_exists(
".github/pull_request_template.md",
".github/PULL_REQUEST_TEMPLATE.md",
"pull_request_template.md",
)
results.append(
self._make_result(
"pr_templates",
pillar,
2,
pr_template,
"PR template configured" if pr_template else "No PR template",
)
)
# L3: backlog_health
backlog = self._file_exists("CONTRIBUTING.md", ".github/CONTRIBUTING.md")
results.append(
self._make_result(
"backlog_health",
pillar,
3,
backlog,
"Contributing guidelines exist"
if backlog
else "No contributing guidelines",
)
)
return results
def _evaluate_product_analytics(self) -> list[CriterionResult]:
"""Evaluate Product & Analytics pillar."""
pillar = "Product & Analytics"
results = []
# L5: error_to_insight_pipeline
        # Search CI workflows for Sentry wiring or error->issue automation
        error_pipeline = self._search_files(
            ".github/workflows/*.yml", r"(sentry|create.*issue|error.*issue)"
        )
        workflows = list(self.repo_path.glob(".github/workflows/*.yml"))
# Also check for Sentry-GitHub integration
deps = (
(self._read_file("package.json") or "")
+ (self._read_file("requirements.txt") or "")
+ (self._read_file("go.mod") or "")
)
if "sentry" in deps.lower():
# Check for issue creation automation
for wf in workflows[:5]:
content = wf.read_text(errors="ignore")
if "issue" in content.lower() and "sentry" in content.lower():
error_pipeline = True
break
results.append(
self._make_result(
"error_to_insight_pipeline",
pillar,
5,
error_pipeline,
"Error-to-issue pipeline exists"
if error_pipeline
else "No error-to-issue pipeline",
)
)
# L5: product_analytics_instrumentation
analytics = False
if any(
x in deps.lower()
for x in [
"mixpanel",
"amplitude",
"posthog",
"heap",
"segment",
"ga4",
"google-analytics",
]
):
analytics = True
results.append(
self._make_result(
"product_analytics_instrumentation",
pillar,
5,
analytics,
"Product analytics configured" if analytics else "No product analytics",
)
)
return results
def _calculate_levels(self):
"""Calculate maturity level based on criteria pass rates."""
level_criteria: dict[int, list[CriterionResult]] = {i: [] for i in range(1, 6)}
for pillar in self.result.pillars.values():
for criterion in pillar.criteria:
if criterion.status != CriterionStatus.SKIP:
level_criteria[criterion.level].append(criterion)
for level in range(1, 6):
criteria = level_criteria[level]
if not criteria:
self.result.level_scores[level] = 100.0
continue
passed = sum(1 for c in criteria if c.status == CriterionStatus.PASS)
self.result.level_scores[level] = round((passed / len(criteria)) * 100, 1)
achieved = 0
for level in range(1, 6):
if self.result.level_scores[level] >= 80:
achieved = level
else:
break
# Only set achieved level if at least L1 is passed
self.result.achieved_level = achieved if achieved > 0 else 0
def main():
parser = argparse.ArgumentParser(
description="Analyze repository for agent readiness"
)
parser.add_argument(
"--repo-path", "-r", default=".", help="Path to the repository to analyze"
)
parser.add_argument(
"--output",
"-o",
default="/tmp/readiness_analysis.json",
help="Output file for analysis results",
)
parser.add_argument(
"--quiet", "-q", action="store_true", help="Suppress progress output"
)
args = parser.parse_args()
if not args.quiet:
print(f"🔍 Analyzing repository: {args.repo_path}")
analyzer = RepoAnalyzer(args.repo_path)
result = analyzer.analyze()
output = {
"repo_path": result.repo_path,
"repo_name": result.repo_name,
"repo_type": result.repo_type,
"languages": result.languages,
"pass_rate": result.pass_rate,
"total_passed": result.total_passed,
"total_criteria": result.total_criteria,
"achieved_level": result.achieved_level,
"level_scores": result.level_scores,
"pillars": {},
}
for pillar_name, pillar in result.pillars.items():
output["pillars"][pillar_name] = {
"name": pillar.name,
"passed": pillar.passed,
"total": pillar.total,
"percentage": pillar.percentage,
"criteria": [asdict(c) for c in pillar.criteria],
}
output_path = Path(args.output)
output_path.parent.mkdir(parents=True, exist_ok=True)
output_path.write_text(json.dumps(output, indent=2))
if not args.quiet:
print(
f"✅ Analysis complete: {result.total_passed}/{result.total_criteria} criteria passed ({result.pass_rate}%)"
)
print(f"📊 Achieved Level: L{result.achieved_level}")
print(f"📄 Results saved to: {args.output}")
return result
if __name__ == "__main__":
main()
```
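The gating rule in `_calculate_levels` is strictly sequential: a level counts only if its own pass rate is at least 80% and every lower level also cleared the bar. A standalone sketch of that rule (the function name `achieved_level` is illustrative, not part of the script):

```python
def achieved_level(level_scores: dict[int, float], threshold: float = 80.0) -> int:
    """Return the highest level N such that levels 1..N all score >= threshold."""
    achieved = 0
    for level in range(1, 6):
        if level_scores.get(level, 0.0) >= threshold:
            achieved = level
        else:
            break  # a gap at any level blocks all higher levels
    return achieved
```

A repo scoring L1 = 100%, L2 = 75%, L3 = 90% therefore reports L1: the L2 gap blocks L3 no matter how high L3 scores.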
### scripts/generate_report.py
```python
#!/usr/bin/env python3
"""Report Generator for Agent Readiness.
Generates formatted markdown reports from analysis JSON.
Usage:
python generate_report.py --analysis-file /tmp/readiness_analysis.json
python generate_report.py --analysis-file /tmp/readiness_analysis.json --format markdown
"""
import argparse
import json
from pathlib import Path
def format_level_bar(level_scores: dict, achieved: int) -> str:
    """Generate a visual level progress bar."""
    bars = []
    for level in range(1, 6):
        score = level_scores.get(str(level), 0)
        indicator = ("█" if level <= achieved else "░") * 4
        bars.append(f"{indicator} L{level} {score:.0f}%")
    return " | ".join(bars)
def format_criterion_row(criterion: dict) -> str:
"""Format a single criterion as a table row."""
status = criterion["status"]
crit_id = criterion["id"]
score = criterion["score"]
reason = criterion["reason"]
if status == "pass":
icon = "✓"
elif status == "fail":
icon = "✗"
else: # skip
icon = "—"
    return f"| {icon} | `{crit_id}` | {score} | {reason} |"
def get_top_strengths(data: dict, n: int = 3) -> list[tuple[str, float, list[str]]]:
"""Get top performing pillars with example passing criteria."""
pillar_scores = []
for pillar_name, pillar in data["pillars"].items():
if pillar["total"] > 0:
pct = pillar["percentage"]
passing = [c["id"] for c in pillar["criteria"] if c["status"] == "pass"][:3]
pillar_scores.append((pillar_name, pct, passing))
# Sort by percentage descending
pillar_scores.sort(key=lambda x: x[1], reverse=True)
return pillar_scores[:n]
def get_top_opportunities(data: dict, n: int = 5) -> list[tuple[str, str, str]]:
"""Get highest priority improvement opportunities."""
opportunities = []
    # Collect every failing criterion with its level and pillar
for pillar_name, pillar in data["pillars"].items():
for criterion in pillar["criteria"]:
if criterion["status"] == "fail":
opportunities.append(
(
criterion["id"],
criterion["level"],
criterion["reason"],
pillar_name,
)
)
# Sort by level (ascending) to prioritize foundational issues
opportunities.sort(key=lambda x: x[1])
return [(o[0], o[2], o[3]) for o in opportunities[:n]]
def generate_markdown_report(data: dict) -> str: # noqa: C901
"""Generate a full markdown report from analysis data."""
repo_name = data["repo_name"]
pass_rate = data["pass_rate"]
achieved = data["achieved_level"]
total_passed = data["total_passed"]
total = data["total_criteria"]
languages = data.get("languages", ["Unknown"])
repo_type = data.get("repo_type", "application")
level_scores = data["level_scores"]
lines = []
# Header
lines.append(f"# Agent Readiness Report: {repo_name}")
lines.append("")
lines.append(f"**Languages**: {', '.join(languages)} ")
lines.append(f"**Repository Type**: {repo_type} ")
lines.append(f"**Pass Rate**: {pass_rate}% ({total_passed}/{total} criteria) ")
if achieved > 0:
lines.append(f"**Achieved Level**: **L{achieved}**")
else:
lines.append("**Achieved Level**: **Not yet L1** (need 80% at L1)")
lines.append("")
# Level Progress
lines.append("## Level Progress")
lines.append("")
lines.append("| Level | Score | Status |")
lines.append("|-------|-------|--------|")
for level in range(1, 6):
score = level_scores.get(str(level), 0)
if achieved > 0 and level <= achieved:
status = "✅ Achieved"
elif score >= 80:
status = "✅ Passed"
else:
status = f"⬜ {100 - score:.0f}% to go"
lines.append(f"| L{level} | {score:.0f}% | {status} |")
lines.append("")
# Summary
lines.append("## Summary")
lines.append("")
# Strengths
strengths = get_top_strengths(data)
if strengths:
lines.append("### Strengths")
lines.append("")
for pillar_name, pct, passing in strengths:
if passing:
passing_str = ", ".join(f"`{p}`" for p in passing)
lines.append(f"- **{pillar_name}** ({pct}%): {passing_str}")
else:
lines.append(f"- **{pillar_name}** ({pct}%)")
lines.append("")
# Opportunities
opportunities = get_top_opportunities(data)
if opportunities:
lines.append("### Priority Improvements")
lines.append("")
lines.append("| Criterion | Issue | Pillar |")
lines.append("|-----------|-------|--------|")
for crit_id, reason, pillar in opportunities:
lines.append(f"| `{crit_id}` | {reason} | {pillar} |")
lines.append("")
# Detailed Results
lines.append("## Detailed Results")
lines.append("")
for pillar_name, pillar in data["pillars"].items():
pct = pillar["percentage"]
passed = pillar["passed"]
total = pillar["total"]
lines.append(f"### {pillar_name}")
lines.append(f"**Score**: {passed}/{total} ({pct}%)")
lines.append("")
lines.append("| Status | Criterion | Score | Details |")
lines.append("|--------|-----------|-------|---------|")
for criterion in pillar["criteria"]:
status = criterion["status"]
if status == "pass":
icon = "✓"
elif status == "fail":
icon = "✗"
else:
icon = "—"
crit_id = criterion["id"]
score = criterion["score"]
reason = criterion["reason"]
lines.append(f"| {icon} | `{crit_id}` | {score} | {reason} |")
lines.append("")
# Recommendations
lines.append("## Recommended Next Steps")
lines.append("")
if achieved < 2:
lines.append("**Focus on L1/L2 Foundations:**")
lines.append("1. Add missing linter and formatter configurations")
lines.append("2. Document build and test commands in README")
lines.append("3. Set up pre-commit hooks for fast feedback")
lines.append("4. Create AGENTS.md with project context for AI agents")
elif achieved < 3:
lines.append("**Progress to L3 (Production Ready):**")
lines.append("1. Add integration/E2E tests")
lines.append("2. Set up test coverage thresholds")
lines.append("3. Configure devcontainer for reproducible environments")
lines.append("4. Add automated PR review tooling")
else:
lines.append("**Optimize for L4+:**")
lines.append("1. Implement complexity analysis and dead code detection")
lines.append("2. Set up flaky test detection and quarantine")
lines.append("3. Add security scanning (CodeQL, Snyk)")
lines.append("4. Configure deployment observability")
lines.append("")
lines.append("---")
lines.append("*Report generated from repository analysis*")
return "\n".join(lines)
def generate_brief_report(data: dict) -> str:
"""Generate a brief summary report."""
repo_name = data["repo_name"]
pass_rate = data["pass_rate"]
achieved = data["achieved_level"]
total_passed = data["total_passed"]
total = data["total_criteria"]
lines = []
lines.append(f"## Agent Readiness: {repo_name}")
lines.append("")
level_str = f"Level {achieved}" if achieved > 0 else "Not yet L1"
lines.append(f"**{level_str}** | {pass_rate}% ({total_passed}/{total})")
lines.append("")
# Quick level summary
for level in range(1, 6):
score = data["level_scores"].get(str(level), 0)
bar = "█" * int(score / 10) + "░" * (10 - int(score / 10))
check = "✅" if achieved > 0 and level <= achieved else "⬜"
lines.append(f"L{level} {check} [{bar}] {score:.0f}%")
lines.append("")
# Top opportunities
opps = get_top_opportunities(data, 3)
if opps:
lines.append("**Quick Wins:**")
for crit_id, reason, _ in opps:
lines.append(f"- {crit_id}: {reason}")
return "\n".join(lines)
def main():
parser = argparse.ArgumentParser(
description="Generate Agent Readiness report from analysis"
)
parser.add_argument(
"--analysis-file",
"-a",
default="/tmp/readiness_analysis.json",
help="Path to analysis JSON file",
)
parser.add_argument("--output", "-o", help="Output file (default: stdout)")
parser.add_argument(
"--format",
"-f",
choices=["markdown", "brief", "json"],
default="markdown",
help="Output format",
)
args = parser.parse_args()
# Load analysis
analysis_path = Path(args.analysis_file)
if not analysis_path.exists():
print(f"❌ Analysis file not found: {args.analysis_file}")
print("Run analyze_repo.py first to generate the analysis.")
return 1
data = json.loads(analysis_path.read_text())
# Validate required fields
required_fields = [
"repo_name",
"pass_rate",
"achieved_level",
"total_passed",
"total_criteria",
"level_scores",
"pillars",
]
missing = [f for f in required_fields if f not in data]
if missing:
print(f"❌ Analysis data missing required fields: {', '.join(missing)}")
print("Re-run analyze_repo.py to generate a valid analysis.")
return 1
# Normalize level_scores keys to strings for consistent access
if "level_scores" in data:
data["level_scores"] = {str(k): v for k, v in data["level_scores"].items()}
# Generate report
if args.format == "markdown":
report = generate_markdown_report(data)
elif args.format == "brief":
report = generate_brief_report(data)
else: # json
report = json.dumps(data, indent=2)
# Output
if args.output:
Path(args.output).write_text(report)
print(f"✅ Report saved to: {args.output}")
else:
print(report)
return 0
if __name__ == "__main__":
exit(main())
```
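`generate_brief_report` renders each level score as a ten-segment bar by integer-truncating the score into tenths. The same rendering, extracted as a standalone helper (the name `score_bar` is illustrative):

```python
def score_bar(score: float, width: int = 10) -> str:
    """Render a 0-100 score as a fixed-width block bar (truncates per segment)."""
    filled = int(score / (100 / width))
    return "█" * filled + "░" * (width - filled)
```

Because the division truncates, 79% shows seven filled segments, the same as 70%; the bar only moves in 10% steps.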
### references/criteria.md
```markdown
# Agent Readiness Criteria
Complete reference of all 81 criteria evaluated across nine technical pillars.
## Criteria Format
Each criterion is binary (pass/fail) and includes:
- **ID**: Unique identifier (snake_case)
- **Level**: Maturity level (1-5) where criterion is evaluated
- **Detection**: How to check if the criterion is met
- **Impact**: What happens when this criterion fails
---
## 1. Style & Validation
Automated tools that catch bugs instantly. Without them, agents waste cycles on syntax errors and style drift.
| Criterion | Level | Detection | Impact |
|-----------|-------|-----------|--------|
| `formatter` | 1 | Prettier, Black, Ruff format, gofmt config exists | Agent submits code with formatting issues, waits for CI, fixes blindly |
| `lint_config` | 1 | ESLint, Ruff, golangci-lint, or language-specific linter configured | Style inconsistencies accumulate, harder to review |
| `type_check` | 1 | TypeScript strict, mypy, Go (static by default) | Type errors caught late in CI instead of immediately |
| `strict_typing` | 2 | TypeScript strict:true, mypy strict=true | Partial typing allows bugs to slip through |
| `pre_commit_hooks` | 2 | .pre-commit-config.yaml or Husky config | Checks run in CI instead of locally, slower feedback |
| `naming_consistency` | 2 | Linter rules or documented conventions | Inconsistent names make codebase harder to navigate |
| `large_file_detection` | 2 | Git LFS, pre-commit check-added-large-files | Large files bloat repository, slow clones |
| `code_modularization` | 3 | import-linter, Nx boundaries, Bazel modules | Architecture degrades over time |
| `cyclomatic_complexity` | 3 | gocyclo, lizard, radon, SonarQube | Complex functions harder to understand and modify |
| `dead_code_detection` | 3 | vulture, knip, deadcode in CI | Unused code clutters codebase |
| `duplicate_code_detection` | 3 | jscpd, PMD CPD, SonarQube | Duplicated code increases maintenance burden |
| `tech_debt_tracking` | 4 | TODO scanner, SonarQube, linter TODO rules | Tech debt accumulates without visibility |
| `n_plus_one_detection` | 4 | nplusone, bullet gem, query analyzer | Performance issues in database queries |
---
## 2. Build System
Fast, reliable builds enable rapid iteration. Slow CI kills agent productivity.
| Criterion | Level | Detection | Impact |
|-----------|-------|-----------|--------|
| `build_cmd_doc` | 1 | Build commands documented in README/AGENTS.md | Agent doesn't know how to build |
| `deps_pinned` | 1 | Lockfile exists (package-lock.json, uv.lock, go.sum) | Non-reproducible builds |
| `vcs_cli_tools` | 1 | gh/glab CLI authenticated | Can't interact with PRs/issues |
| `fast_ci_feedback` | 2 | CI completes in <10 minutes | Agent waits too long for feedback |
| `single_command_setup` | 2 | One command to set up dev environment | Agent can't bootstrap quickly |
| `release_automation` | 2 | Automated release workflow exists | Manual releases slow deployment |
| `deployment_frequency` | 2 | Regular releases (weekly+) | Infrequent deploys signal process issues |
| `release_notes_automation` | 3 | Auto-generated changelogs/release notes | Manual release notes are error-prone |
| `agentic_development` | 3 | AI agent commits visible in history | No prior agent integration |
| `automated_pr_review` | 3 | Danger.js, automated review bots | Reviews require human intervention |
| `feature_flag_infrastructure` | 3 | LaunchDarkly, Statsig, Unleash, custom system | Hard to ship incrementally |
| `build_performance_tracking` | 4 | Build caching, timing metrics | Build times creep up unnoticed |
| `heavy_dependency_detection` | 4 | Bundle size analysis (webpack-bundle-analyzer) | Bundle bloat goes unnoticed |
| `unused_dependencies_detection` | 4 | depcheck, deptry in CI | Bloated dependency tree |
| `dead_feature_flag_detection` | 4 | Stale flag detection tooling | Abandoned flags clutter code |
| `monorepo_tooling` | 4 | Nx, Turborepo, Bazel for monorepos | Cross-package changes are error-prone |
| `version_drift_detection` | 4 | Version consistency checks | Packages diverge silently |
| `progressive_rollout` | 5 | Canary deploys, gradual rollouts | All-or-nothing deployments are risky |
| `rollback_automation` | 5 | One-click rollback capability | Slow recovery from bad deploys |
---
## 3. Testing
Tests verify that changes work. Without them, agents can't validate their own work.
| Criterion | Level | Detection | Impact |
|-----------|-------|-----------|--------|
| `unit_tests_exist` | 1 | Test files present (*_test.*, *.spec.*) | No way to verify basic correctness |
| `unit_tests_runnable` | 1 | Test command documented and works | Agent can't run tests |
| `test_naming_conventions` | 2 | Consistent test file naming | Tests hard to find |
| `test_isolation` | 2 | Tests can run in parallel | Slow test runs |
| `integration_tests_exist` | 3 | E2E/integration test directory | Only unit-level coverage |
| `test_coverage_thresholds` | 3 | Coverage enforcement in CI | Coverage drifts down |
| `flaky_test_detection` | 4 | Test retry, quarantine, or tracking | Flaky tests erode trust |
| `test_performance_tracking` | 4 | Test timing metrics | Slow tests accumulate |
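The detection patterns in the table above (`*_test.*`, `*.spec.*`) can be checked with a simple recursive glob scan. This is an illustrative sketch only; the pattern list and the function name `unit_tests_exist` are assumptions, not the skill's actual analysis script.

```python
from pathlib import Path

# Glob patterns mirroring the detection column above; the real analysis
# script may use a different or larger set.
TEST_PATTERNS = ["*_test.*", "*.spec.*", "test_*.py", "*.test.ts"]

def unit_tests_exist(repo_root: str) -> bool:
    """Pass the `unit_tests_exist` criterion if any test-like file is found."""
    root = Path(repo_root)
    # next(..., None) stops at the first match, so large repos scan cheaply.
    return any(next(root.rglob(p), None) is not None for p in TEST_PATTERNS)
```

The same file-existence approach generalizes to most Level 1 and 2 criteria (`readme`, `env_template`, `devcontainer`), which reduce to "does a known path exist".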
---
## 4. Documentation
Documentation tells the agent what it needs to know. Missing docs mean wasted exploration.
| Criterion | Level | Detection | Impact |
|-----------|-------|-----------|--------|
| `readme` | 1 | README.md exists with setup instructions | Agent doesn't know project basics |
| `agents_md` | 2 | AGENTS.md or CLAUDE.md exists | Agent lacks operational guidance |
| `documentation_freshness` | 2 | Key docs updated in last 180 days | Stale docs mislead agent |
| `api_schema_docs` | 3 | OpenAPI spec, GraphQL schema, or API docs | Agent must reverse-engineer APIs |
| `automated_doc_generation` | 3 | Doc generation in CI | Docs drift from code |
| `service_flow_documented` | 3 | Architecture diagrams (mermaid, PlantUML) | Agent lacks system context |
| `skills` | 3 | Skills directory (.claude/skills/, .factory/skills/) | No specialized agent instructions |
| `agents_md_validation` | 4 | CI validates AGENTS.md commands work | AGENTS.md becomes stale |
---
## 5. Dev Environment
Reproducible environments prevent "works on my machine" issues.
| Criterion | Level | Detection | Impact |
|-----------|-------|-----------|--------|
| `env_template` | 2 | .env.example or documented env vars | Agent guesses at configuration |
| `devcontainer` | 3 | .devcontainer/devcontainer.json exists | Environment setup is manual |
| `devcontainer_runnable` | 3 | Devcontainer builds and works | Devcontainer is broken |
| `database_schema` | 3 | Schema files or migration directory | Database structure undocumented |
| `local_services_setup` | 3 | docker-compose.yml for dependencies | External services need manual setup |
---
## 6. Debugging & Observability
When things go wrong, observability helps diagnose issues quickly.
| Criterion | Level | Detection | Impact |
|-----------|-------|-----------|--------|
| `structured_logging` | 2 | Logging library with structured output | Logs hard to parse |
| `code_quality_metrics` | 2 | Coverage reporting in CI | No visibility into code quality |
| `error_tracking_contextualized` | 3 | Sentry, Bugsnag with context | Errors lack context for debugging |
| `distributed_tracing` | 3 | OpenTelemetry, trace IDs | Can't trace requests across services |
| `metrics_collection` | 3 | Prometheus, Datadog, or custom metrics | No runtime visibility |
| `health_checks` | 3 | Health/readiness endpoints | Can't verify service status |
| `profiling_instrumentation` | 4 | CPU/memory profiling tools | Performance issues hard to diagnose |
| `alerting_configured` | 4 | PagerDuty, OpsGenie, alert rules | Issues discovered late |
| `deployment_observability` | 4 | Deploy tracking, dashboards | Can't correlate issues to deploys |
| `runbooks_documented` | 4 | Runbooks directory or linked docs | No guidance for incident response |
| `circuit_breakers` | 5 | Resilience patterns implemented | Cascading failures |
---
## 7. Security
Security criteria protect the codebase and data.
| Criterion | Level | Detection | Impact |
|-----------|-------|-----------|--------|
| `gitignore_comprehensive` | 1 | .gitignore excludes secrets, build artifacts | Sensitive files committed |
| `secrets_management` | 2 | GitHub secrets, vault, cloud secrets | Hardcoded secrets |
| `codeowners` | 2 | CODEOWNERS file exists | No clear ownership |
| `branch_protection` | 2 | Protected main branch | Unreviewed changes to main |
| `dependency_update_automation` | 3 | Dependabot, Renovate configured | Dependencies go stale |
| `log_scrubbing` | 3 | Log sanitization for PII | Sensitive data in logs |
| `pii_handling` | 3 | PII redaction mechanisms | PII exposure risk |
| `automated_security_review` | 4 | CodeQL, Snyk, SonarQube in CI | Security issues caught late |
| `secret_scanning` | 4 | GitHub secret scanning enabled | Leaked credentials |
| `dast_scanning` | 5 | Dynamic security testing | Runtime vulnerabilities missed |
| `privacy_compliance` | 5 | GDPR/privacy tooling | Compliance gaps |
---
## 8. Task Discovery
Structured task management helps agents find and understand work.
| Criterion | Level | Detection | Impact |
|-----------|-------|-----------|--------|
| `issue_templates` | 2 | .github/ISSUE_TEMPLATE/ exists | Inconsistent issue quality |
| `issue_labeling_system` | 2 | Consistent labels on issues | Issues hard to categorize |
| `pr_templates` | 2 | pull_request_template.md exists | PRs lack context |
| `backlog_health` | 3 | Issues have descriptive titles, labels | Unclear what to work on |
---
## 9. Product & Analytics
Connect errors to insights and understand user behavior.
| Criterion | Level | Detection | Impact |
|-----------|-------|-----------|--------|
| `error_to_insight_pipeline` | 5 | Sentry-GitHub issue creation automation | Errors don't become actionable issues |
| `product_analytics_instrumentation` | 5 | Mixpanel, Amplitude, PostHog, Heap | No user behavior data to inform decisions |
---
## Skipped Criteria
Some criteria are skipped based on repository type:
- **Libraries**: Skip deployment-related criteria (progressive_rollout, health_checks)
- **CLI tools**: Skip web-specific criteria (dast_scanning)
- **Database projects**: Skip N+1 detection (they ARE the database)
- **Single apps**: Skip monorepo tooling criteria
The analysis script automatically determines which criteria to skip based on detected repository characteristics.
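The skip rules above amount to subtracting a per-type exclusion set from the applicable criteria. A hedged sketch of that logic; the `SKIP_RULES` mapping and function name are hypothetical, mirroring the bullets rather than the real script.

```python
# Hypothetical mapping from detected repo type to skipped criteria,
# following the bullet list above.
SKIP_RULES = {
    "library": {"progressive_rollout", "health_checks"},
    "cli": {"dast_scanning"},
    "database": {"n_plus_one_detection"},
    "single_app": {"monorepo_tooling"},
}

def applicable_criteria(all_criteria: set[str], repo_types: set[str]) -> set[str]:
    """Remove criteria that do not apply to the detected repo types."""
    skipped = set().union(*(SKIP_RULES.get(t, set()) for t in repo_types))
    return all_criteria - skipped
```

Because skipped criteria are excluded before scoring, a library is not penalized for lacking `health_checks` it could never meaningfully implement.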
```
### references/maturity-levels.md
```markdown
# Maturity Levels
Repositories progress through five levels. Each level represents a qualitative shift in what autonomous agents can accomplish.
## Level Progression Rules
1. **80% Threshold**: To unlock a level, pass ≥80% of criteria at that level
2. **Cumulative**: Must also pass all previous levels at 80%+
3. **Binary Criteria**: Each criterion is pass/fail (no partial credit)
4. **Skipped Criteria**: Not applicable criteria are excluded from the calculation
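The four rules above can be sketched as a single cumulative scan: a level is achieved only if it and every level below it each pass at least 80% of their applicable criteria. The function name and input shape are illustrative assumptions.

```python
THRESHOLD = 0.8  # the 80% unlock threshold from rule 1

def achieved_level(pass_rates: dict[int, float]) -> int:
    """pass_rates maps level (1-5) to the fraction of criteria passed."""
    level = 0
    for l in range(1, 6):
        if pass_rates.get(l, 0.0) >= THRESHOLD:
            level = l
        else:
            break  # cumulative rule: one failed level blocks all higher ones
    return level
```

Note the asymmetry this encodes: a repository passing 100% of Level 2 but only 70% of Level 1 is still at Level 0, because lower levels gate the higher ones.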
---
## Level 1: Initial
**Status**: Basic version control
**Agent Capability**: Manual assistance only
### Requirements
The bare minimum for any collaborative development.
| Criterion | Pillar |
|-----------|--------|
| `formatter` | Style & Validation |
| `lint_config` | Style & Validation |
| `type_check` | Style & Validation |
| `build_cmd_doc` | Build System |
| `deps_pinned` | Build System |
| `vcs_cli_tools` | Build System |
| `unit_tests_exist` | Testing |
| `unit_tests_runnable` | Testing |
| `readme` | Documentation |
| `gitignore_comprehensive` | Security |
### What Agents Can Do
- Read and understand code
- Make simple edits with manual verification
- Cannot reliably validate their own changes
### What's Missing
Without L1, agents operate blind: there is no way to verify builds, run tests, or ensure code quality.
---
## Level 2: Managed
**Status**: Basic CI/CD and testing
**Agent Capability**: Simple, well-defined tasks
### Requirements
Foundational automation and documentation.
| Criterion | Pillar |
|-----------|--------|
| `strict_typing` | Style & Validation |
| `pre_commit_hooks` | Style & Validation |
| `naming_consistency` | Style & Validation |
| `large_file_detection` | Style & Validation |
| `fast_ci_feedback` | Build System |
| `single_command_setup` | Build System |
| `release_automation` | Build System |
| `deployment_frequency` | Build System |
| `test_naming_conventions` | Testing |
| `test_isolation` | Testing |
| `agents_md` | Documentation |
| `documentation_freshness` | Documentation |
| `env_template` | Dev Environment |
| `structured_logging` | Observability |
| `code_quality_metrics` | Observability |
| `secrets_management` | Security |
| `codeowners` | Security |
| `branch_protection` | Security |
| `issue_templates` | Task Discovery |
| `issue_labeling_system` | Task Discovery |
| `pr_templates` | Task Discovery |
### What Agents Can Do
- Fix simple, well-scoped bugs
- Make straightforward changes with fast feedback
- Run tests and verify basic correctness
### What's Missing
Without L2, feedback loops are too slow. Agents wait minutes for CI instead of seconds for local checks.
---
## Level 3: Standardized
**Status**: Production-ready for agents
**Agent Capability**: Routine maintenance
### Requirements
Clear processes defined and enforced. This is the target for most teams.
| Criterion | Pillar |
|-----------|--------|
| `code_modularization` | Style & Validation |
| `cyclomatic_complexity` | Style & Validation |
| `dead_code_detection` | Style & Validation |
| `duplicate_code_detection` | Style & Validation |
| `release_notes_automation` | Build System |
| `agentic_development` | Build System |
| `automated_pr_review` | Build System |
| `feature_flag_infrastructure` | Build System |
| `integration_tests_exist` | Testing |
| `test_coverage_thresholds` | Testing |
| `api_schema_docs` | Documentation |
| `automated_doc_generation` | Documentation |
| `service_flow_documented` | Documentation |
| `skills` | Documentation |
| `devcontainer` | Dev Environment |
| `devcontainer_runnable` | Dev Environment |
| `database_schema` | Dev Environment |
| `local_services_setup` | Dev Environment |
| `error_tracking_contextualized` | Observability |
| `distributed_tracing` | Observability |
| `metrics_collection` | Observability |
| `health_checks` | Observability |
| `dependency_update_automation` | Security |
| `log_scrubbing` | Security |
| `pii_handling` | Security |
| `backlog_health` | Task Discovery |
### What Agents Can Do
- Bug fixes and routine maintenance
- Test additions and documentation updates
- Dependency upgrades with confidence
- Feature work with clear specifications
### Example Repositories at L3
- FastAPI

- GitHub CLI
- pytest
---
## Level 4: Measured
**Status**: Comprehensive automation
**Agent Capability**: Complex features
### Requirements
Advanced tooling and metrics for optimization.
| Criterion | Pillar |
|-----------|--------|
| `tech_debt_tracking` | Style & Validation |
| `n_plus_one_detection` | Style & Validation |
| `build_performance_tracking` | Build System |
| `unused_dependencies_detection` | Build System |
| `dead_feature_flag_detection` | Build System |
| `monorepo_tooling` | Build System |
| `version_drift_detection` | Build System |
| `flaky_test_detection` | Testing |
| `test_performance_tracking` | Testing |
| `agents_md_validation` | Documentation |
| `profiling_instrumentation` | Observability |
| `alerting_configured` | Observability |
| `deployment_observability` | Observability |
| `runbooks_documented` | Observability |
| `automated_security_review` | Security |
| `secret_scanning` | Security |
### What Agents Can Do
- Complex multi-file refactors
- Performance optimization with data
- Architecture improvements
- Security hardening
### Example Repositories at L4
- CockroachDB
- Temporal
---
## Level 5: Optimized
**Status**: Full autonomous capability
**Agent Capability**: End-to-end development
### Requirements
Comprehensive observability, security, and automation.
| Criterion | Pillar |
|-----------|--------|
| `progressive_rollout` | Build System |
| `rollback_automation` | Build System |
| `circuit_breakers` | Observability |
| `dast_scanning` | Security |
| `privacy_compliance` | Security |
| `error_to_insight_pipeline` | Product & Analytics |
| `product_analytics_instrumentation` | Product & Analytics |
### What Agents Can Do
- Full feature development with minimal oversight
- Incident response and remediation
- Autonomous triage and prioritization
- Continuous improvement of the codebase itself
### What L5 Looks Like
Very few repositories achieve L5. It requires mature DevOps, comprehensive observability, and sophisticated automation.
---
## Calculating Scores
### Repository Score
```
Pass Rate = (Passed Criteria) / (Total Applicable Criteria) × 100%
```
### Level Determination
```
For each level L (1 to 5):
If (pass rate at L ≥ 80%) AND (all lower levels ≥ 80%):
Repository is at level L
```
### Organization Score
```
Org Level = floor(average of all repository levels)
Key Metric = % of active repos at L3+
```
---
## Priority Order for Improvement
1. **Fix L1 failures first**: These block basic operation
2. **Then L2**: Enable fast feedback loops
3. **Aim for L3**: Production-ready target
4. **L4+ is optimization**: Nice to have, not essential
### Quick Wins by Level
**L1 → L2**:
- Add pre-commit hooks
- Document setup commands
- Add AGENTS.md
**L2 → L3**:
- Add integration tests
- Set up devcontainer
- Configure automated reviews
**L3 → L4**:
- Add complexity analysis
- Set up flaky test detection
- Enable security scanning
```