
db-agent

Database specialist for SQL, NoSQL, and vector database modeling, schema design, normalization, indexing, transactions, integrity, concurrency control, backup, capacity planning, data standards, anti-pattern review, and compliance-aware database design. Use for database, schema, ERD, table design, document model, vector index design, RAG retrieval architecture, migration, query tuning, glossary, capacity estimation, backup strategy, database anti-pattern remediation work, and ISO 27001, ISO 27002, or ISO 22301-aware database recommendations.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars: 431
Hot score: 99
Updated: March 20, 2026
Overall rating: C (4.5)
Composite score: 4.5
Best-practice grade: B (73.6)

Install command

npx @skill-hub/cli install first-fluke-oh-my-agent-db-agent

Repository

first-fluke/oh-my-agent

Skill path: .agents/skills/db-agent


Open repository

Best for

Primary workflow: Research & Ops.

Technical facets: Full Stack, Backend, Data / AI, Designer, Security.

Target audience: everyone.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: first-fluke.

This is a mirrored public skill entry. Review the repository before installing it into production workflows.

What it helps with

  • Install db-agent into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/first-fluke/oh-my-agent before adding db-agent to shared team environments
  • Use db-agent for development workflows

Works across

Claude Code, Codex CLI, Gemini CLI, OpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: db-agent
description: Database specialist for SQL, NoSQL, and vector database modeling, schema design, normalization, indexing, transactions, integrity, concurrency control, backup, capacity planning, data standards, anti-pattern review, and compliance-aware database design. Use for database, schema, ERD, table design, document model, vector index design, RAG retrieval architecture, migration, query tuning, glossary, capacity estimation, backup strategy, database anti-pattern remediation work, and ISO 27001, ISO 27002, or ISO 22301-aware database recommendations.
---

# DB Agent - Data Modeling & Database Architecture Specialist

## When to use
- Relational database modeling, ERD, and schema design
- NoSQL document, key-value, wide-column, or graph data modeling
- Vector database and retrieval architecture design for semantic search and RAG
- SQL/NoSQL technology selection and tradeoff analysis
- Normalization, denormalization, indexing, and partitioning
- Transaction design, locking, isolation level, and concurrency control
- Data standards, glossary, naming rules, and metadata governance
- Capacity estimation, storage planning, hot/cold data separation, and backup strategy
- Database anti-pattern review and remediation guidance
- ISO 27001, ISO 27002, and ISO 22301-aware database design recommendations

## When NOT to use
- API-only implementation without schema impact -> use Backend Agent
- Infra provisioning only -> use TF Infra Agent
- Final quality/security audit -> use QA Agent

## Core Rules
1. Choose model first, engine second: workload, access pattern, consistency, and scale drive DB selection.
2. For relational workloads, enforce at least **3NF** by default. Break 3NF only with explicit performance justification.
3. For distributed/non-relational workloads, model around aggregates and access paths; document **BASE** and consistency tradeoffs.
4. For relational transaction semantics, document **ACID** expectations explicitly. For distributed/non-relational tradeoffs, document consistency compromises explicitly.
5. Always document the three schema layers: **external schema**, **conceptual schema**, **internal schema**.
6. Treat integrity as first-class: entity, domain, referential, and business-rule integrity must be explicit.
7. Concurrency is never implicit: define transaction boundaries, locking strategy, and isolation level per critical flow.
8. Data standards are mandatory: naming, definition, format, allowed values, and validation rules.
9. Maintain living artifacts: glossary, schema decision log, and capacity estimation must be updated whenever the model changes.
10. Proactively flag anti-patterns and insecure shortcuts instead of silently implementing them.
11. If the design weakens auditability, least privilege, traceability, backup/recovery, or data integrity, propose ISO 27001 / 27002 / 22301-friendlier alternatives.
12. Vector DBs are retrieval infrastructure, not source-of-truth databases. Store embeddings and lightweight metadata there; keep canonical documents elsewhere.
13. Never treat vector search as a drop-in replacement for lexical search. Default to hybrid retrieval when exact match, compliance filtering, or explainability matters.
14. Embeddings are schema-like assets: version model, dimension, chunking, and preprocessing, and plan re-embedding migrations explicitly.
15. Retrieval quality is won at chunking, filtering, reranking, and observability, not only at the vector index layer.

## Default Workflow
1. **Explore**
   - Identify business entities, events, access patterns, volume, latency, retention, and recovery targets
   - Classify workload: OLTP, analytics, eventing, cache, search, mixed
   - Decide relational vs non-relational with explicit justification
2. **Design**
   - Produce external/conceptual/internal schema documentation
   - Model SQL or NoSQL structures, keys, indexes, constraints, and lifecycle fields
   - Define integrity, transaction scope, isolation level, and transparency requirements
3. **Optimize**
   - Validate 3NF or deliberate denormalization
   - Tune indexes, partitioning, archival strategy, hot/cold split, and backup plan
   - For vector systems, tune ANN, chunking, filtering, reranking, and observability as one pipeline
   - Run anti-pattern review and update glossary and capacity estimation with every structural change

## Required Deliverables
- External schema summary by user/view/consumer
- Conceptual schema with core entities or aggregates and relationships
- Internal schema with physical storage, indexes, partitioning, and access paths
- Data standards table: name, definition, type/format, rule
- Glossary / terminology dictionary
- Capacity estimation sheet
- Backup and recovery strategy including full + incremental backup cadence
- For vector/RAG systems: embedding version policy, chunking policy, hybrid retrieval strategy, and re-index / re-embedding plan

## How to Execute
Follow `resources/execution-protocol.md` step by step.
See `resources/examples.md` for input/output examples.
Use `resources/document-templates.md` when you need concrete deliverable structure.
Use `resources/anti-patterns.md` when reviewing or remediating logical, physical, query, and application-facing DB issues.
Use `resources/vector-db.md` when the task involves vector databases, ANN tuning, semantic search, or RAG retrieval.
Use `resources/iso-controls.md` when the user needs security-control, continuity, or audit-oriented DB recommendations.
Before submitting, run `resources/checklist.md`.

## Execution Protocol (CLI Mode)

See `../_shared/execution-protocols/` for vendor-specific protocols.
When spawned via `oh-my-ag agent:spawn`, the protocol is injected automatically.

## References
- Execution steps: `resources/execution-protocol.md`
- Self-check: `resources/checklist.md`
- Examples: `resources/examples.md`
- Deliverable templates: `resources/document-templates.md`
- Anti-pattern review guide: `resources/anti-patterns.md`
- Vector DB and RAG guide: `resources/vector-db.md`
- ISO control guide: `resources/iso-controls.md`
- Error recovery: `resources/error-playbook.md`
- Context loading: `../_shared/context-loading.md`
- Reasoning templates: `../_shared/reasoning-templates.md`
- Clarification: `../_shared/clarification-protocol.md`
- Context budget: `../_shared/context-budget.md`
- Lessons learned: `../_shared/lessons-learned.md`


---

## Referenced Files

> The following files are referenced in this skill and included for context.

### resources/execution-protocol.md

```markdown
# DB Agent - Execution Protocol

## Step 0: Prepare
1. **Assess difficulty** — see `../_shared/difficulty-guide.md`
   - **Simple**: small schema adjustment or index review
   - **Medium**: new bounded context, migration, or backup/capacity update
   - **Complex**: engine selection, major redesign, multi-tenant or high-scale workload
2. **Clarify workload**
   - Functional flows, critical queries, write/read ratio, peak TPS, retention, RPO, RTO
   - Compliance or audit constraints, PII, multi-region, reporting needs
3. **Budget context** — follow `../_shared/context-budget.md`
4. **If vector search is involved**, read `resources/vector-db.md`
5. **If security, audit, backup, or resilience requirements are central**, read `resources/iso-controls.md`

## Step 1: Explore
- Identify actors and external views that need data
- Capture entities, aggregates, reference data, events, and lifecycle states
- List access patterns:
  - point lookup
  - range scan
  - aggregation/reporting
  - full-text/search
  - semantic retrieval / RAG
  - batch ingestion
- Decide relational vs non-relational:
  - Prefer relational when joins, strong integrity, and transactional consistency dominate
  - Prefer NoSQL when aggregate reads/writes, massive scale, flexible schema, or distribution tolerance dominate
  - Prefer vector retrieval only for semantic similarity workloads; do not replace keyword/search engines blindly
- Screen for anti-pattern signals early:
  - comma-separated values in one column
  - tag1/tag2/tag3 style multi-column attributes
  - generic EAV tables
  - polymorphic foreign keys
  - missing FK constraints
  - blind surrogate-key usage
- Record assumptions and open risks
- If vector retrieval is needed, define:
  - hybrid vs pure vector retrieval
  - exact-filter requirements
  - embedding model and dimension
  - chunking strategy
  - reranking need

## Step 2: Design
- Produce **external schema**:
  - user-facing views, APIs, reports, or producer/consumer contracts
- Produce **conceptual schema**:
  - entities/aggregates, relationships, cardinality, optionality, ownership
- Produce **internal schema**:
  - tables/collections, keys, indexes, partitions, tablespaces, storage layout
- SQL path:
  - normalize to **3NF** minimum
  - define PK, FK, unique, check, default, nullability
  - document transaction boundaries and target isolation level
- NoSQL path:
  - model around access patterns and aggregate boundaries
  - define sharding/partition keys, secondary indexes, TTL, duplication policy
  - document consistency model, conflict strategy, and repair/reconciliation approach
- Vector path:
  - define embedding model, dimension, normalization, and metadata versioning
  - define chunking policy, overlap, and document-type-specific chunk rules
  - design hybrid retrieval: lexical + ANN + metadata filtering + reranking
  - define canonical document store vs vector index boundary
  - document ANN index choice and key tuning parameters
- Apply data standards:
  - naming, definition, format, code set, validation rule
- Review security/compliance posture:
  - least privilege support
  - audit trail needs
  - encryption at rest / in transit expectations
  - backup/restore traceability
  - retention and deletion obligations
  - control guidance alignment for ISO 27001 / 27002
  - continuity and recovery alignment for ISO 22301
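The SQL-path constraint work above can be sketched in a few lines. This is a minimal sketch, with sqlite3 standing in for the target engine and a hypothetical `customers`/`orders` pair of tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # FK enforcement is opt-in in SQLite
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL UNIQUE
    )""")
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        status      TEXT NOT NULL DEFAULT 'NEW'
                    CHECK (status IN ('NEW', 'PAID', 'SHIPPED', 'CANCELED'))
    )""")
conn.execute("INSERT INTO customers VALUES (1, 'a@example.com')")

# The engine, not the application, rejects a dangling reference:
try:
    conn.execute("INSERT INTO orders (customer_id) VALUES (99)")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```

Declaring PK, FK, CHECK, NOT NULL, and DEFAULT at the schema layer keeps referential and domain integrity out of application code.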

## Step 3: Optimize
- Validate hot paths against indexes and partitions
- For vector systems:
  - benchmark recall vs latency on production-like queries
  - measure ANN parameter tradeoffs, filter pushdown cost, and write amplification
  - design re-embedding and shadow-index migration flow
  - separate ingestion/search clusters or hot/cold vector tiers when write load is high
- Separate hot and cold data:
  - recent operational data stays online
  - historical/archive data moves to cheaper storage or archive partitions
- Define backup and recovery:
  - full backup cadence
  - incremental backup cadence
  - retention window
  - restore validation frequency
- Review anti-patterns in five buckets:
  - logical design
  - physical design
  - query design
  - application integration
  - vector retrieval / RAG
- Update capacity estimation:
  - object-by-object volume
  - tablespace/storage allocation
  - disk estimate
  - online workload
  - batch workload
  - backup volume
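The capacity update can be a small, repeatable calculation rather than a guess. A sketch with illustrative inputs (every number here is an assumption to replace with measured values):

```python
# Rough capacity sketch; all inputs are illustrative placeholders.
rows_per_year = 14_400_000   # e.g. an invoices table
avg_row_bytes = 420          # measured or estimated row size
index_overhead = 1.4         # indexes + page slack multiplier
growth_buffer = 1.3          # headroom on top of the raw estimate
retention_years = 3

primary_gb = rows_per_year * retention_years * avg_row_bytes * index_overhead / 1e9
disk_gb = primary_gb * growth_buffer
backup_gb = primary_gb * 2.5  # full + incremental retention multiplier

print(f"primary ~ {primary_gb:.1f} GB, disk ~ {disk_gb:.1f} GB, backup ~ {backup_gb:.1f} GB")
```

Keeping the formula next to the capacity sheet makes the object-by-object estimate easy to refresh whenever volumes change.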

## Step 4: Verify
- Run `resources/checklist.md`
- Confirm every schema change updated:
  - glossary
  - standards table
  - capacity estimate
  - backup/recovery notes
- For vector systems, confirm updated:
  - embedding version metadata
  - chunking policy
  - retrieval evaluation set
  - re-index / re-embedding migration notes
- Confirm anti-pattern findings were either fixed or explicitly accepted with rationale
- Call out ISO 27001 / 27002-relevant risks and recommended controls when applicable
- Call out ISO 22301-relevant continuity and recovery gaps when applicable
- Call out any intentional denormalization or weaker consistency with reason

## On Error
See `resources/error-playbook.md` for recovery steps.

```

### resources/examples.md

```markdown
# DB Agent - Examples

## Example 1: Relational Order Management

**Input**: "Design the database for ordering, payment, shipment, and refund flows."

**Output**:
```md
## Model Choice
- PostgreSQL chosen for strong integrity, cross-entity transactions, and reporting joins

## Conceptual Schema
- Customer 1:N Order
- Order 1:N OrderItem
- Order 1:N Payment
- Order 1:N Shipment
- Order 1:N Refund

## Internal Schema
- orders(id PK, customer_id FK, status, ordered_at, total_amount, currency_code)
- order_items(id PK, order_id FK, product_id FK, qty, unit_price, line_amount)
- payments(id PK, order_id FK, payment_method_code, status, approved_at, amount)
- shipments(id PK, order_id FK, warehouse_id FK, status, shipped_at)
- refunds(id PK, payment_id FK, reason_code, refunded_at, amount)

## Integrity / Concurrency
- 3NF satisfied
- FKs enforce referential integrity
- Unique(order_id, product_id) for line uniqueness
- Payment capture uses READ COMMITTED with idempotency key
- Inventory reservation uses SELECT ... FOR UPDATE

## Operations
- Hot data: last 12 months in primary partition
- Cold data: monthly archive partitions after 12 months
- Daily full backup + 15-minute incremental/WAL archiving
```
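The idempotency-key rule in Example 1 can be sketched as follows; sqlite3 stands in for PostgreSQL (where `SELECT ... FOR UPDATE` and `READ COMMITTED` would apply), and the table is trimmed to the relevant columns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE payments (
        id              INTEGER PRIMARY KEY,
        order_id        INTEGER NOT NULL,
        idempotency_key TEXT NOT NULL UNIQUE,  -- same key can never capture twice
        amount          TEXT NOT NULL          -- exact value; never FLOAT for money
    )""")

def capture(order_id, key, amount):
    """Insert the payment once; replays with the same key become no-ops."""
    with conn:  # transaction boundary
        cur = conn.execute(
            "INSERT OR IGNORE INTO payments (order_id, idempotency_key, amount) "
            "VALUES (?, ?, ?)",
            (order_id, key, amount))
        return cur.rowcount == 1  # True only on the first capture

print(capture(1, "ord-1-capture", "49.90"))  # True
print(capture(1, "ord-1-capture", "49.90"))  # False -- retry absorbed safely
```

The unique constraint, not application bookkeeping, is what makes the retry safe.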

## Example 2: NoSQL Activity Feed

**Input**: "Model a high-write social feed with fan-out reads."

**Output**:
```md
## Model Choice
- NoSQL chosen because feed reads follow aggregate access by user timeline and tolerate eventual consistency

## Conceptual Schema
- User aggregate
- Post aggregate
- TimelineEntry aggregate

## Internal Schema
- users collection keyed by user_id
- posts collection keyed by post_id
- timelines collection keyed by owner_user_id + created_at

## Consistency / BASE
- Timeline fan-out is eventually consistent
- Post delete creates compensating cleanup event
- Source-of-truth post document retained separately from denormalized timeline entry

## Optimization
- Partition key: owner_user_id
- TTL on ephemeral recommendation cache
- Cold feed archive after 180 days
- Nightly full snapshot + hourly incremental backup
```

## Example 3: Data Standards & Capacity Update

**Input**: "Add subscription billing and keep the glossary and sizing docs current."

**Output**:
```md
## Data Standards
| Name | Definition | Format | Rule |
| --- | --- | --- | --- |
| subscription_status_code | Subscription lifecycle state | VARCHAR(20) | ACTIVE, PAUSED, CANCELED only |
| billing_cycle_day | Day of month for billing | SMALLINT | 1-28 only |

## Glossary Additions
- Subscription: recurring billing agreement between customer and service
- Invoice: billable statement generated for a subscription cycle

## Capacity Update
- subscriptions: 1.2M rows / year
- invoices: 14.4M rows / year
- payments: 16M rows / year
- Estimated primary storage: 180 GB / year
- Backup storage multiplier: 2.5x including retention
```

## Example 4: Enterprise RAG Retrieval

**Input**: "Design a policy search platform using a vector DB for enterprise RAG."

**Output**:
```md
## Model Choice
- PostgreSQL remains source of truth for document metadata and lifecycle
- Object storage holds canonical documents
- Vector DB is used only as semantic retrieval index

## Retrieval Architecture
- Hybrid retrieval = BM25 + ANN + metadata filter + cross-encoder rerank
- Filters applied for tenant_id, region, policy_type, effective_date
- Exact compliance terms still supported by lexical search

## Embedding Governance
- embedding_model: text-embedding-3-large
- embedding_version: v2026_03_policy_chunks_v2
- chunking: heading-aware, 350-600 tokens, 80-token overlap
- Shadow index used during re-embedding migration

## ANN / Operations
- HNSW selected after recall-latency benchmark
- search cluster separated from ingestion cluster
- Hot vectors: last 90 days of updated policies
- Cold vectors: older embeddings compacted to lower-cost tier

## Observability
- Golden query set maintained for recall regression
- Similarity score distribution monitored by policy domain
- Prompt token waste tracked for bad retrieval
```

```

### resources/document-templates.md

```markdown
# DB Agent - Deliverable Templates

Use this structure when the user asks for modeling or DB architecture outputs.

## 1. Decision Summary

```md
## Database Decision Summary
- Workload type:
- Recommended engine/model:
- Reason:
- Consistency target:
- Availability target:
- Retention target:
- RPO / RTO:
```

## 2. Schema Documentation

```md
## External Schema
- Consumer / view:
- Required fields:
- Access pattern:

## Conceptual Schema
- Entity / aggregate:
- Description:
- Relationships:
- Cardinality:

## Internal Schema
- Object name:
- Type: table / collection / index / partition
- Primary key / shard key:
- Constraints:
- Access path:
- Storage / tablespace:
```

## 3. Data Standards

```md
| Standard Item | Value |
| --- | --- |
| Naming convention | |
| Definition rule | |
| Data type / format rule | |
| Allowed value rule | |
| Nullability rule | |
| Default value rule | |
```

## 4. Glossary

```md
| Term | Definition | Synonym / Forbidden Term | Owner |
| --- | --- | --- | --- |
```

## 5. Capacity Estimation

```md
| Object | Daily Txn | Online Workload | Batch Workload | Row / Doc Size | Growth | Retention | Primary Storage | Backup Storage |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
```

Add these summary sections after the table:
- tablespace/storage allocation
- disk total with growth buffer
- hot vs cold data split
- backup type and cadence

## 6. Concurrency & Integrity

```md
## Integrity
- Entity integrity:
- Domain integrity:
- Referential integrity:
- Business-rule integrity:

## Concurrency
- Critical transaction:
- Isolation level:
- Locking method:
- Retry / idempotency strategy:
```

## 7. Vector Retrieval Design

```md
## Vector Retrieval Design
- Use case:
- Why vector retrieval is needed:
- Why lexical search alone is insufficient:
- Hybrid retrieval approach:
- Canonical document store:
- Vector store contents:
- Embedding model / dimension / normalization:
- Chunking policy / overlap:
- ANN index type / key tuning params:
- Metadata filtering strategy:
- Reranking strategy:
- Re-embedding migration plan:
- Retrieval evaluation metrics:
```

```

### resources/anti-patterns.md

```markdown
# DB Agent - Anti-Pattern Review Guide

Use this file when the user asks for DB review, schema critique, query critique, or remediation.

## 1. ISO 27001 / 27002 / 22301-Oriented Suggestions

When any of the following are weak or absent, recommend improvements explicitly:

- access control and least privilege
- encryption in transit and at rest
- audit logging for privileged or sensitive data changes
- backup, restore drill, and retention evidence
- data classification and PII handling
- schema change management and migration traceability
- secure credential handling
- deletion and archive controls for cold data
- privileged access review and segregation of duties
- secure configuration baselines and hardening guidance
- business continuity planning assumptions for critical databases
- recovery objectives such as RTO / RPO and restore evidence
- failover, redundancy, and backup dependency visibility

## 2. Logical Design

- **CSV in one column**
  - Smell: comma-separated IDs/tags/statuses stored in `VARCHAR`
  - Fix: separate dependent table or junction table for many-to-many
- **Naive tree**
  - Smell: `parent_id` only, but complex subtree queries and moves are required
  - Fix: choose the tree model that matches the workload and relationship semantics
- **Surrogate-key obsession**
  - Smell: every table gets generic `id` without regard to natural/composite identity
  - Fix: use natural key or composite key when it models the domain better; prefer descriptive PK names like `bug_id`
- **Missing foreign keys**
  - Smell: referential integrity delegated to application code
  - Fix: add FK constraints and let the DB reject invalid references
- **EAV**
  - Smell: generic `entity_id`, `attr_name`, `attr_value`
  - Fix: use subtype modeling, JSON only when true schema flexibility is required
- **Polymorphic association**
  - Smell: one FK-like column can point to multiple parent tables
  - Fix: simplify relationships, use junction tables, or introduce a shared supertype
- **Repeated columns for multi-value attributes**
  - Smell: `tag1`, `tag2`, `tag3`
  - Fix: store one value per row in a dependent table
- **Metadata as schema**
  - Smell: table names or column names encode data values such as year or tenant
  - Fix: use partitioning, proper normalization, or vertical split
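The comma-separated-values and repeated-column fixes above share one shape, the junction table. A minimal sqlite3 sketch with hypothetical `products`/`tags` tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Instead of products.tag_csv = 'red,sale', store one row per pairing:
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tags     (tag_id INTEGER PRIMARY KEY, label TEXT NOT NULL UNIQUE);
    CREATE TABLE product_tags (
        product_id INTEGER NOT NULL REFERENCES products(product_id),
        tag_id     INTEGER NOT NULL REFERENCES tags(tag_id),
        PRIMARY KEY (product_id, tag_id)      -- many-to-many junction
    );
    INSERT INTO products VALUES (1, 'mug');
    INSERT INTO tags VALUES (1, 'red'), (2, 'sale');
    INSERT INTO product_tags VALUES (1, 1), (1, 2);
""")

# An indexable, joinable lookup that LIKE '%red%' on a CSV column cannot give:
rows = conn.execute("""
    SELECT p.name FROM products p
    JOIN product_tags pt ON pt.product_id = p.product_id
    JOIN tags t ON t.tag_id = pt.tag_id
    WHERE t.label = 'red'
""").fetchall()
print(rows)  # [('mug',)]
```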

## 3. Physical Design

- **Floating-point money**
  - Smell: `FLOAT/REAL/DOUBLE` for currency or exact quantities
  - Fix: use `NUMERIC/DECIMAL`
- **Hard-coded enum in schema without governance**
  - Smell: volatile code sets trapped inside enum/check definitions
  - Fix: use reference tables when values are expected to evolve
- **Ghost file**
  - Smell: DB stores only file path while lifecycle, backup, authorization, and restore are unmanaged
  - Fix: use BLOB/object storage only with clear lifecycle, permissions, and backup/restore design
- **Index shotgun**
  - Smell: indexes created by guesswork or duplicated heavily
  - Fix: MENTOR the indexes:
    - Measure
    - Explain
    - Nominate
    - Test
    - Optimize
    - Rebuild
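The floating-point-money smell is easy to demonstrate; Python's `decimal.Decimal` models the exact `NUMERIC/DECIMAL` semantics the fix calls for:

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly.
print(0.1 + 0.2 == 0.3)         # False
print(sum([0.10] * 3))          # 0.30000000000000004 -- drifting cents

# Exact decimal arithmetic, as NUMERIC/DECIMAL would behave in the database:
print(sum([Decimal("0.10")] * 3) == Decimal("0.30"))  # True
```

The same drift happens inside a `FLOAT` column; declaring the column `NUMERIC/DECIMAL` keeps every intermediate sum exact.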

## 4. Query Design

- **NULL misuse**
  - Smell: NULL treated like a normal value or fake defaults used instead of NULL
  - Fix: use `IS NULL`, `IS NOT NULL`, `COALESCE`, and proper nullability semantics
- **Ambiguous grouping**
  - Smell: non-grouped columns selected beside aggregates
  - Fix: obey the single-value rule or rewrite with proper subqueries/window logic
- **Random sort**
  - Smell: `ORDER BY RAND()` on large tables
  - Fix: choose a sampling strategy that avoids full random sort
- **Poor-man's search engine**
  - Smell: broad `%keyword%` or regex search on large text
  - Fix: use full-text indexing or a dedicated search engine
- **Spaghetti query**
  - Smell: forcing every rule into one giant SQL statement
  - Fix: split into stages or multiple queries when it improves correctness and maintainability
- **Implicit columns**
  - Smell: `SELECT *` in production paths
  - Fix: list required columns explicitly

## 5. Application Integration

- **Readable passwords**
  - Smell: plaintext or reversibly encrypted passwords
  - Fix: salted one-way hash, secure transport, no password recovery by disclosure
- **SQL injection**
  - Smell: unchecked input concatenated into SQL
  - Fix: parameterized queries, allowlists for dynamic clauses, code review
- **Surrogate-key gap paranoia**
  - Smell: trying to fill deleted ID gaps
  - Fix: treat surrogate keys as identifiers, not row numbers
- **Ignoring DB return values and executed SQL**
  - Smell: no error handling or no visibility into actual SQL
  - Fix: inspect execution results, logs, and emitted SQL
- **SQL exempt from engineering discipline**
  - Smell: no version control, no tests, no review, no documentation
  - Fix: treat DB artifacts like code
- **Active Record leakage**
  - Smell: controllers calling persistence methods directly
  - Fix: keep service/repository boundaries and avoid mixing UI flow with DB access
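The SQL-injection fix above, shown side by side with the vulnerable form it replaces (sqlite3 placeholder syntax, hypothetical `users` table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

user_input = "nobody' OR '1'='1"  # hostile input

# Vulnerable: concatenation lets the input rewrite the WHERE clause.
unsafe = conn.execute(
    f"SELECT count(*) FROM users WHERE name = '{user_input}'").fetchone()[0]

# Safe: the placeholder binds the whole string as a single literal value.
safe = conn.execute(
    "SELECT count(*) FROM users WHERE name = ?", (user_input,)).fetchone()[0]

print(unsafe, safe)  # 2 0 -- injection matched every row; the bind matched none
```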

## 6. Vector Retrieval / RAG

- **Treating vector DB as a drop-in search replacement**
  - Smell: lexical search removed entirely after adding embeddings
  - Fix: default to hybrid retrieval; combine keyword/BM25, metadata filters, ANN, and reranking
- **Embedding everything without a versioning strategy**
  - Smell: multiple models/dimensions/chunk rules with no metadata or migration plan
  - Fix: version embeddings like schema, store model/config metadata, and support rolling re-embedding
- **Ignoring chunking strategy**
  - Smell: fixed-size chunks and naive splitting regardless of document structure
  - Fix: use document-aware chunking, overlap, and hierarchical structure where needed
- **Blind ANN defaults**
  - Smell: HNSW/IVF/PQ defaults used without recall-latency benchmarking
  - Fix: benchmark real workloads and tune ANN parameters deliberately
- **Assuming metadata filtering scales for free**
  - Smell: high-cardinality filters piled onto vector search with no partition strategy
  - Fix: design pre-segmentation, partitioned indexes, or tenant-specific collections where justified
- **Unplanned real-time writes**
  - Smell: continuous inserts degrade recall and latency over time
  - Fix: design buffers, compaction, tiered indexing, and ingestion/search separation
- **Storing raw documents in vector DB**
  - Smell: vector DB becomes source of truth for full documents
  - Fix: keep canonical docs in object store or primary DB and store references in vector DB
- **No retrieval observability**
  - Smell: only infra metrics exist; no recall or semantic drift monitoring
  - Fix: add golden queries, offline evaluation, drift tracking, and token-waste monitoring
- **Over-trusting similarity scores**
  - Smell: one cosine threshold is treated as universally meaningful
  - Fix: calibrate per domain/model and use reranking for precision-sensitive cases
- **Believing vector DB equals RAG architecture**
  - Smell: only index layer is optimized while chunking, orchestration, and feedback loop are ignored
  - Fix: treat retrieval as a pipeline, not a single database decision

```

### resources/vector-db.md

```markdown
# DB Agent - Vector DB & RAG Design Guide

Use this file when the task involves vector databases, semantic retrieval, ANN tuning, or RAG.

## Core Position

- Vector DB is not a drop-in replacement for search engines or relational databases.
- Vector DB is usually an **index layer** for semantic retrieval.
- Canonical documents should live in an object store or primary database.
- Production retrieval systems are usually **hybrid**:
  - lexical retrieval for exactness and explainability
  - vector retrieval for semantic proximity
  - metadata filtering for governance and scope
  - reranking for precision

## Design Sequence

1. Define the retrieval problem
   - semantic discovery
   - exact term retrieval
   - compliance / tenant filtering
   - recommendation
   - agent context retrieval
2. Decide whether vector retrieval is justified
3. Design chunking and embedding strategy
4. Choose ANN index and benchmark it
5. Design filtering, reranking, and observability
6. Plan lifecycle:
   - re-embedding
   - re-indexing
   - hot/cold vector tiers
   - backup / restore

## Best Practices

### 1. Use Hybrid Retrieval by Default

- Keep lexical retrieval when:
  - exact term match matters
  - user expects deterministic ranking
  - compliance or audit filters must be explicit
  - explainability matters
- Good default pipeline:
  - filter scope
  - lexical retrieval
  - ANN retrieval
  - merge / dedupe
  - rerank
  - context compression
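The merge/dedupe step in the default pipeline needs a concrete fusion rule; reciprocal rank fusion (RRF) is one common choice. A sketch with assumed document ids:

```python
from collections import defaultdict

def rrf_merge(ranked_lists, k=60):
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for docs in ranked_lists:
        for rank, doc_id in enumerate(docs, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

lexical = ["d3", "d1", "d7"]          # BM25 order (assumed ids)
vector  = ["d1", "d9", "d3"]          # ANN order
print(rrf_merge([lexical, vector]))   # ['d1', 'd3', 'd9', 'd7']
```

Documents seen by both stages collapse into one scored entry, which handles the dedupe step as well; the reranker then refines the fused head of the list.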

### 2. Govern Embeddings Like Schema

Store in metadata:

- embedding model name
- dimension
- normalization method
- chunking policy version
- preprocessing version
- creation timestamp

Plan for:

- rolling re-embedding
- shadow indexes
- side-by-side evaluation
- backfill and cutover
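Those metadata fields can travel with every vector as a small versioned record. A sketch (the field names and comparison rule are assumptions):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EmbeddingVersion:
    model: str            # embedding model name
    dimension: int
    normalization: str    # e.g. "l2"
    chunking_policy: str  # chunking policy version
    preprocessing: str    # preprocessing version
    created_at: str       # creation timestamp (not part of identity)

def needs_reembedding(stored: EmbeddingVersion, current: EmbeddingVersion) -> bool:
    """Any semantic field mismatch flags the vector for rolling re-embedding."""
    key = lambda v: (v.model, v.dimension, v.normalization,
                     v.chunking_policy, v.preprocessing)
    return key(stored) != key(current)

old = EmbeddingVersion("text-embedding-3-large", 3072, "l2",
                       "chunks_v1", "clean_v1", "2026-01-01")
new = EmbeddingVersion("text-embedding-3-large", 3072, "l2",
                       "chunks_v2", "clean_v1", "2026-03-01")
print(needs_reembedding(old, new))  # True -- the chunking policy changed
```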

### 3. Chunking is a First-Class Design Decision

Choose chunking based on document type:

- contracts / policies: heading-aware chunks with overlap
- tickets / chats: conversation-turn boundaries
- code / config: syntax-aware or file-structure-aware chunking
- FAQs / short docs: smaller direct-answer chunks

Watch for:

- over-chunking -> weak context and low recall
- under-chunking -> noisy retrieval and token waste
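A baseline fixed-size chunker with overlap is the starting point the document-type rules above refine; sizes here are illustrative:

```python
def chunk_tokens(tokens, size=400, overlap=80):
    """Sliding-window chunking; heading- or turn-aware splitting would refine this."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]

doc = list(range(1000))                   # stand-in for a tokenized document
chunks = chunk_tokens(doc)
print(len(chunks))                        # 3 windows cover the document
print(chunks[0][-80:] == chunks[1][:80])  # True -- adjacent chunks share 80 tokens
```

The overlap preserves context across chunk boundaries; tuning `size` and `overlap` per document type is exactly the over-/under-chunking tradeoff described above.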

### 4. Tune ANN with Real Data

Common ANN choices:

- HNSW: strong recall, high memory
- IVF: faster partitioned retrieval, recall tradeoffs
- PQ: smaller memory footprint, lower precision
- DiskANN: disk-oriented large-scale retrieval

Benchmark on production-like distributions, not synthetic random vectors.

Measure:

- recall@k
- latency p50 / p95 / p99
- memory footprint
- index build time
- update latency
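recall@k is the simplest of these to pin down; a sketch evaluated against one golden query (the ids are assumed):

```python
def recall_at_k(retrieved, relevant, k=10):
    """Fraction of relevant doc ids that appear in the top-k retrieved list."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)

# Golden-query style check with assumed ids:
retrieved = ["d4", "d1", "d9", "d2", "d7"]   # ANN result order
relevant  = {"d1", "d2", "d3"}               # labeled ground truth
print(recall_at_k(retrieved, relevant, k=5)) # 2 of 3 relevant found
```

Averaging this over the golden query set, at each candidate parameter setting, gives the recall side of the recall-latency benchmark.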

### 5. Design Filtering Deliberately

Metadata filtering gets expensive with high-cardinality dimensions.

Check:

- whether filters are pushed down before ANN or after
- whether approximate filtering affects correctness
- whether tenant / user / region isolation needs separate partitions

Patterns:

- partition by tenant or region
- pre-segment indexes for extreme-cardinality filters
- maintain dedicated collections for very large tenants

### 6. Plan Writes and Lifecycle

High-write vector workloads need:

- ingestion buffers
- async indexing
- compaction
- periodic rebuild strategy
- separation of ingestion and search responsibilities when load is high

Consider:

- hot vectors in fast mutable index
- cold vectors in slower compacted tier

### 7. Observe Semantics, Not Just Infra

Track more than QPS and latency:

- recall drift
- semantic drift after model update
- similarity score distribution
- false positives
- empty-result rate
- prompt-token waste from irrelevant retrieval

Maintain:

- golden query sets
- offline evaluation pipelines
- before/after comparison when changing model or chunking

## Review Questions

- Why is vector retrieval needed here?
- Why is pure lexical retrieval insufficient?
- Why is pure vector retrieval insufficient?
- What is the source of truth?
- What invalidates the embeddings?
- How will re-embedding be rolled out safely?
- How are filters enforced?
- How is retrieval quality measured over time?

```

### resources/iso-controls.md

```markdown
# DB Agent - ISO 27001 / 27002 / 22301 Guide

Use this file when the task involves database controls, audit readiness, resilience, or continuity requirements.

## Positioning

- **ISO 27001**: management-system and control objectives view
- **ISO 27002**: practical control guidance view
- **ISO 22301**: business continuity and recovery view

Do not claim compliance from schema work alone. Use these standards to improve database design decisions, control coverage, and operational evidence.

## 1. ISO 27001 / 27002 for Databases

Recommend improvements when the database design weakens:

- access control and least privilege
- separation of duties for admin, operator, developer, and read-only access
- encryption at rest and in transit
- key and secret management
- audit logging and tamper-resistant log retention
- change management and schema migration traceability
- backup security and retention handling
- secure configuration and baseline hardening
- data classification, masking, and PII handling

Typical DB-level suggestions:

- role-based access with least privilege
- dedicated break-glass admin path
- column or application-layer protection for sensitive fields where needed
- immutable or centralized audit log export
- secret rotation and non-human service identities
- explicit approval path for destructive schema changes

## 2. ISO 22301 for Databases

Recommend improvements when the design weakens business continuity:

- RTO / RPO definition
- failover readiness
- backup coverage and retention
- restore validation frequency
- dependency mapping for upstream/downstream systems
- data replication strategy
- hot/warm/cold standby decisions
- archival and cold-data recovery assumptions

Typical DB-level suggestions:

- define tiered continuity expectations by database criticality
- document backup cadence aligned to RPO
- document restore and failover process aligned to RTO
- run restore drills regularly instead of verifying only that backups completed
- make derived indexes rebuildable from canonical data
- separate source-of-truth data from caches and secondary indexes
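
One way to make the RPO alignment concrete is a back-of-envelope check like the sketch below. It is a deliberately simplified model, assuming worst-case data loss is roughly the backup interval plus replication lag; it is not a substitute for a measured continuity analysis.

```python
def rpo_gap_hours(backup_interval_h, replication_lag_h, rpo_h):
    """Hours by which the worst-case data loss exceeds the target RPO.

    Worst-case loss is approximated as the time since the last backup
    (the full backup interval) plus replication lag; zero means the
    cadence is within the stated RPO under this model.
    """
    worst_case_loss = backup_interval_h + replication_lag_h
    return max(0.0, worst_case_loss - rpo_h)
```

For example, nightly full backups (24 h interval) with 0.5 h replication lag against a 4 h RPO leave a 20.5 h gap, which signals the need for incremental or log-based backups between fulls.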

## 3. Review Questions

Ask these when standards matter:

- Who can read, write, alter schema, and administer the database?
- Are privileged actions logged and retained?
- Are secrets and keys managed outside code and migration files?
- What evidence exists that backups can actually be restored?
- What is the target RPO and RTO?
- Which systems depend on this database to continue service?
- Can cold/archive data be recovered within required time?
- Are vector indexes or search indexes rebuildable from canonical data?

## 4. Output Pattern

When relevant, add a short section like:

```md
## ISO Control Notes

### ISO 27001 / 27002
- Access control gap:
- Logging / traceability gap:
- Encryption / secret gap:
- Recommended control:

### ISO 22301
- Criticality:
- RPO / RTO gap:
- Backup / restore gap:
- Failover / dependency gap:
- Recommended continuity improvement:
```

## 5. Guardrails

- Keep standards comments practical and database-specific
- Prefer implementable controls over abstract policy language
- Distinguish between source-of-truth data and rebuildable derived data
- If the issue is infrastructure-only, involve `tf-infra-agent`
- If the issue is org-wide audit/process ownership, involve QA or PM

```

### resources/checklist.md

```markdown
# DB Agent - Self-Verification Checklist

Run through every item before submitting your work.

## Modeling
- [ ] DB type selection justified from workload, not preference
- [ ] Relational models normalized to at least 3NF, or denormalization explicitly justified
- [ ] NoSQL models aligned to concrete access patterns and aggregate boundaries
- [ ] Vector retrieval is used only where semantic similarity is actually required
- [ ] Keys, cardinality, and ownership clearly defined
- [ ] No obvious anti-patterns remain: CSV-in-column, repeated columns, EAV, polymorphic FK, missing FK

## Schema Documentation
- [ ] External schema documented
- [ ] Conceptual schema documented
- [ ] Internal schema documented
- [ ] Naming standards and field definitions documented
- [ ] Glossary / terminology dictionary updated

## Integrity & Concurrency
- [ ] Entity, domain, referential, and business-rule integrity addressed
- [ ] Transaction boundaries documented
- [ ] Isolation level selected for critical flows
- [ ] Locking / optimistic concurrency strategy documented
- [ ] Distribution transparency assumptions (replication, location, failure) called out when using distributed systems

## Performance & Operations
- [ ] Indexes match hot queries and join/filter patterns
- [ ] Partitioning / sharding strategy justified if introduced
- [ ] Hot vs cold data separation considered
- [ ] Full and incremental backup strategy documented
- [ ] Restore test or recovery validation approach documented
- [ ] Query anti-patterns reviewed: `SELECT *`, `ORDER BY random()`-style sorting, ambiguous grouping, leading-wildcard search without full-text tooling

## Vector Retrieval
- [ ] Vector DB is not being used as the canonical document store
- [ ] Hybrid retrieval considered where exact match, explainability, or compliance filtering matter
- [ ] Embedding model, dimension, normalization, chunking, and version metadata documented
- [ ] Re-embedding / re-index migration plan exists for model or chunking changes
- [ ] ANN index choice and tuning parameters benchmarked on production-like data
- [ ] Metadata filtering behavior and high-cardinality filter impact understood
- [ ] Similarity score thresholds calibrated per domain or reranking added for precision-critical paths
- [ ] Retrieval observability defined: golden queries, recall drift, semantic drift, token waste

## Capacity
- [ ] Capacity sheet updated by object
- [ ] Tablespace/storage estimate updated
- [ ] Disk estimate includes online, batch, backup, and growth assumptions
- [ ] Data type choices reviewed for storage efficiency

## Delivery
- [ ] Assumptions, tradeoffs, and risks are explicit
- [ ] Migration impact or rollout steps included when schema changes are proposed
- [ ] ISO 27001 / 27002-friendly suggestions included where design affects access control, logging, encryption, backup, recovery, or change management
- [ ] ISO 22301-friendly suggestions included where design affects resilience, failover, restore testing, RTO, or RPO

```

### resources/error-playbook.md

```markdown
# DB Agent - Error Playbook

## Symptom: Relational model is breaking 3NF everywhere
- Re-check whether reporting columns, derived values, and snapshots are being mixed into OLTP tables
- Split transactional source-of-truth from read models or marts
- Keep denormalization only where query cost justifies it

## Symptom: NoSQL model keeps requiring cross-document joins
- Re-evaluate aggregate boundaries and access patterns
- If strong cross-entity consistency dominates, switch recommendation toward relational storage

## Symptom: Isolation level is unclear
- Start from anomaly prevention required by the business flow
- Document what must be prevented: dirty reads, non-repeatable reads, phantom reads, lost updates
- Choose the lowest level that still blocks unacceptable anomalies
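
The "lowest level that still blocks unacceptable anomalies" step can be sketched as a lookup over the ANSI SQL isolation levels. This mapping is simplified and engine-dependent; for instance, snapshot-based REPEATABLE READ implementations handle lost updates differently across databases, so verify against the target engine's documentation.

```python
# Simplified: lowest ANSI level assumed to block each anomaly.
BLOCKED_AT = {
    "dirty_read": "READ COMMITTED",
    "non_repeatable_read": "REPEATABLE READ",
    "lost_update": "REPEATABLE READ",
    "phantom": "SERIALIZABLE",
}

# Levels from weakest to strongest guarantees.
ORDER = ["READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE"]

def minimum_isolation(anomalies_to_prevent):
    """Lowest level whose guarantees cover every listed anomaly."""
    required = [BLOCKED_AT[a] for a in anomalies_to_prevent]
    return max(required, key=ORDER.index) if required else ORDER[0]
```

Starting from the anomalies the business flow cannot tolerate, this picks the cheapest level that covers all of them rather than defaulting to SERIALIZABLE everywhere.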

## Symptom: Capacity estimate is unreliable
- Separate online traffic, batch traffic, retention, and backup retention
- Estimate by object first, then aggregate to tablespace and disk totals
- Add growth and reindex/maintenance headroom
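
A back-of-envelope version of "estimate by object first, then aggregate" is sketched below; the index-overhead, growth, and backup factors are illustrative assumptions to be replaced with measured values, not recommendations.

```python
def object_size_gb(rows, avg_row_bytes, index_overhead=0.3):
    """Per-object estimate: raw data plus an assumed index overhead fraction."""
    return rows * avg_row_bytes * (1 + index_overhead) / 1e9

def disk_estimate_gb(objects, growth_factor=1.5, backup_copies=2):
    """Aggregate per-object sizes, then add growth and backup headroom."""
    base = sum(object_size_gb(**o) for o in objects)
    return base * growth_factor + base * backup_copies
```

For example, a single table of 1,000,000 rows at 500 bytes each comes to about 0.65 GB with index overhead, and roughly 2.3 GB of disk once the assumed growth and two retained backup copies are added.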

## Symptom: Backup plan exists but restore confidence is low
- Add restore drill frequency
- Define sample recovery scenarios and max tolerable data loss
- Distinguish snapshot/full backup from incremental/log-based recovery

```
