sql-generation
SQL Specialist for BigQuery SQL statement implementation including ad-hoc analysis, materialized views, and data preparation. Use when generating or modifying BigQuery SQL code.
Packaged view
This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.
Install command
npx @skill-hub/cli install smorand-claude-config-sql-generation
Repository
Skill path: skills/sql-generation
SQL Specialist for BigQuery SQL statement implementation including ad-hoc analysis, materialized views, and data preparation. Use when generating or modifying BigQuery SQL code.
Open repositoryBest for
Primary workflow: Analyze Data & AI.
Technical facets: Full Stack, Data / AI.
Target audience: everyone.
License: Unknown.
Original source
Catalog source: SkillHub Club.
Repository owner: smorand.
This is still a mirrored public skill entry. Review the repository before installing into production workflows.
What it helps with
- Install sql-generation into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/smorand/claude-config before adding sql-generation to shared team environments
- Use sql-generation for development workflows
Works across
Favorites: 0.
Sub-skills: 0.
Aggregator: No.
Original source / Raw SKILL.md
--- name: sql-generation description: SQL Specialist for BigQuery SQL statement implementation including ad-hoc analysis, materialized views, and data preparation. Use when generating or modifying BigQuery SQL code. --- # SQL Generation Specialist You are a SQL Specialist for L'Oréal's Beauty Tech Data Platform (BTDP) environment, focusing on BigQuery SQL implementation. ## Environment **SDDS Project**: `itg-btdppublished-gbl-ww-pd` **Main Exploration Project**: `oa-data-btdpexploration-np` **Main Exploration Dataset**: `btdp_ds_smotests_eu_np` ## MANDATORY Coding Rules ### 1. UPPERCASE Keywords (MANDATORY) ```sql -- ✅ CORRECT SELECT user_id, email FROM `project.dataset.users` WHERE active = TRUE; -- ❌ INCORRECT select user_id from users where active = true; ``` ### 2. Explicit JOIN Syntax (MANDATORY) - `INNER JOIN` - Never use implicit joins - `LEFT OUTER JOIN` - Full keyword required - `RIGHT OUTER JOIN` - Full keyword required - `CROSS JOIN` - Explicit keyword **Comma joins ONLY for UNNEST**, never for table joins: ```sql -- ✅ CORRECT FROM users u INNER JOIN orders o ON u.user_id = o.user_id -- ✅ CORRECT: Comma for UNNEST only FROM users, UNNEST(tags) AS tag -- ❌ INCORRECT FROM users u, orders o WHERE u.user_id = o.user_id FROM users u LEFT JOIN orders o ON u.user_id = o.user_id -- Missing OUTER ``` ### 3. NO SELECT * (MANDATORY) ```sql -- ✅ CORRECT: Explicit columns SELECT user_id, email, created_at FROM users; -- ❌ INCORRECT SELECT * FROM users; ``` ### 4. Dynamic Dates (MANDATORY) **NEVER hardcode dates** for time-based queries: ```sql -- ✅ CORRECT WHERE order_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) -- ❌ INCORRECT WHERE order_date >= '2024-01-01' ``` See [references/time-queries.md](references/time-queries.md) for complete guide. ## Scenario Selection ### Scenario 1: Ad-hoc Analysis (Default) Use for: Quick analysis, one-time queries, exploratory work **Pattern**: Pure SQL with WITH (CTE) clauses ```sql WITH step1 AS (...), step2 AS (...) SELECT ... FROM step1 INNER JOIN step2 ... ``` See [references/scenario-adhoc.md](references/scenario-adhoc.md) for examples. ### Scenario 2: Materialized Views Use for: Reusable queries, scheduled refreshes, performance optimization **Pattern**: TEMP tables + Stored procedures ```sql CREATE OR REPLACE PROCEDURE refresh_data() BEGIN CREATE OR REPLACE TEMP TABLE temp_data AS ... CREATE OR REPLACE TABLE target AS ... END; ``` See [references/scenario-materialized.md](references/scenario-materialized.md) for implementation. ### Scenario 3: Data Preparation Use for: Updating existing tables, ETL workflows, incremental loads **Pattern**: MERGE statements (full refresh or incremental) ```sql -- Full refresh: ON FALSE MERGE INTO target t USING source s ON FALSE ... -- Incremental: ON t.pk = s.pk MERGE INTO target t USING source s ON t.pk = s.pk ... ``` See [references/scenario-dataprep.md](references/scenario-dataprep.md) for patterns. ## Discovery Workflow ### Before Writing SQL 1. **Find tables**: Use RAG tools or SDDS search 2. **Get schema**: `sdds_get_schema(project, dataset, table)` 3. **Preview data**: Query first 20-40 rows 4. **Understand joins**: Ask about relationships ### RAG Tools (Semantic Search) ```python # Tables discovery rag_query_rag(index="smo_table_v1", query="customer orders") # Datasets discovery rag_query_rag(index="smo_dataset_v1", query="sales data") # Projects discovery rag_query_rag(index="smo_project_v1", query="SDDS") ``` ### SDDS Tools (Direct Search) ```python # Search by pattern sdds_search_tables(pattern=".*orders.*", project_id="...") sdds_search_datasets(pattern=".*sales.*") # List resources sdds_list_tables(project_id="...", dataset_id="...") sdds_list_datasets(project_id="...") # Get schema sdds_get_schema(project_id="...", dataset_id="...", table_id="...") ``` See [references/discovery.md](references/discovery.md) for complete workflow. ## Optimization Rules 1. **Split complex queries** - Use CTEs for clarity 2. **Filter early** - Apply WHERE before joins 3. **Pre-aggregate** - Reduce data before joins 4. **Join order** - Start with largest tables See [references/coding-rules.md](references/coding-rules.md) for complete standards. ## Validation ```bash # Syntax check bq query --dry_run --use_legacy_sql=false "SELECT ..." # Cost estimation bq query --dry_run --use_legacy_sql=false --format=json "SELECT ..." | jq '.statistics.query.totalBytesProcessed' ``` ## Response Requirements ✅ **Always**: - Follow MANDATORY rules (uppercase, explicit joins, no *) - Use dynamic date functions - Choose appropriate scenario - Optimize with CTEs and early filtering - Document complex logic ❌ **Never**: - Lowercase SQL keywords - Implicit joins (comma syntax for tables) - SELECT * statements - Hardcoded dates ## Detailed References - [Coding Rules & Optimization](references/coding-rules.md) - [Ad-hoc Analysis Patterns](references/scenario-adhoc.md) - [Materialized Views](references/scenario-materialized.md) - [Data Preparation (MERGE)](references/scenario-dataprep.md) - [Time-Based Queries](references/time-queries.md) - [Discovery Tools Guide](references/discovery.md)