course-study
Comprehensive course study, exam revision, and structured study note generation from lecture slides, course PDFs, or topic outlines. Use when the user wants to study a course, review or summarize lecture slides/PDFs, generate lecture notes or a study guide, prepare for exams (midterm, final, quiz), create revision notes or formula sheets, extract key concepts/definitions/formulas from course materials, synthesize knowledge across multiple lectures, expand understanding beyond course scope with external sources, or asks for help learning, reviewing, or revising any university or college course content in any academic subject (CS, engineering, math, science, humanities, etc.). Produces deep, multi-format study notes (Markdown + PDF) with worked examples, LaTeX formulas, cross-lecture connections, and concept dependency maps — not shallow summaries.
Packaged view
This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.
Install command
npx @skill-hub/cli install 2568574560-course-study
Repository
Best for
Primary workflow: Write Technical Docs.
Technical facets: Full Stack, Tech Writer.
Target audience: everyone.
License: MIT.
Original source
Catalog source: SkillHub Club.
Repository owner: 2568574560.
This is a mirrored public skill entry. Review the repository before installing it into production workflows.
What it helps with
- Install course-study into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://www.skillhub.club/skills/2568574560-course-study before adding course-study to shared team environments
- Use course-study to turn lecture slides, course PDFs, or topic outlines into structured, exam-ready study materials
Works across
Favorites: 0.
Sub-skills: 0.
Aggregator: No.
Original source / Raw SKILL.md
---
name: course-study
description: Comprehensive course study, exam revision, and structured study note generation from lecture slides, course PDFs, or topic outlines. Use when the user wants to study a course, review or summarize lecture slides/PDFs, generate lecture notes or a study guide, prepare for exams (midterm, final, quiz), create revision notes or formula sheets, extract key concepts/definitions/formulas from course materials, synthesize knowledge across multiple lectures, expand understanding beyond course scope with external sources, or asks for help learning, reviewing, or revising any university or college course content in any academic subject (CS, engineering, math, science, humanities, etc.). Produces deep, multi-format study notes (Markdown + PDF) with worked examples, LaTeX formulas, cross-lecture connections, concept dependency maps, quick-reference sheets, and exam Q&A appendix — not shallow summaries.
license: MIT
model: claude-sonnet-4-6
user-invocable: true
metadata:
version: 2.0.0
author: claude-code
domains:
- education
- study
- exam-prep
- learning
---
# Course Study v2.0
A structured four-phase workflow for deep learning of any university or college course: Extract → Synthesize → Expand → Study. Produces high-fidelity, multi-format study materials as a single, complete PDF — not shallow summaries.
---
## Phase 0: Intake
Read `rules/phase-intake.md` and run the full intake workflow. Keep this to one exchange — do not ask questions across multiple messages.
---
## Workflow
```
Phase 0: Intake (single exchange)
├── PDFs → Phase 1 (Extract via /pdf skill)
├── Topic list → Phase 2 directly
└── Course name → quick syllabus search → Phase 2
Phase 1: Extract (per PDF, page-aligned, using /pdf skill)
└── Output: lecture-XX-extract.md
Phase 2: Synthesize
└── Output: course-synthesis.md
Phase 3: Expand (web sources OR curriculum-grounded)
└── Output: course-expansion.md
Phase 4: Study Materials
├── study-notes.md (always)
├── quick-reference.md (Exam Ready only)
└── exam-qa.md (Exam Ready only)
```
Each phase ends with a **brief checkpoint** (see below).
---
## Phase Checkpoint
After each phase, one compact message:
```
✓ Phase [X] done — [summary in one line].
Issues? (coverage gaps / too shallow / too verbose)
Type to adjust, or just say "continue".
```
If the user raises no issues → proceed immediately. No multi-question forms.
---
## Global Rules
1. **PDF-only input.** Use the `/pdf` skill to read all course files. Do not use Python file I/O or direct file reading for PDFs.
2. **Backbone fidelity.** Every concept traces back to its source: page number (PDF) or section number (topic list). Never lose traceability.
3. **Speed discipline.** Minimize round-trips. Batch questions. Skip steps that aren't needed for the current tier. Phase 1 and 2 intermediate files should be dense and compact — no padding, no repeated meta-commentary.
4. **No fabrication.** Phase 3 without web access: every claim marked `[Standard curriculum knowledge]`. No invented URLs, paper titles, or authors.
5. **Examples are mandatory.** Phase 4 must include worked examples for every non-trivial concept.
6. **Track progress.** Use TodoList to track which lectures have been processed.
7. **Multi-format output.** Final notes are written in format-agnostic Markdown per `rules/templates.md`. When the user requests PDF output, read `rules/pdf-export.md` for pandoc font configuration and CJK handling before converting.
8. **Prioritise flagged topics.** If the user named priority topics in Phase 0, give them deeper treatment in Phase 4 and ensure they appear in the Quick Reference and Exam Q&A.
---
## Phase 4 Output: Study Notes
The main study notes follow the structure in `rules/phase-study.md`. Every concept gets:
- **What it is** — definition, instructor's phrasing first
- **Intuition** — why it exists, what problem it solves
- **Formal treatment** — LaTeX formulas or code blocks
- **Worked example** — concrete, step-by-step
- **Connections** — prerequisites and what this enables
- **Common misconceptions**
Exam Ready appendices (Quick Reference Sheet and Exam Q&A) are generated in Phase 4 as well — see `rules/phase-study.md` Steps 6a and 6b.
---
## Reference Files
- `rules/phase-intake.md` — Phase 0 intake workflow
- `rules/phase-extract.md` — Phase 1
- `rules/phase-synthesize.md` — Phase 2
- `rules/phase-expand.md` — Phase 3
- `rules/phase-study.md` — Phase 4 (study notes + Exam Ready appendices)
- `rules/templates.md` — Format-agnostic writing rules
- `rules/pdf-export.md` — PDF conversion config (load only when PDF output is requested)
- `rules/subject-coverage.md` — Live search strategy for curriculum gap analysis
- `rules/changelog.md` — Version history
---
## Anti-Patterns
| Avoid | Why | Instead |
|-------|-----|---------|
| Skipping worked examples | Students fail on application, not definitions | Mandatory for every non-trivial concept |
| Quick Reference with prose | Defeats the purpose | One line per entry maximum |
| Exam Q&A without source refs | Student can't verify or dig deeper | Every answer cites source location |
| Ignoring Phase 0 priority topics | User told you what matters | Deeper treatment + appears in all appendices |
| Fabricating exam question styles | Misleads preparation | Draw only from what the course actually covers |
---
## Referenced Files
> The following files are included in this skill.
### rules/phase-study.md
```md
# Phase 4: Structured Deep Study Notes
## Goal
Produce the **final, comprehensive study notes** integrating all prior phases into a single authoritative document. This is the deliverable the student will actually use.
## What this phase is — and isn't
Phase 4 is a creative synthesis, not a rewrite of Phase 1. It:
- Follows the **backbone structure** (lecture/section order from Phase 1) as the skeleton
- Weaves in **cross-lecture connections** from Phase 2 at the right moments
- Integrates **external knowledge** from Phase 3 where it deepens understanding
- Adds **concrete worked examples** for every non-trivial concept
- Maintains **concise, precise technical writing** — not verbose, not skeletal
The backbone from Phase 1 is the spine. Phases 2 and 3 are the flesh.
## Procedure
### Step 1: Build the document skeleton
Follow the backbone structure strictly. Every backbone unit (lecture page group or section) becomes a heading. A reader can hold the original slides / topic list in one hand and the study notes in the other and see a 1:1 correspondence.
```markdown
# [Course Name] — Study Notes
## About these notes
[What these notes cover, how they're structured, what sources they draw from]
## Lecture 1 / Module 1: [Title]
### 1.1 [Topic] ([source location])
...
### 1.2 [Topic] ([source location])
...
```
Source location format — use whichever applies:
- PDF input: `(Lecture 1, pp. 3-7)`
- Topic list input: `(Section 1.2)`
### Step 2: Write each concept block
Calibrate depth to concept importance:
```markdown
### [Concept Name] ([source location])
**What it is:** [1-3 sentence definition. Instructor's phrasing first, then clarification.]
**Intuition:** [Why does this exist? What problem does it solve? Use analogies for complex ideas.]
**Formal treatment:**
[Formula / algorithm in LaTeX or code block. Explain each symbol/variable.]
**Example:**
[Concrete, worked example. For algorithms: step-by-step trace. For formulas: plug in numbers.]
**Real-world application:** [From Phase 3. How used in industry? Specific tools, cases.]
**Connections:** [From Phase 2. Prerequisites + what this concept enables.]
**Common misconceptions:** [What do students typically get wrong?]
**References:** [source location + Phase 3 source if used]
```
| Concept weight | Required sections |
|---|---|
| Core / exam-critical | All seven |
| Important supporting | What + Formal + Example + Connections |
| Minor / context | What + brief note |
### Step 3: Insert cross-lecture bridges
At every lecture/module boundary, insert a bridge:
```markdown
---
> **Bridge:** Lecture 3 introduced the problem of [X]. Lecture 4 now presents [Y] as a solution. The key shift is from [understanding the problem] to [designing a mechanism]. Note that [Y] assumes [Z] from Lecture 2 ([source location]).
---
```
Bridges transform a set of notes into a coherent learning narrative.
### Step 4: Verify completeness (fast pass)
Run a quick cross-reference — do not re-read everything in full:
- Spot-check: do the Phase 1 concept lists match headings in study notes? Flag any missing.
- Confirm no `[TODO]` placeholders remain.
- Confirm Phase 3 expansion entries are embedded (not appended at the end).
- Source locations present on all concepts? If any are missing, add from Phase 1.
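The heading spot-check above can be mechanized. A minimal sketch, assuming the Phase 1 concept list is available as plain strings (the function name and matching rule are illustrative, not part of the skill — real extracts may need fuzzier matching for abbreviations or plurals):

```python
def find_missing_concepts(concepts, notes_text):
    """Return Phase 1 concepts that never appear in any heading of the notes."""
    headings = [line.lstrip('#').strip().lower()
                for line in notes_text.splitlines()
                if line.startswith('#')]
    # A concept counts as covered if any heading contains it (case-insensitive)
    return [c for c in concepts
            if not any(c.lower() in h for h in headings)]
```

Anything the function returns gets flagged and backfilled from the Phase 1 extract before moving on.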
### Step 5: Format output
Write in format-agnostic Markdown per [templates.md](templates.md). Generate `study-notes.md` as primary output. For PDF export, apply the `/pdf` skill — do not use Python or pandoc directly.
### Step 6: Generate Exam Ready appendices (Exam Ready mode only)
If the user selected **Exam Ready** in Phase 0 — or requests it now for the first time — generate two additional files after `study-notes.md` is complete. Do not restart earlier phases; generate the appendices from the outputs already produced.
#### 6a: quick-reference.md
A compact reference card designed for last-minute review or open-book use.
Rules:
- Every entry fits in one table row or one bullet — no prose paragraphs
- Ordered by **exam relevance** (most likely to appear on exam first), not lecture order
- Infer exam relevance from: Phase 2 course narrative, Phase 2 gap flags, [EXPAND] markers from Phase 1, and user-specified priority topics from Phase 0
- Source refs omitted (speed over traceability in this document)
Structure:
```markdown
# Quick Reference: [Course Name]
## Formulas
| Name | Formula | Notes |
|------|---------|-------|
| [Name] | $[LaTeX]$ | [When to use / key constraint] |
## Key Definitions
- **[Term]:** [One sentence]
## Algorithms
| Name | Time | Space | Key idea |
|------|------|-------|----------|
| [Name] | O(?) | O(?) | [One line] |
## Common Traps
- [Specific mistake students make and what the correct behaviour is]
## X vs Y (Decision Tables)
[When you have two or more easily confused concepts, a two-column comparison table]
```
#### 6b: exam-qa.md
A bank of exam-style questions with full worked solutions.
Question sourcing rules (in priority order):
1. **Phase 2 gaps and [EXPAND] markers** — concepts the course treated superficially are prime exam targets
2. **User-specified priority topics** from Phase 0
3. **Cross-cutting concepts** from Phase 2 synthesis (themes that span multiple lectures)
4. **Core definitions** — one definition question per major concept
Question type selection by subject:
| Subject type | Emphasis |
|---|---|
| Math / engineering / CS algorithms | Worked problems (show-your-work format) |
| CS systems / architecture | Compare-and-contrast + design scenarios |
| Sciences | Explain-the-phenomenon + calculation |
| Humanities / social sciences | Short essay + source interpretation |
| Mixed | All types, weighted by lecture content |
Format per question:
```markdown
### Q[N]: [Short title]
**Type:** [Definition / Worked problem / Compare / Design / Essay]
**Likely exam weight:** [High / Medium / Low]
[Question text — specific, unambiguous, matches the difficulty of the course]
**Answer:**
[Full solution. For worked problems: show every step. For compare: use a structured table or bullet pairs. For essay: a model answer outline.]
**Common mistake:** [What students typically get wrong on this question]
**Source:** [Lecture X, pp. Y-Z or Section X.Y]
```
Minimum coverage: 3 questions per major topic. At least one worked problem if the subject involves calculation or algorithms.
## Writing style
**Concise but complete.** Every sentence earns its place. Cut filler, keep substance.
**Active voice.** "The scheduler assigns processes" not "Processes are assigned."
**Technical precision.** Correct terminology. Define on first use.
**Progressive disclosure.** Within each concept: intuition first (accessible) → formal (precise) → example (concrete) → connections (advanced). A reader can stop at any level and still learn something.
**Code and formulas are first-class.** Format correctly, comment where non-obvious, never truncate.
## Anti-patterns
- **DO NOT** produce study notes shorter than Phase 1 extractions. Phase 4 is longer because it adds examples, connections, and explanations.
- **DO NOT** skip examples. "Definition without example" is the single most common failure mode.
- **DO NOT** break the backbone order. Forward/backward links are fine; the primary structure must follow it.
- **DO NOT** omit Phase 3 expansion content if Phase 3 was run.
- **DO NOT** write walls of text — use headings, code blocks, and formula blocks for visual rhythm.
- **DO NOT** produce content that only works in Markdown — follow templates.md format rules throughout.
- **DO NOT** (Exam Ready) write prose in quick-reference.md — it is a lookup table, not a summary.
- **DO NOT** (Exam Ready) write trivial questions in exam-qa.md — each question must require genuine understanding to answer.
- **DO NOT** (Exam Ready) generate questions only from easy, well-covered topics — the most valuable questions come from Phase 2 gaps and [EXPAND] markers.
```
### rules/templates.md
```md
# Output Templates and Multi-Format Rules
## Format-agnostic writing rules
The study notes must render correctly in three contexts: Markdown viewers, PDF conversion, and plain text reading. Follow these rules to ensure compatibility.
### Headings
Use ATX-style headings (`#`, `##`, `###`). Maximum depth: 4 levels.
Always leave a blank line before and after headings.
```markdown
## Lecture 1: Introduction
### 1.1 What is an Operating System?
```
### Formulas
Use LaTeX wrapped in `$...$` (inline) or `$$...$$` (display block).
For PDF conversion compatibility, avoid overly complex LaTeX that requires special packages. Stick to standard math notation.
```markdown
The time complexity is $O(n \log n)$.
$$
P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}
$$
```
For plain text fallback: after each formula block, optionally include a plain-language description:
```markdown
$$E = mc^2$$
(Energy equals mass times the speed of light squared)
```
### Code blocks
Always specify the language for syntax highlighting:
````markdown
```python
def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1
```
````
For long code blocks (>30 lines), add a brief comment header explaining what the code does.
### Tables
Use standard Markdown tables. Keep columns reasonably narrow for PDF rendering.
```markdown
| Algorithm | Time (avg) | Time (worst) | Space | Stable? |
|-----------|-----------|-------------|-------|---------|
| Quicksort | O(n log n) | O(n^2) | O(log n) | No |
| Mergesort | O(n log n) | O(n log n) | O(n) | Yes |
```
For tables wider than 5 columns, consider splitting into multiple tables or using a description list format instead.
### Diagrams and figures
Since we're targeting text-based formats, represent diagrams as:
1. **ASCII art** for simple structures (trees, linked lists):
```
       [Root]
      /      \
 [Left]      [Right]
```
2. **Mermaid code blocks** for complex flows (supported by many Markdown renderers):
```mermaid
graph TD
    A[Client] --> B[Load Balancer]
    B --> C[Server 1]
    B --> D[Server 2]
```
3. **Textual description** as fallback: "Figure: A directed graph with nodes A, B, C where A points to B and C, and B points to C. This represents..."
Always provide the textual description even when including Mermaid/ASCII art — it ensures plain text readability.
### Cross-references
Use relative section references, not Markdown-specific link anchors:
Good: "See Section 3.2 (Lecture 3, Process Scheduling) for details."
Avoid: "See [here](#32-process-scheduling)." (anchor links may break in PDF)
### Emphasis
Use **bold** for key terms on first definition. Use *italic* for emphasis.
Do NOT use bold for entire paragraphs or sentences — it loses its signal value.
### Page breaks (PDF-aware)
When the content logically separates (e.g., between lectures), insert a horizontal rule:
```markdown
---
```
This renders as a visual break in Markdown and can be mapped to a page break in PDF conversion.
## Document structure template
```markdown
# [Course Name] — Study Notes
> Generated from [N] lecture PDFs + external knowledge expansion.
> Last updated: [date]
> Source lectures: [list of PDF filenames]
## Table of Contents
[Auto-generated or manually maintained TOC]
---
## Lecture 1: [Title]
### 1.1 [Topic] (pp. X-Y)
[Content following the concept block template from phase-study.md]
### 1.2 [Topic] (pp. Y-Z)
[...]
> **Bridge:** [Transition note to next lecture]
---
## Lecture 2: [Title]
[...]
---
## Appendix A: Concept Index
[Alphabetical list of all concepts with lecture + page references]
## Appendix B: External Sources
[All URLs, paper citations, and documentation references used in expansion]
## Appendix C: [EXPAND] Resolution Log
[List of all [EXPAND] markers from Phase 1 and how each was resolved]
```
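Appendix A can be assembled mechanically from the Phase 1 extract files rather than by hand. A sketch, assuming the page-block format defined in phase-extract.md (`build_concept_index` and the input shape are illustrative):

```python
import re

def build_concept_index(extracts):
    """Build an alphabetical concept -> locations index from Phase 1 extracts.

    `extracts` maps a lecture label to its extract-file text; parsing assumes
    each page block has a '### Page X' heading followed by a
    '**Concepts introduced:**' line, as specified in phase-extract.md.
    """
    index = {}
    for lecture, text in extracts.items():
        page = None
        for line in text.splitlines():
            match = re.match(r'### Page (\d+)', line)
            if match:
                page = match.group(1)
            elif line.startswith('**Concepts introduced:**') and page:
                for concept in line.split(':**', 1)[1].split(','):
                    concept = concept.strip()
                    if concept and concept != '—':  # '—' marks empty pages
                        index.setdefault(concept, []).append(
                            f"{lecture}, p. {page}")
    # Alphabetical, case-insensitive, matching the appendix's stated order
    return dict(sorted(index.items(), key=lambda item: item[0].lower()))
```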
## Format-specific conversion notes
### Markdown (.md)
Primary output format. No special handling needed if the rules above are followed.
### PDF conversion
When converting to PDF, read `rules/pdf-export.md` for font configuration, pandoc commands, and CJK/bilingual document handling. Use the `/pdf` skill to execute the conversion — do not use Python or pandoc directly.
---
### Plain text readability
The document must be readable with all Markdown stripped:
- Headings are recognizable by their `#` prefixes
- Formulas have plain-language descriptions nearby
- Code is in fenced blocks
- No meaning carried solely by formatting — don't use bold as the only signal for key terms; also use "Key term: …" phrasing
```
### rules/phase-extract.md
```md
# Phase 1: Backbone Extraction
## Goal
Extract every concept, definition, formula, algorithm, code snippet, and diagram from the course PDFs, **strictly aligned to page numbers**. Zero information loss is the target.
## File reading rule
**Always use the `/pdf` skill to read PDF files.** Do not use Python, bash, or any direct file I/O. Invoke `/pdf` with the uploaded file, then work from its output.
If a file is not already a PDF, stop and ask the user to convert it using `/pdf` first before continuing.
## Backbone reference format
| Input type | Unit | Reference format |
|---|---|---|
| PDF / slide file | Page number | `Lecture X, p. Y` |
| Topic list | Section item | `Section X.Y` |
| Generated syllabus | Syllabus entry | `Module X, Section Y` |
Use the same format throughout all phases. Do not mix formats.
## Extraction procedure
### Step 1: Read via /pdf skill
Invoke the `/pdf` skill on the uploaded file. Note total page count from the output.
### Step 2: Page-by-page extraction
For EVERY page, output one block. Keep blocks dense — no padding:
```markdown
### Page X (of N) — [Slide title]
**Key content:**
- [Definitions — exact wording]
- [Formulas — $LaTeX$]
- [Algorithms — pseudocode or code block]
- [Diagrams — describe structure, name all labeled elements]
- [Tables — reproduce as Markdown]
- [Examples — reproduce in full]
**Concepts introduced:** [comma-separated]
**[EXPAND]:** [concept — reason] *(omit if nothing to flag)*
```
Empty/title pages still get a block:
```markdown
### Page 1 (of 30) — Course Title
**Key content:** [Title page — no substantive content]
**Concepts introduced:** —
```
### Step 3: Speed calibration by tier
Apply the tier set in Phase 0:
| Tier | Page handling |
|---|---|
| Light (≤60p) | Full block per page |
| Medium (61–200p) | Full block per page; write lecture summary after each lecture |
| Heavy (201–400p) | Full block per page; after each lecture, compress to concept-inventory format (concept + location + type only) before moving to next lecture |
For Heavy tier, the concept-inventory format is:
```
- [Concept name] | Lecture X, pp. Y-Z | [definition/theorem/algorithm/pattern]
```
This replaces the full block in memory but the full blocks are still written to the extract file.
### Step 4: Completeness check (fast)
```
Pages in file: N → "### Page X" blocks in output: must = N
```
Content-rich pages: each block ≥ 5 lines of substantive content. If shorter, re-read the page.
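As a concrete sketch of the count check (the filename and contents below are illustrative, not real Phase 1 output):

```shell
# Illustrative two-page extract; in practice this is the Phase 1 output file
printf '### Page 1 (of 2) — Course Title\n### Page 2 (of 2) — Syllabus\n' > lecture-01-extract.md

# The number of page blocks must equal the page count reported by /pdf
grep -c '^### Page ' lecture-01-extract.md
```

If the count disagrees with N, locate and re-extract the missing pages before writing the lecture summary.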
### Step 5: Lecture summary (append to extract file)
```markdown
## Lecture Summary
**Topic:** [main topic]
**Total pages:** N
**Key concepts (by appearance):** [numbered list — keep brief]
**Prerequisites assumed:** [from earlier lectures]
**[EXPAND] markers:** [list with page refs]
**Open questions:** [unclear or contradictory items]
```
---
## For topic list inputs
When the user pastes a topic list instead of a PDF:
1. Parse into hierarchy: Top level → Module, Second → Section, Third → Sub-concept
2. Assign numbers: `1.1`, `1.2`, `2.1`, etc.
3. Write `course-backbone.md` with the full outline
4. Show the user: "Does this structure look right?" — adjust if needed
5. Populate each section with world-knowledge concept blocks, marked `[From: world knowledge — no source PDF]`
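The numbering in steps 1-2 can be sketched as a small helper. A minimal illustration assuming a two-space-indented topic list with at most three levels (`number_outline` is a hypothetical name, not part of the skill):

```python
def number_outline(lines):
    """Assign hierarchical numbers (1, 1.1, 1.1.1) to an indented topic list.

    Assumes two spaces per indent level and at most three levels
    (Module / Section / Sub-concept); real input may need adjusting.
    """
    counters = [0, 0, 0]
    numbered = []
    for raw in lines:
        text = raw.lstrip(' ')
        if not text:
            continue  # skip blank lines
        level = min((len(raw) - len(text)) // 2, 2)
        counters[level] += 1
        for deeper in range(level + 1, 3):
            counters[deeper] = 0  # entering a new parent resets child counters
        number = '.'.join(str(c) for c in counters[:level + 1])
        numbered.append(f"{number} {text}")
    return numbered
```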
---
## Output
- `lecture-XX-extract.md` per PDF (zero-padded)
- `course-backbone.md` if input was a topic list
## Anti-patterns
- **DO NOT** read PDFs with Python or direct file tools — use `/pdf` skill only
- **DO NOT** merge multiple pages into one block
- **DO NOT** skip pages you consider unimportant
- **DO NOT** pad with meta-commentary — dense content only
```
### rules/pdf-export.md
```md
# pdf-export — PDF Conversion Configuration
**Load this file only when the user has requested PDF output** (confirmed in Phase 0 or explicitly during Phase 4). For Markdown-only output, format rules in `rules/templates.md` are sufficient.
---
## When to generate PDF
After `study-notes.md` (and any appendices) are complete, apply the `/pdf` skill for PDF export. Do not use Python or pandoc directly — use `/pdf`.
---
## Detecting document language
Before generating PDF, check the study notes language:
- **English only:** use the standard Latin font stack. No special configuration needed.
- **Chinese only or mixed Chinese/English:** apply the CJK font configuration below. Without it, Chinese characters will render as blank boxes or tofu (□□□).
---
## Font stack for English-only documents
Pandoc default is sufficient. Recommended explicit config:
```yaml
# pandoc front matter or --variable flags
mainfont: "Georgia"
monofont: "Courier New"
```
---
## Font stack for Chinese or mixed documents
CJK fonts must be explicitly set. Use `xelatex` as the PDF engine (not `pdflatex`):
```yaml
# pandoc YAML front matter
mainfont: "Noto Serif" # Latin body text
CJKmainfont: "Noto Serif CJK SC" # Chinese body text (SC = Simplified; TC = Traditional)
monofont: "Noto Sans Mono" # Code blocks
sansfont: "Noto Sans"
CJKsansfont: "Noto Sans CJK SC"
```
Equivalent pandoc command:
```bash
pandoc study-notes.md -o study-notes.pdf \
  --pdf-engine=xelatex \
  -V mainfont="Noto Serif" \
  -V CJKmainfont="Noto Serif CJK SC" \
  -V monofont="Noto Sans Mono"
```
**Fallback fonts by platform** (if Noto is not installed):
| Platform | Chinese body | Chinese sans | Code |
|---|---|---|---|
| macOS | PingFang SC / STSong | PingFang SC | Menlo |
| Windows | SimSun / NSimSun | Microsoft YaHei | Consolas |
| Linux | Noto Serif CJK SC | Noto Sans CJK SC | DejaVu Sans Mono |
To check available fonts:
```bash
# macOS / Linux
fc-list :lang=zh | head -20
# or filter for common CJK family names
fc-list :lang=zh family | grep -i "cjk\|song\|hei\|ping"
```
---
## Complete pandoc command for Chinese or mixed documents
Use this as the definitive template. The flags below address the most common failure modes (tofu characters, cramped lines, broken paragraphs):
```bash
pandoc study-notes.md -o study-notes.pdf \
  --pdf-engine=xelatex \
  -V mainfont="Noto Serif" \
  -V CJKmainfont="Noto Serif CJK SC" \
  -V monofont="Noto Sans Mono" \
  -V fontsize=11pt \
  -V geometry="margin=2.5cm" \
  -V linestretch=1.8 \
  -V parskip=6pt \
  -V CJKoptions="AutoFakeBold=2,AutoFakeSlant=0.2" \
  --listings
```
**Why each flag:**
| Flag | Value | Reason |
|------|-------|--------|
| `--pdf-engine=xelatex` | — | Only xelatex handles CJK reliably; pdflatex will break |
| `linestretch` | `1.8` | Chinese characters are visually denser than Latin; 1.4 feels cramped, 1.8 is comfortable for reading |
| `parskip` | `6pt` | Adds breathing room between paragraphs — without this, paragraph breaks are invisible in dense Chinese text |
| `geometry` | `margin=2.5cm` | Prevents lines from running too long; CJK at full width is hard to read |
| `fontsize` | `11pt` | 10pt is too small for Chinese body text |
| `CJKoptions` | `AutoFakeBold` | Makes bold (`**text**`) actually render as bold in Chinese; without it, bold markup is silently ignored |
| `--listings` | — | Better code block handling; prevents CJK in comments from breaking the PDF |
---
## Tuning linestretch by content type
| Content type | Recommended `linestretch` |
|---|---|
| Chinese-only body text | `1.8` |
| Mixed Chinese + English | `1.8` |
| Chinese + dense math/formulas | `2.0` (formulas need extra vertical space) |
| English-only | `1.3` (pandoc default is fine) |
If the user reports "lines are too close together", increase `linestretch` by 0.2 increments. If they report "too much whitespace", decrease by 0.1.
---
## YAML front matter alternative
Embed settings directly in the Markdown file to avoid long command lines:
```yaml
---
title: "[Course Name] — Study Notes"
mainfont: "Noto Serif"
CJKmainfont: "Noto Serif CJK SC"
monofont: "Noto Sans Mono"
fontsize: 11pt
geometry: "margin=2.5cm"
linestretch: 1.8
parskip: 6pt
CJKoptions: "AutoFakeBold=2,AutoFakeSlant=0.2"
---
```
Then run simply:
```bash
pandoc study-notes.md -o study-notes.pdf --pdf-engine=xelatex --listings
```
---
## Format-specific notes
### LaTeX formulas
Use standard notation only — no custom packages. `amsmath` is available by default in pandoc's LaTeX template. CJK text mixed into formula environments (e.g. Chinese variable names) will break — keep all formula content in Latin/Greek.
### Tables
Max 5 columns; keep cell content narrow. CJK text in table cells is wider than Latin — reduce column count further if cells contain Chinese. Long Chinese strings in table cells will overflow; break them with `<br>` if needed.
### Code blocks
The `--listings` flag handles code blocks correctly. CJK characters inside code blocks may still break — if course code contains Chinese comments, test the PDF output early and consider stripping comments for the PDF version.
### Page breaks
Horizontal rules (`---`) act as page break hints. Insert before each lecture section for clean PDF pagination.
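If the conversion pipeline supports pandoc JSON filters, the hint-to-break mapping can be expressed as a filter action. A sketch only, assuming the `pandocfilters` package is available and run via `pandocfilters.toJSONFilter(hr_to_pagebreak)` (the function name is illustrative):

```python
def hr_to_pagebreak(key, value, fmt, meta):
    """Pandoc JSON-filter action: replace each HorizontalRule with a raw
    LaTeX page break when the target format is LaTeX/PDF."""
    if key == 'HorizontalRule' and fmt == 'latex':
        # Pandoc AST element shape: {'t': type, 'c': contents}
        return {'t': 'RawBlock', 'c': ['latex', '\\newpage']}
    return None  # None leaves every other element unchanged
```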
```
### rules/subject-coverage.md
```md
# Curriculum Coverage Search Strategy
This file defines how to find and evaluate standard curriculum coverage for **any academic subject**. It replaces hardcoded topic lists — which only covered 8 CS/engineering subjects — with a live search approach that works for any course a student might take.
## When this is used
1. **Phase 0, Mode 3 (broad topic input):** User gives only a course name. Search for the standard syllabus before generating an outline.
2. **Phase 2, Step 4 (gap analysis):** Compare the actual course materials against what a standard course in this subject would cover.
---
## How to search for curriculum coverage
### Step 1: Identify the subject and level
From the user's input, determine:
- Subject name (e.g., "Organic Chemistry", "Econometrics", "Byzantine History")
- Level: introductory / intermediate / advanced (infer from context or ask)
- Region/system if relevant (e.g., UK A-level vs. US university)
### Step 2: Execute search queries (Mode A — web enabled)
Run 2-3 targeted searches. Prefer queries that surface syllabi, textbook tables of contents, and university course pages over generic explainers.
**Recommended query patterns:**
```
"[subject] university course syllabus topics"
"[subject] undergraduate curriculum [key textbook author OR university name]"
"[subject] course outline week by week"
"[subject] exam topics [university level]"
```
**Examples:**
- `"organic chemistry undergraduate syllabus topics"`
- `"econometrics course outline core topics"`
- `"Byzantine history university course curriculum"`
- `"linear algebra textbook Strang table of contents"`
**Target sources (in priority order):**
1. University course pages (MIT OpenCourseWare, Stanford, etc.) — authoritative syllabi
2. Canonical textbook tables of contents — stable, widely adopted
3. ACM/IEEE curriculum guidelines (for CS/engineering) — standardized
4. Wikipedia outlines ("Outline of [subject]") — broad coverage, good for unfamiliar fields
5. Major MOOC syllabi (Coursera, edX) — reflects current teaching practice
### Step 3: Extract the topic structure
From the search results, identify:
- **Core topics:** appear in multiple independent syllabi → these are universally expected
- **Common topics:** appear in most syllabi → likely covered but not guaranteed
- **Specialized topics:** appear in only some syllabi → may indicate course emphasis or advanced treatment
Discard topics that are:
- Specific to one institution's particular framing
- Clearly prerequisite knowledge not part of this course
- Graduate-level topics for an undergraduate course
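The classification above is essentially a frequency count across independent syllabi. A minimal sketch of that logic (the threshold fractions are illustrative assumptions, not standardized values):

```python
from collections import Counter

def classify_topics(syllabi, core_frac=0.75, common_frac=0.5):
    """Classify topics by the fraction of independent syllabi that mention them.

    syllabi: list of topic lists, one per syllabus.
    core_frac / common_frac are illustrative cutoffs, not standard values.
    """
    # Count each topic at most once per syllabus, even if it recurs within one.
    counts = Counter(t for s in syllabi for t in set(s))
    n = len(syllabi)
    tiers = {"core": [], "common": [], "specialized": []}
    for topic, c in sorted(counts.items()):
        frac = c / n
        if frac >= core_frac:
            tiers["core"].append(topic)        # universally expected
        elif frac >= common_frac:
            tiers["common"].append(topic)      # likely but not guaranteed
        else:
            tiers["specialized"].append(topic) # course emphasis or advanced
    return tiers
```

The discard rules (institution-specific framing, prerequisites, off-level topics) still need judgment; a count alone can't apply them.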
### Step 4: Validate with standard references
If a textbook is clearly associated with this subject, check its table of contents as a sanity check. A topic in every major textbook is definitively "core." A topic in no major textbook is either niche or misidentified.
---
## Mode B fallback (no web access)
If web search is unavailable, use world knowledge. Apply this reasoning:
1. What would a student need to know to be considered "competent" in this field after one university course?
2. What concepts appear in every introductory treatment of this subject?
3. What does the course likely build toward? (What's the "final boss" concept of the curriculum?)
State explicitly: "No web search available — outline based on standard curriculum knowledge. Please verify against your actual course syllabus."
For well-established STEM subjects (mathematics, physics, chemistry, biology, standard engineering disciplines, core CS courses), world knowledge is reliable enough to generate a solid outline.
For humanities, social sciences, or niche technical subjects, add a stronger caveat and encourage the user to provide a reference syllabus or textbook.
---
## Phase 2 gap analysis application
After identifying the standard coverage for the subject, compare against Phase 1 extractions:
| Gap type | Action |
|---|---|
| Core topic entirely missing | High-priority Phase 3 target; flag prominently in synthesis |
| Core topic covered superficially | Medium-priority; note the gap and depth needed |
| Common topic missing | Low-priority; mention as "this course does not cover X, which standard curricula often include" |
| Specialized topic missing | Note only if student's Phase 0 emphasis indicated interest in this area |
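The table above can be read as a small decision rule. A hedged sketch, assuming a three-tier topic classification and a coarse depth label (the function name and labels are hypothetical, chosen for illustration):

```python
def gap_priority(topic_tier, depth, emphasized=False):
    """Map a curriculum gap to a follow-up priority, mirroring the table above.

    topic_tier: "core" | "common" | "specialized"
    depth:      "missing" | "superficial" | "adequate"
    emphasized: whether the student's Phase 0 emphasis covers this area.
    Returns a priority label, or None when no gap should be reported.
    """
    if depth == "adequate":
        return None                       # covered — nothing to flag
    if topic_tier == "core":
        return "high" if depth == "missing" else "medium"
    if topic_tier == "common" and depth == "missing":
        return "low"                      # mention, don't dwell
    if topic_tier == "specialized" and emphasized:
        return "note"                     # only if the student cares
    return None
```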
**Always be specific about the gap:** don't just say "thermodynamics is missing" — say "the course covers reaction enthalpy (Lecture 3) but does not treat entropy or Gibbs free energy, which are core to understanding reaction spontaneity in standard physical chemistry curricula."
---
## Note on CS/engineering subjects
For the 8 most common CS/engineering courses (OS, Networks, DSA, Databases, Computer Architecture, ML/DL, Distributed Systems, Compilers), web search is still the preferred approach. World knowledge fallback is reliable for these subjects given their stable, well-documented curricula.
```
### rules/phase-expand.md
```md
# Phase 3: External Knowledge Expansion
## Goal
Expand the knowledge base **beyond the course boundary** with precisely sourced external knowledge. The result should make the student's understanding deeper and broader than the course alone provides.
## Step 0: Check operating mode
This phase operates in one of two modes, determined in Phase 0:
**Mode A — Web-enabled (WebSearch + WebFetch available)**
Run the full three-layer search strategy below.
**Mode B — Curriculum-grounded (no web access)**
Do NOT search the web. Instead, expand using stable, well-established knowledge from standard CS/engineering curricula. Every claim must be marked `[Standard curriculum knowledge]`. Do not invent URLs, paper titles, author names, or implementation details. If you are uncertain about a claim, omit it or flag it as `[Uncertain — verify before exam]`. The goal is to avoid hallucination causing review distortion, not to produce maximum volume.
If Mode B was chosen but the user now wants to enable web access, stop and ask them to grant WebSearch permission before continuing.
---
## Expansion targets (both modes)
Process expansion targets in this priority order. **Hard cap: maximum 15 targets per course.** If there are more, present a ranked list and ask the user to confirm which to include.
1. **[EXPAND] markers** from Phase 1 — concepts explicitly flagged as needing external investigation.
2. **Gaps and shallow topics** from Phase 2 — concepts the course covers superficially or misses.
3. **Core concepts** — central concepts that benefit from real-world grounding.
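The priority order plus the hard cap amounts to a stable sort and a slice. A minimal sketch (the origin labels are hypothetical names for the three buckets above):

```python
MAX_TARGETS = 15  # hard cap from the rule above

# Lower rank = processed first, matching the priority order above.
ORIGIN_RANK = {"expand_marker": 0, "phase2_gap": 1, "core_concept": 2}

def select_targets(targets):
    """Order expansion targets by origin priority and apply the hard cap.

    targets: list of (name, origin) pairs.
    Returns (selected, overflow); overflow is what the user must confirm.
    """
    ranked = sorted(targets, key=lambda t: ORIGIN_RANK[t[1]])  # stable sort
    return ranked[:MAX_TARGETS], ranked[MAX_TARGETS:]
```

Because the sort is stable, ties within a bucket keep their original order, so the overflow list shown to the user is predictable.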
---
## Mode A: Web-enabled search strategy
For each target, use up to three layers. Stop when you have sufficient quality material.
### Layer 1: Industry & practical knowledge
Search for specific, deep content:
- Official documentation (Linux kernel docs, RFC, MDN, etc.)
- Engineering blogs from reputable companies (Google AI Blog, AWS Architecture, etc.)
- GitHub implementations (link to specific file/function, not repo root)
- Stack Overflow canonical answers for common confusions
Query tips: pair the formal concept name with "implementation", "in practice", or "architecture". Avoid generic "what is X" queries.
### Layer 2: Academic literature
Search for:
- The original paper introducing the concept
- Recent surveys
- Papers that extend or challenge the course's treatment
For each paper: title, authors, arXiv ID or DOI, year, and 1-3 sentences on its relevance.
### Layer 3: Cross-verification
When multiple sources are found, note where they agree (builds confidence) and where they disagree (flags nuance). If the course material contradicts an authoritative external source, flag this explicitly.
---
## Mode B: Curriculum-grounded expansion
For each target concept:
1. Describe the standard treatment of this concept in the field (textbook-level knowledge).
2. Identify what the course's treatment adds, simplifies, or omits compared to the standard.
3. Add the "bigger picture" context: where does this concept fit in the field? What problems does it address?
4. Note common misconceptions that standard curricula address.
All content marked `[Standard curriculum knowledge]`. No fictional sources.
---
## Output format
Organize by concept, not by search layer:
```markdown
# Course Expansion: [Course Name]
## [Concept Name]
**Source in course:** [Lecture X, pp. Y-Z | Section X.Y] — [brief note on course treatment]
**Why expand:** [EXPAND marker / gap from synthesis / core concept]
### What the course doesn't cover
[The specific gap this expansion addresses.]
### Expanded understanding
[Mode A: what industry or research adds. Mode B: standard curriculum context.]
**Sources:**
- [Mode A] [Page Title](URL) — one-line note
- [Mode B] [Standard curriculum knowledge — topic area]
### Key insight
[2-4 sentences: what the student gains here that the slides don't provide.]
### Cross-verification note *(Mode A only)*
[Agreement/disagreement across sources, if applicable.]
---
```
## Source citation rules
**Mode A:** Every factual claim needs a traceable source. Acceptable forms:
- Web: `[Title](full URL)`
- Paper: `Author et al., "Title" (Year). arXiv:ID` or `DOI:xxx`
- Docs: `[Doc Section](URL)` with version
- Fallback: `[Standard curriculum knowledge — topic area]` — use sparingly
**Never cite a source you haven't actually retrieved.** If search fails, write: "No high-quality external source found for [concept]; expansion based on course material and general domain knowledge."
**Mode B:** All claims `[Standard curriculum knowledge]`. No citations to sources not in hand.
## Depth calibration
| Concept | Mode A depth | Mode B depth |
|---|---|---|
| Core + [EXPAND] marked | All 3 layers, 300-500 words | 200-300 words, curriculum context |
| Core, well-covered | Layer 1 only, 100-200 words | 100 words, brief context |
| Gap from Phase 2 | Layers 1-2, 200-300 words | 150 words, what standard treatment adds |
| Minor/tangential | Skip or 1-2 sentences | Skip |
## Anti-patterns
- **DO NOT** dump raw search results without analysis.
- **DO NOT** expand more than 15 concepts — prioritize by value.
- **DO NOT** cite sources you haven't retrieved (Mode A) or invent sources (Mode B).
- **DO NOT** let this phase substitute for understanding the course — it enriches, it doesn't replace.
- **DO NOT** include outdated content without flagging it as historical.
- **DO NOT** (Mode B) state uncertain claims confidently. If unsure, omit or flag.
```
### rules/phase-intake.md
```md
# phase-intake — Phase 0: Intake Workflow
Run all steps below before starting any phase. **Keep this fast** — intake should take one exchange, not five.
---
## Step 1: Accept input materials
**Supported input: PDF only.** All course materials must be in PDF format.
- If the user has PPT/PPTX, DOCX, or images: use the `/pdf` skill to convert first, then return here.
- Do NOT attempt to read non-PDF files directly.
Ask the user in a single message:
> What do you have?
> 1. **PDF files** — upload them and we start Phase 1.
> 2. **A topic/concept list** — paste it and we skip to Phase 2.
> 3. **Just a course name** — tell me and I'll search for a standard syllabus first.
>
> Also:
> - How many pages total (rough estimate is fine)?
> - Preferred output language?
> - **Exam date or deadline?** (if any — affects depth and appendix generation)
> - **Any specific topics to prioritise?** (e.g. "focus on Ch3-5" or "skip the history sections")
---
## Step 2: Estimate scale and pick compression tier
Count total pages from the user's answer. Apply immediately — do not ask again:
| Total pages | Tier | Strategy |
|---|---|---|
| ≤ 60 | Light | Full extraction per page |
| 61–200 | Medium | Batch by lecture; summarize per lecture |
| 201–400 | Heavy | Batch + compress Phase 1 before Phase 2 |
| > 400 | Split | Warn user; recommend per-module runs |
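The tier choice is a simple threshold function over the page estimate. A sketch with the boundaries taken from the table above (function name is illustrative):

```python
def pick_tier(total_pages):
    """Map a rough page count to a compression tier (thresholds from the table above)."""
    if total_pages <= 60:
        return "Light"   # full extraction per page
    if total_pages <= 200:
        return "Medium"  # batch by lecture; summarize per lecture
    if total_pages <= 400:
        return "Heavy"   # batch + compress Phase 1 before Phase 2
    return "Split"       # warn user; recommend per-module runs
```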
---
## Step 3: Detect network and set mode
Detect WebSearch availability silently. Tell the user in one line:
- ✓ Web access available → Phase 3 will use live sources
- ✗ No web access → Phase 3 will use curriculum knowledge only (no hallucinated sources)
---
## Step 4: Determine output package in the same message
Ask in a single message (never two separate prompts):
> **Output package:**
> - **Standard** — comprehensive study notes (Phases 1–4)
> - **Exam Ready** — standard + Quick Reference Sheet + Exam Q&A Appendix
>
> **Output folder:** where to save files? (default: current directory)
**Exam Ready** is automatically recommended if the user mentioned an exam date or deadline.
---
## Step 5: Confirm and go
One-line summary, then start immediately on user confirmation. Do not ask further questions before Phase 1.
```
### rules/phase-synthesize.md
```md
# Phase 2: Cross-Lecture Deep Synthesis
## Goal
Build a **deep conceptual structure** of the entire course by connecting knowledge across all lectures/sections. This is NOT a simple merge — it requires applying world knowledge, revealing relationships the slides only imply, and identifying what the course doesn't teach.
## Source references
Throughout this phase, use the backbone reference format established in Phase 1:
- PDF input: `Lecture X, pp. Y-Z`
- Topic list / syllabus input: `Section X.Y`
Never reference source locations that don't exist in Phase 1 outputs.
## Procedure
### Step 1: Build the concept inventory
Read all `lecture-XX-extract.md` files using the Read tool (these are Markdown files, not PDFs — no `/pdf` skill needed here). For each concept, record:
- First appearance (source location)
- All subsequent mentions (source locations)
- Type: definition / theorem / algorithm / design pattern / tool
**For Medium/Heavy tier courses (> 60 pages):** work in batches of 3-4 lectures at a time, building a running inventory. Do not hold all lectures in context simultaneously.
### Step 2: Dependency analysis
For each concept, determine:
- **Hard prerequisites:** must be understood first (e.g., gradient descent before backpropagation)
- **Soft relations:** related but not prerequisite (e.g., TCP and UDP are peers)
- **Hierarchy:** is this an instance of a more general concept? (e.g., quicksort ⊂ divide-and-conquer)
Use world knowledge to infer dependencies the slides don't state explicitly. If a concept is used before it's defined, flag it as a gap.
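The use-before-define check above can be sketched as a single pass over the lectures in order, carrying a running set of defined concepts (the data shape here is a hypothetical simplification of the Phase 1 extracts):

```python
def flag_use_before_define(lectures):
    """Flag concepts used before any lecture defines them.

    lectures: ordered list of (defined, used) concept-set pairs, one per lecture.
    Returns a list of (concept, lecture_number) gaps to report in Step 4.
    """
    defined = set()
    gaps = []
    for i, (defs, uses) in enumerate(lectures, start=1):
        # A concept defined in the same lecture it is used does not count as a gap.
        for concept in sorted(uses - defined - defs):
            gaps.append((concept, i))
        defined |= defs
    return gaps
```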
### Step 3: Identify the course narrative
Every well-designed course tells a story. Identify:
- **Main thread:** the central question the course builds toward (e.g., "How does an OS manage shared resources safely?")
- **Supporting threads:** recurring themes (e.g., "performance vs. safety tradeoffs", "layered abstraction")
- **Progression pattern:** linear? spiral? breadth-first then depth?
### Step 4: Fill conceptual gaps
Find the standard curriculum baseline for this subject using the search strategy in [subject-coverage.md](subject-coverage.md). Run 1-2 targeted searches (e.g., `"[subject] university course syllabus topics"`) to identify what a standard course in this field covers. This works for any subject — not just CS.
If these searches return low-quality results (no syllabi, no textbook tables of contents, no course pages), fall back to Mode B (world knowledge) and mark the gap analysis: `[Gap analysis based on curriculum knowledge — web results insufficient]`. Do not retry the same queries or invent sources.
Compare the actual course materials against this baseline. Identify:
- **Core topics missing from the course** → high-priority flag for Phase 3
- Topics covered **superficially** → medium-priority, candidates for Phase 3 expansion
- **Simplified models** that gloss over important nuance → note the gap
For each gap: "The slides present [X] as [simplified]. The full picture includes [missing piece]. This matters because [why]."
### Step 5: Produce the synthesis document
```markdown
# Course Synthesis: [Course Name]
## Course narrative
[2-3 paragraphs: what story this course tells, the central question it answers, how it builds]
## Knowledge architecture
[Organized by theme/module — NOT by lecture/section order]
### Module: [Theme Name]
**Core concepts:** [list]
**Dependency chain:** A → B → C (brief explanation of each link)
**Key insight:** [the "aha" that connects these concepts]
**Covered in:** [source locations]
[Repeat for each module]
## Cross-cutting patterns
[Tradeoffs, design principles, recurring proof techniques that appear across modules]
## Concept dependency graph
[Text representation: which concepts depend on which]
## Gaps and flags
[Topics that are shallow, missing, or oversimplified — with source locations and notes for Phase 3]
## Ambiguities and contradictions
[Inconsistencies between lectures, or places where the material is unclear]
```
## Depth expectations
For a 12-lecture course, synthesis should be substantial but not padded. Target: every module section has a dependency chain and a key insight. Skip the narrative fluff — the value is in the connections and gaps, not word count.
The test: could a student read this and understand the *architecture* of the subject — what depends on what, and what the course missed?
## Anti-patterns
- **DO NOT** concatenate lecture summaries — that's Phase 1, not synthesis.
- **DO NOT** limit yourself to what's literally in the slides — use world knowledge for the "why."
- **DO NOT** produce a flat list of concepts — the value is in the relationships.
- **DO NOT** skip gap analysis — identifying what the course doesn't teach is as important as what it does.
- **DO NOT** reference source locations that don't appear in Phase 1 outputs.
```