docx-generator
Create and manipulate Word DOCX files programmatically. Use when the user needs to generate documents, modify DOCX templates, extract document content, or automate Word document workflows. Supports both template-based generation (for branding compliance) and from-scratch creation. Keywords: Word, DOCX, document, report, template, contract, letter, corporate, branding.
Install command
npx @skill-hub/cli install jwynia-agent-skills-docx-generator
Repository
Skill path: skills/document-processing/word/docx-generator
Best for
Primary workflow: Write Technical Docs.
Technical facets: Full Stack, Tech Writer.
Target audience: everyone.
License: MIT.
Original source
Catalog source: SkillHub Club.
Repository owner: jwynia.
This is still a mirrored public skill entry. Review the repository before installing into production workflows.
What it helps with
- Install docx-generator into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/jwynia/agent-skills before adding docx-generator to shared team environments
- Use docx-generator for development workflows
Original source / Raw SKILL.md
---
name: docx-generator
description: "Create and manipulate Word DOCX files programmatically. Use when the user needs to generate documents, modify DOCX templates, extract document content, or automate Word document workflows. Supports both template-based generation (for branding compliance) and from-scratch creation. Keywords: Word, DOCX, document, report, template, contract, letter, corporate, branding."
license: MIT
compatibility: Requires Deno with --allow-read, --allow-write permissions
metadata:
author: agent-skills
version: "1.0"
---
# DOCX Generator
## When to Use This Skill
Use this skill when:
- Creating Word documents programmatically from data or specifications
- Populating branded templates with dynamic content while preserving corporate styling
- Extracting text, tables, and structure from existing DOCX files for analysis
- Finding and replacing placeholder text like `{{TITLE}}` or `${author}`
- Automating document generation workflows (reports, contracts, letters)
Do NOT use this skill when:
- The user only wants to open or view documents (use Word or a viewer)
- The task is a complex mail merge with external data sources (use Word's built-in mail merge)
- The file is in the legacy .doc format (this skill handles DOCX only)
- PDF output is needed (use the pdf-generator skill instead)
## Prerequisites
- Deno installed (https://deno.land/)
- Input DOCX files for template-based operations
- JSON specification for scratch generation
## Quick Start
### Two Modes of Operation
1. **Template Mode**: Modify existing branded templates
   - Analyze template to find placeholders
   - Replace `{{PLACEHOLDERS}}` with actual content
2. **Scratch Mode**: Create documents from nothing using JSON specifications
## Instructions
### Mode 1: Template-Based Generation
#### Step 1a: Analyze the Template
Extract a text inventory to see what can be replaced:
```bash
deno run --allow-read scripts/analyze-template.ts corporate-template.docx > inventory.json
```
**Output** (inventory.json, abbreviated; the full inventory also includes `tables`, `headersFooters`, and `styles`):
```json
{
"filename": "corporate-template.docx",
"paragraphCount": 25,
"tableCount": 2,
"imageCount": 1,
"paragraphs": [
{
"index": 0,
"style": "Title",
"fullText": "{{DOCUMENT_TITLE}}",
"runs": [
{ "text": "{{DOCUMENT_TITLE}}", "bold": true, "fontSize": 28 }
]
}
],
"placeholders": [
{ "tag": "{{DOCUMENT_TITLE}}", "location": "paragraph", "paragraphIndex": 0 },
{ "tag": "{{AUTHOR}}", "location": "footer-default", "paragraphIndex": 0 },
{ "tag": "${date}", "location": "table", "tableIndex": 0, "cellLocation": "R1C2" }
]
}
```
#### Step 1b: Create Replacement Specification
Create `replacements.json`:
```json
{
"textReplacements": [
{ "tag": "{{DOCUMENT_TITLE}}", "value": "Q4 2024 Financial Report" },
{ "tag": "{{AUTHOR}}", "value": "Finance Department" },
{ "tag": "${date}", "value": "December 15, 2024" },
{ "tag": "{{COMPANY}}", "value": "Acme Corporation" }
],
"includeHeaders": true,
"includeFooters": true
}
```
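A replacement specification can also be drafted from the inventory instead of typed by hand. The sketch below is a minimal illustration, assuming the `placeholders` array has the shape shown in the inventory example; `buildReplacementSpec` and its `values` parameter are hypothetical names and not part of the skill's scripts.

```typescript
// Minimal shapes mirroring the inventory.json and replacements.json examples.
interface PlaceholderEntry {
  tag: string;
  location: string; // e.g. "paragraph", "table", "footer-default"
}

interface ReplacementSpec {
  textReplacements: { tag: string; value: string }[];
  includeHeaders: boolean;
  includeFooters: boolean;
}

// Hypothetical helper, not part of the skill's scripts.
function buildReplacementSpec(
  placeholders: PlaceholderEntry[],
  values: Record<string, string> = {},
): ReplacementSpec {
  // The inventory lists a tag once per occurrence; the spec needs it once.
  const uniqueTags = Array.from(new Set(placeholders.map((p) => p.tag)));
  return {
    // Tags without a known value get an empty string so they are easy to spot.
    textReplacements: uniqueTags.map((tag) => ({ tag, value: values[tag] ?? "" })),
    includeHeaders: true,
    includeFooters: true,
  };
}

const draftSpec = buildReplacementSpec(
  [
    { tag: "{{DOCUMENT_TITLE}}", location: "paragraph" },
    { tag: "{{AUTHOR}}", location: "footer-default" },
    { tag: "${date}", location: "table" },
  ],
  { "{{DOCUMENT_TITLE}}": "Q4 2024 Financial Report" },
);
console.log(JSON.stringify(draftSpec, null, 2));
```

Entries that still have an empty `value` are the ones left to fill in before running the generator.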
#### Step 1c: Generate Output
```bash
deno run --allow-read --allow-write scripts/generate-from-template.ts \
corporate-template.docx replacements.json output.docx
```
### Mode 2: From-Scratch Generation
#### Step 2a: Create Specification
Create `spec.json`:
```json
{
"title": "Quarterly Report",
"creator": "Finance Team",
"styles": {
"defaultFont": "Calibri",
"defaultFontSize": 11
},
"sections": [
{
"header": {
"paragraphs": [
{ "text": "Acme Corporation", "alignment": "right" }
]
},
"footer": {
"paragraphs": [
{ "text": "Confidential", "alignment": "center" }
]
},
"content": [
{
"text": "Q4 2024 Financial Report",
"heading": 1,
"alignment": "center"
},
{
"runs": [
{ "text": "Executive Summary: ", "bold": true },
{ "text": "This report provides an overview of our financial performance for Q4 2024." }
]
},
{ "pageBreak": true },
{
"text": "Revenue Breakdown",
"heading": 2
},
{
"rows": [
{
"cells": [
{ "content": [{ "text": "Category" }], "shading": "DDDDDD" },
{ "content": [{ "text": "Amount" }], "shading": "DDDDDD" },
{ "content": [{ "text": "Change" }], "shading": "DDDDDD" }
],
"isHeader": true
},
{
"cells": [
{ "content": [{ "text": "Product Sales" }] },
{ "content": [{ "text": "$1,250,000" }] },
{ "content": [{ "text": "+15%" }] }
]
},
{
"cells": [
{ "content": [{ "text": "Services" }] },
{ "content": [{ "text": "$750,000" }] },
{ "content": [{ "text": "+8%" }] }
]
}
],
"width": 100,
"borders": true
}
]
}
]
}
```
#### Step 2b: Generate Document
```bash
deno run --allow-read --allow-write scripts/generate-scratch.ts spec.json output.docx
```
## Examples
### Example 1: Contract Generation
**Scenario**: Generate contracts from a branded template.
**Steps**:
```bash
# 1. Analyze template for replaceable content
deno run --allow-read scripts/analyze-template.ts contract-template.docx --pretty
# 2. Create replacements.json with client data
# 3. Generate contract
deno run --allow-read --allow-write scripts/generate-from-template.ts \
contract-template.docx replacements.json acme-contract.docx
```
### Example 2: Report with Tables
**Scenario**: Generate a data report with tables and formatting.
**spec.json**:
```json
{
"title": "Sales Report",
"sections": [{
"content": [
{ "text": "Monthly Sales Report", "heading": 1 },
{ "text": "January 2025", "heading": 2 },
{
"runs": [
{ "text": "Total Sales: ", "bold": true },
{ "text": "$125,000", "color": "2E7D32" }
]
}
]
}]
}
```
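When the figures come from data rather than a hand-written file, the same spec can be assembled in code and serialized to `spec.json`. This is only a sketch under assumptions: the type names, `formatUsd`, and `buildSalesReportSpec` are hypothetical, and the types cover just the properties used in the example above.

```typescript
// Only the properties used in the example spec are modeled here.
interface RunSpec { text: string; bold?: boolean; color?: string }
type ContentBlock = { text: string; heading?: number } | { runs: RunSpec[] };
interface ReportSpec { title: string; sections: { content: ContentBlock[] }[] }

// Formats 125000 as "$125,000" without relying on locale data.
function formatUsd(amount: number): string {
  return "$" + Math.round(amount).toString().replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

// Hypothetical builder that reproduces the structure of the example spec.
function buildSalesReportSpec(period: string, totalSales: number): ReportSpec {
  return {
    title: "Sales Report",
    sections: [{
      content: [
        { text: "Monthly Sales Report", heading: 1 },
        { text: period, heading: 2 },
        {
          runs: [
            { text: "Total Sales: ", bold: true },
            { text: formatUsd(totalSales), color: "2E7D32" },
          ],
        },
      ],
    }],
  };
}

// Write this string to spec.json before running generate-scratch.ts.
console.log(JSON.stringify(buildSalesReportSpec("January 2025", 125000), null, 2));
```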
### Example 3: Letter with Headers/Footers
**Scenario**: Create a formal letter with letterhead.
**spec.json**:
```json
{
"sections": [{
"header": {
"paragraphs": [
{ "text": "ACME CORPORATION", "alignment": "center", "runs": [{"text": "ACME CORPORATION", "bold": true, "fontSize": 16}] },
{ "text": "123 Business Ave, City, ST 12345", "alignment": "center" }
]
},
"content": [
{ "text": "December 15, 2024", "alignment": "right" },
{ "text": "" },
{ "text": "Dear Valued Customer," },
{ "text": "" },
{ "text": "Thank you for your continued business..." },
{ "text": "" },
{ "text": "Sincerely," },
{ "text": "John Smith" },
{ "runs": [{ "text": "CEO", "italic": true }] }
],
"footer": {
"paragraphs": [
{ "text": "www.acme.com | [email protected]", "alignment": "center" }
]
}
}]
}
```
## Script Reference
| Script | Purpose | Permissions |
|--------|---------|-------------|
| `analyze-template.ts` | Extract text, tables, placeholders from DOCX | `--allow-read` |
| `generate-from-template.ts` | Replace placeholders in templates | `--allow-read --allow-write` |
| `generate-scratch.ts` | Create DOCX from JSON specification | `--allow-read --allow-write` |
## Specification Reference
### Paragraph Options
| Property | Type | Description |
|----------|------|-------------|
| `text` | string | Simple text content |
| `runs` | array | Formatted text runs (for mixed formatting) |
| `heading` | 1-6 | Heading level |
| `alignment` | string | `left`, `center`, `right`, `justify` |
| `bullet` | boolean | Bulleted list item |
| `numbering` | boolean | Numbered list item |
| `spacing` | object | `before`, `after`, `line` spacing |
| `indent` | object | `left`, `right`, `firstLine` indentation |
| `pageBreakBefore` | boolean | Insert page break before paragraph |
### Text Run Options
| Property | Type | Description |
|----------|------|-------------|
| `text` | string | Text content |
| `bold` | boolean | Bold formatting |
| `italic` | boolean | Italic formatting |
| `underline` | boolean | Underline formatting |
| `strike` | boolean | Strikethrough |
| `fontSize` | number | Font size in points |
| `font` | string | Font family name |
| `color` | string | Text color (hex, no #) |
| `highlight` | string | Highlight color |
| `superScript` | boolean | Superscript |
| `subScript` | boolean | Subscript |
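Colors copied from CSS or brand guidelines usually carry a leading `#`, which the `color` property does not expect. The helper below is a small sketch for normalizing such values; `toSpecColor` is a hypothetical name, and uppercasing the result only mirrors the examples in this document (whether the generator is case-sensitive is not stated).

```typescript
// Hypothetical helper: accepts "#2e7d32", "2E7D32", or shorthand "#abc"
// and returns six hex digits without the leading "#".
function toSpecColor(input: string): string {
  const hex = input.trim().replace(/^#/, "");
  // Expand CSS shorthand such as "abc" to "aabbcc".
  const expanded = hex.length === 3
    ? hex.split("").map((c) => c + c).join("")
    : hex;
  if (!/^[0-9a-fA-F]{6}$/.test(expanded)) {
    throw new Error(`Not a hex color: ${input}`);
  }
  return expanded.toUpperCase();
}

// A text run using the normalized color.
const highlightedRun = { text: "$125,000", bold: true, color: toSpecColor("#2e7d32") };
console.log(JSON.stringify(highlightedRun));
```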
### Table Options
| Property | Type | Description |
|----------|------|-------------|
| `rows` | array | Array of row specifications |
| `width` | number | Table width as percentage |
| `borders` | boolean | Show table borders |
### Hyperlink Options
| Property | Type | Description |
|----------|------|-------------|
| `text` | string | Link text |
| `url` | string | Target URL |
| `bold` | boolean | Bold formatting |
| `italic` | boolean | Italic formatting |
## Common Issues and Solutions
### Issue: Placeholders not being replaced
**Symptoms**: Output DOCX still contains `{{PLACEHOLDER}}` tags.
**Solution**:
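1. Run `analyze-template.ts` to verify the exact tag text; the `tag` in `replacements.json` must match it character for character
2. Tags may be split across XML runs; the script consolidates the runs of each paragraph before replacing, so this case is handled automatically
3. Make sure `includeHeaders` and `includeFooters` are not set to `false` if placeholders live there (both default to true). Only `word/header1.xml` to `header3.xml` and `word/footer1.xml` to `footer3.xml` are processed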
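To see why split runs matter, the sketch below models a paragraph as a plain array of run texts and applies the same merge rule as `consolidateTextRuns` in `generate-from-template.ts`. It works on strings only and does not touch real DOCX XML; `consolidateRunTexts` is a hypothetical stand-in for illustration.

```typescript
// Mirrors the check in consolidateTextRuns: merge only when the paragraph's
// combined text contains placeholder delimiters and there are several runs.
function consolidateRunTexts(runTexts: string[]): string[] {
  const joined = runTexts.join("");
  const hasDelimiters =
    (joined.includes("{{") && joined.includes("}}")) ||
    (joined.includes("${") && joined.includes("}"));
  if (!hasDelimiters || runTexts.length <= 1) return runTexts.slice();
  // All text moves into the first run; the remaining runs are emptied.
  return [joined, ...runTexts.slice(1).map(() => "")];
}

// A paragraph as Word might store it after the tag was edited in place.
const splitRuns = ["Report: {{", "TITLE", "}} (draft)"];
const wantedTag = "{{TITLE}}";

// No single run holds the complete tag, so a per-run search misses it.
console.log(splitRuns.some((t) => t.includes(wantedTag)));
// After merging, the first run contains the whole tag.
console.log(consolidateRunTexts(splitRuns)[0].includes(wantedTag));
```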
### Issue: Formatting lost after replacement
**Symptoms**: Replaced text doesn't match original formatting.
**Solution**:
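- Replaced text takes on the formatting of the run that holds the placeholder, so format the placeholder the way the final text should appear
- In a paragraph that contains a placeholder and more than one run, the script merges all text into the first run before replacing; mixed formatting inside that paragraph (for example, a bold word in the middle) is lost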
### Issue: Images not appearing
**Symptoms**: Image elements are blank in output.
**Solution**:
1. Use paths relative to the spec.json file location
2. Verify image file exists and is readable
3. Check supported formats: PNG, JPEG, GIF
### Issue: Table cell content incorrect
**Symptoms**: Table cells have wrong content or formatting.
**Solution**:
- Each cell's `content` must be an array of paragraph specifications
- Use `shading` for background color, `verticalAlign` for alignment
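The nesting is easy to get wrong by hand, so a small builder can help. The sketch below produces the `rows` / `cells` / `content` structure used in the Mode 2 example from plain string arrays; the type names and `buildTable` are hypothetical, and only properties that appear in this document are used.

```typescript
interface ParagraphSpec { text: string }
interface CellSpec { content: ParagraphSpec[]; shading?: string }
interface RowSpec { cells: CellSpec[]; isHeader?: boolean }
interface TableSpec { rows: RowSpec[]; width: number; borders: boolean }

// Hypothetical helper: wraps every cell value in a one-paragraph content array.
function buildTable(
  headers: string[],
  dataRows: string[][],
  headerShading = "DDDDDD",
): TableSpec {
  const headerRow: RowSpec = {
    cells: headers.map((h) => ({ content: [{ text: h }], shading: headerShading })),
    isHeader: true,
  };
  const bodyRows: RowSpec[] = dataRows.map((row) => ({
    cells: row.map((value) => ({ content: [{ text: value }] })),
  }));
  return { rows: [headerRow, ...bodyRows], width: 100, borders: true };
}

const revenueTable = buildTable(
  ["Category", "Amount", "Change"],
  [
    ["Product Sales", "$1,250,000", "+15%"],
    ["Services", "$750,000", "+8%"],
  ],
);
console.log(JSON.stringify(revenueTable, null, 2));
```

The resulting object can be placed directly in a section's `content` array.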
## Limitations
- **DOCX only**: Does not support legacy .doc format
- **No track changes**: Cannot add or process track changes
- **No comments**: Cannot add document comments
- **No macros**: Cannot include VBA macros
- **Basic numbering**: Limited support for complex numbering schemes
- **Text run splitting**: Word may split text across XML elements; script handles common cases
## Related Skills
- **pptx-generator**: For creating PowerPoint presentations
- **xlsx-generator**: For creating Excel spreadsheets
- **pdf-generator**: For creating PDF documents
---
## Referenced Files
> The following files are referenced in this skill and included for context.
### scripts/analyze-template.ts
```typescript
#!/usr/bin/env -S deno run --allow-read
/**
* analyze-template.ts - Extract text content and structure from DOCX files
*
* Extracts paragraphs, tables, headers, footers, and placeholders from Word documents
* for template analysis and content replacement planning.
*
* Usage:
* deno run --allow-read scripts/analyze-template.ts <input.docx> [options]
*
* Options:
* -h, --help Show help
* -v, --verbose Enable verbose output
* --pretty Pretty-print JSON output
*
* Permissions:
* --allow-read: Read DOCX file
*/
// Version pins were lost from these specifiers; pin versions for reproducible runs.
import { parseArgs } from "jsr:@std/cli/parse-args";
import { basename } from "jsr:@std/path";
import JSZip from "npm:jszip";
import { DOMParser } from "npm:@xmldom/xmldom";
// === Types ===
export interface TextRun {
text: string;
bold?: boolean;
italic?: boolean;
underline?: boolean;
fontSize?: number;
fontFamily?: string;
color?: string;
}
export interface ParagraphInfo {
index: number;
style?: string;
alignment?: "left" | "center" | "right" | "justify";
runs: TextRun[];
fullText: string;
isListItem: boolean;
listLevel?: number;
}
export interface TableCellInfo {
row: number;
col: number;
text: string;
paragraphs: ParagraphInfo[];
}
export interface TableInfo {
index: number;
rows: number;
cols: number;
cells: TableCellInfo[];
}
export interface HeaderFooterInfo {
type: "header" | "footer";
section: "default" | "first" | "even";
paragraphs: ParagraphInfo[];
}
export interface PlaceholderInfo {
tag: string;
location: string;
paragraphIndex?: number;
tableIndex?: number;
cellLocation?: string;
}
export interface DocumentInventory {
filename: string;
paragraphCount: number;
tableCount: number;
imageCount: number;
paragraphs: ParagraphInfo[];
tables: TableInfo[];
headersFooters: HeaderFooterInfo[];
placeholders: PlaceholderInfo[];
styles: string[];
}
interface ParsedArgs {
help: boolean;
verbose: boolean;
pretty: boolean;
_: (string | number)[];
}
// === Constants ===
const VERSION = "1.0.0";
const SCRIPT_NAME = "analyze-template";
const NS = {
w: "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
r: "http://schemas.openxmlformats.org/officeDocument/2006/relationships",
};
// Placeholder pattern: {{PLACEHOLDER}} or ${placeholder}
const PLACEHOLDER_REGEX = /\{\{([^}]+)\}\}|\$\{([^}]+)\}/g;
// === Help Text ===
function printHelp(): void {
console.log(`
${SCRIPT_NAME} v${VERSION} - Extract text content from DOCX templates
Usage:
deno run --allow-read scripts/${SCRIPT_NAME}.ts <input.docx> [options]
Arguments:
<input.docx> Path to the Word document to analyze
Options:
-h, --help Show this help message
-v, --verbose Enable verbose output (to stderr)
--pretty Pretty-print JSON output (default: compact)
Examples:
# Analyze document
deno run --allow-read scripts/${SCRIPT_NAME}.ts template.docx > inventory.json
# With verbose output
deno run --allow-read scripts/${SCRIPT_NAME}.ts template.docx -v --pretty
`);
}
// === Utility Functions ===
// deno-lint-ignore no-explicit-any
function getElementsByTagNameNS(parent: any, ns: string, localName: string): any[] {
const elements = parent.getElementsByTagNameNS(ns, localName);
// deno-lint-ignore no-explicit-any
return Array.from(elements) as any[];
}
// deno-lint-ignore no-explicit-any
function getTextContent(element: any): string {
const textElements = getElementsByTagNameNS(element, NS.w, "t");
return textElements.map((t) => t.textContent || "").join("");
}
// === Parsing Functions ===
// deno-lint-ignore no-explicit-any
function parseTextRun(rElement: any): TextRun {
const run: TextRun = {
text: getTextContent(rElement),
};
const rPr = getElementsByTagNameNS(rElement, NS.w, "rPr")[0];
if (rPr) {
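// Toggle properties can be present but switched off (w:val="false" or "0",
// and "none" for underline), so check the value as well as presence.
const isOn = (name: string): boolean => {
const el = getElementsByTagNameNS(rPr, NS.w, name)[0];
if (!el) return false;
const val = el.getAttribute("w:val");
return val !== "false" && val !== "0" && val !== "none";
};
// Bold
if (isOn("b")) run.bold = true;
// Italic
if (isOn("i")) run.italic = true;
// Underline
if (isOn("u")) run.underline = true;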
// Font size
const sz = getElementsByTagNameNS(rPr, NS.w, "sz")[0];
if (sz) {
const val = sz.getAttribute("w:val");
if (val) run.fontSize = parseInt(val, 10) / 2; // Half-points to points
}
// Font family
const rFonts = getElementsByTagNameNS(rPr, NS.w, "rFonts")[0];
if (rFonts) {
run.fontFamily = rFonts.getAttribute("w:ascii") || rFonts.getAttribute("w:hAnsi");
}
// Color
const color = getElementsByTagNameNS(rPr, NS.w, "color")[0];
if (color) {
run.color = color.getAttribute("w:val");
}
}
return run;
}
// deno-lint-ignore no-explicit-any
function parseParagraph(pElement: any, index: number): ParagraphInfo {
const runs = getElementsByTagNameNS(pElement, NS.w, "r").map(parseTextRun);
const fullText = runs.map((r) => r.text).join("");
const para: ParagraphInfo = {
index,
runs,
fullText,
isListItem: false,
};
const pPr = getElementsByTagNameNS(pElement, NS.w, "pPr")[0];
if (pPr) {
// Style
const pStyle = getElementsByTagNameNS(pPr, NS.w, "pStyle")[0];
if (pStyle) {
para.style = pStyle.getAttribute("w:val");
}
// Alignment
const jc = getElementsByTagNameNS(pPr, NS.w, "jc")[0];
if (jc) {
const val = jc.getAttribute("w:val");
if (val === "left" || val === "center" || val === "right" || val === "both") {
para.alignment = val === "both" ? "justify" : val;
}
}
// List item
const numPr = getElementsByTagNameNS(pPr, NS.w, "numPr")[0];
if (numPr) {
para.isListItem = true;
const ilvl = getElementsByTagNameNS(numPr, NS.w, "ilvl")[0];
if (ilvl) {
para.listLevel = parseInt(ilvl.getAttribute("w:val") || "0", 10);
}
}
}
return para;
}
// deno-lint-ignore no-explicit-any
function parseTable(tblElement: any, index: number): TableInfo {
const rows = getElementsByTagNameNS(tblElement, NS.w, "tr");
const cells: TableCellInfo[] = [];
let maxCols = 0;
rows.forEach((row, rowIndex) => {
const tcs = getElementsByTagNameNS(row, NS.w, "tc");
maxCols = Math.max(maxCols, tcs.length);
tcs.forEach((tc, colIndex) => {
const paragraphs = getElementsByTagNameNS(tc, NS.w, "p").map((p, i) =>
parseParagraph(p, i)
);
cells.push({
row: rowIndex,
col: colIndex,
text: paragraphs.map((p) => p.fullText).join("\n"),
paragraphs,
});
});
});
return {
index,
rows: rows.length,
cols: maxCols,
cells,
};
}
function findPlaceholders(
text: string,
location: string,
paragraphIndex?: number,
tableIndex?: number,
cellLocation?: string
): PlaceholderInfo[] {
const placeholders: PlaceholderInfo[] = [];
let match;
while ((match = PLACEHOLDER_REGEX.exec(text)) !== null) {
placeholders.push({
tag: match[0],
location,
paragraphIndex,
tableIndex,
cellLocation,
});
}
return placeholders;
}
// deno-lint-ignore no-explicit-any
async function parseDocumentPart(
zip: JSZip,
partPath: string,
type: "header" | "footer",
section: "default" | "first" | "even"
): Promise<HeaderFooterInfo | null> {
const file = zip.file(partPath);
if (!file) return null;
const xml = await file.async("string");
const parser = new DOMParser();
const doc = parser.parseFromString(xml, "text/xml");
const paragraphs = getElementsByTagNameNS(doc, NS.w, "p").map((p, i) =>
parseParagraph(p, i)
);
return {
type,
section,
paragraphs,
};
}
// === Core Logic ===
export async function analyzeDocument(
docxPath: string,
options: { verbose?: boolean } = {}
): Promise<DocumentInventory> {
const { verbose = false } = options;
// Read the DOCX file
const data = await Deno.readFile(docxPath);
const zip = await JSZip.loadAsync(data);
const filename = basename(docxPath);
// Parse main document
const documentFile = zip.file("word/document.xml");
if (!documentFile) {
throw new Error("Invalid DOCX: missing word/document.xml");
}
const documentXml = await documentFile.async("string");
const parser = new DOMParser();
const doc = parser.parseFromString(documentXml, "text/xml");
// Parse paragraphs
const bodyParagraphs = getElementsByTagNameNS(doc, NS.w, "p");
const paragraphs: ParagraphInfo[] = [];
const allPlaceholders: PlaceholderInfo[] = [];
let paragraphIndex = 0;
for (const p of bodyParagraphs) {
// Skip paragraphs inside tables (they're handled separately)
// deno-lint-ignore no-explicit-any
let parent = (p as any).parentNode;
let inTable = false;
while (parent) {
if (parent.localName === "tc") {
inTable = true;
break;
}
parent = parent.parentNode;
}
if (!inTable) {
const para = parseParagraph(p, paragraphIndex);
paragraphs.push(para);
// Find placeholders
const placeholders = findPlaceholders(
para.fullText,
"paragraph",
paragraphIndex
);
allPlaceholders.push(...placeholders);
paragraphIndex++;
}
}
if (verbose) {
console.error(`Found ${paragraphs.length} paragraphs`);
}
// Parse tables
const tableElements = getElementsByTagNameNS(doc, NS.w, "tbl");
const tables = tableElements.map((tbl, i) => {
const table = parseTable(tbl, i);
// Find placeholders in table cells
for (const cell of table.cells) {
const placeholders = findPlaceholders(
cell.text,
"table",
undefined,
i,
`R${cell.row + 1}C${cell.col + 1}`
);
allPlaceholders.push(...placeholders);
}
return table;
});
if (verbose) {
console.error(`Found ${tables.length} tables`);
}
// Count images
let imageCount = 0;
zip.forEach((path) => {
if (path.startsWith("word/media/")) {
imageCount++;
}
});
if (verbose) {
console.error(`Found ${imageCount} images`);
}
// Parse headers and footers
const headersFooters: HeaderFooterInfo[] = [];
const headerFooterFiles = [
{ path: "word/header1.xml", type: "header" as const, section: "default" as const },
{ path: "word/header2.xml", type: "header" as const, section: "first" as const },
{ path: "word/header3.xml", type: "header" as const, section: "even" as const },
{ path: "word/footer1.xml", type: "footer" as const, section: "default" as const },
{ path: "word/footer2.xml", type: "footer" as const, section: "first" as const },
{ path: "word/footer3.xml", type: "footer" as const, section: "even" as const },
];
for (const { path, type, section } of headerFooterFiles) {
const hf = await parseDocumentPart(zip, path, type, section);
if (hf && hf.paragraphs.length > 0) {
headersFooters.push(hf);
// Find placeholders in headers/footers
for (const para of hf.paragraphs) {
const placeholders = findPlaceholders(
para.fullText,
`${type}-${section}`,
para.index
);
allPlaceholders.push(...placeholders);
}
}
}
// Collect unique styles
const styles = new Set<string>();
for (const para of paragraphs) {
if (para.style) styles.add(para.style);
}
if (verbose) {
console.error(`Found ${allPlaceholders.length} placeholders`);
}
return {
filename,
paragraphCount: paragraphs.length,
tableCount: tables.length,
imageCount,
paragraphs,
tables,
headersFooters,
placeholders: allPlaceholders,
styles: Array.from(styles),
};
}
// === Main CLI Handler ===
async function main(args: string[]): Promise<void> {
const parsed = parseArgs(args, {
boolean: ["help", "verbose", "pretty"],
alias: { help: "h", verbose: "v" },
default: { verbose: false, pretty: false },
}) as ParsedArgs;
if (parsed.help) {
printHelp();
Deno.exit(0);
}
const positionalArgs = parsed._.map(String);
if (positionalArgs.length === 0) {
console.error("Error: No input file provided\n");
printHelp();
Deno.exit(1);
}
const inputPath = positionalArgs[0];
try {
const inventory = await analyzeDocument(inputPath, {
verbose: parsed.verbose,
});
const output = parsed.pretty
? JSON.stringify(inventory, null, 2)
: JSON.stringify(inventory);
console.log(output);
} catch (error) {
console.error(
"Error:",
error instanceof Error ? error.message : String(error)
);
Deno.exit(1);
}
}
// === Entry Point ===
if (import.meta.main) {
main(Deno.args);
}
```
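For experimenting with the placeholder syntax outside a DOCX file, the pattern from `analyze-template.ts` can be exercised on plain strings. The sketch below reuses that pattern; `listPlaceholders`, the `FoundPlaceholder` type, and the "mustache" / "dollar" labels are hypothetical names chosen for illustration.

```typescript
interface FoundPlaceholder {
  tag: string;                  // full tag as written, e.g. "{{TITLE}}"
  name: string;                 // inner name with surrounding spaces trimmed
  style: "mustache" | "dollar"; // {{...}} versus ${...}
}

// Hypothetical helper built on the same pattern as PLACEHOLDER_REGEX.
function listPlaceholders(text: string): FoundPlaceholder[] {
  // A fresh regex per call keeps lastIndex from leaking between calls.
  const pattern = /\{\{([^}]+)\}\}|\$\{([^}]+)\}/g;
  const found: FoundPlaceholder[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    // Group 1 is set for {{...}}, group 2 for ${...}.
    const isMustache = match[1] !== undefined;
    found.push({
      tag: match[0],
      name: (match[1] ?? match[2] ?? "").trim(),
      style: isMustache ? "mustache" : "dollar",
    });
  }
  return found;
}

console.log(listPlaceholders("Dear {{ NAME }}, your invoice is due ${date}."));
```

Note that `tag` keeps any inner whitespace, which matters because the template generator matches tags exactly as written.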
### scripts/generate-from-template.ts
```typescript
#!/usr/bin/env -S deno run --allow-read --allow-write
/**
* generate-from-template.ts - Generate DOCX from existing templates
*
* Modifies existing Word templates using placeholder replacement.
* Finds and replaces tagged content (e.g., {{TITLE}}, ${author}) in
* paragraphs, tables, headers, and footers.
*
* Usage:
* deno run --allow-read --allow-write scripts/generate-from-template.ts <template.docx> <spec.json> <output.docx>
*
* Options:
* -h, --help Show help
* -v, --verbose Enable verbose output
*
* Permissions:
* --allow-read: Read template and specification files
* --allow-write: Write output DOCX file
*/
// Version pins were lost from these specifiers; pin versions for reproducible runs.
import { parseArgs } from "jsr:@std/cli/parse-args";
import { basename, dirname, resolve } from "jsr:@std/path";
import JSZip from "npm:jszip";
import { DOMParser, XMLSerializer } from "npm:@xmldom/xmldom";
// === Types ===
export interface TextReplacement {
/** The tag to find and replace (e.g., "{{TITLE}}" or "${author}") */
tag: string;
/** The replacement text */
value: string;
}
export interface ImageReplacement {
/** The placeholder tag for the image (e.g., "{{LOGO}}") */
tag: string;
/** Path to the image file (relative to spec file) */
imagePath: string;
/** Width in pixels */
width: number;
/** Height in pixels */
height: number;
}
export interface TemplateSpec {
/** Text replacements to apply */
textReplacements?: TextReplacement[];
/** Image replacements (replaces placeholder text with image) */
imageReplacements?: ImageReplacement[];
/** Whether to apply replacements in headers */
includeHeaders?: boolean;
/** Whether to apply replacements in footers */
includeFooters?: boolean;
}
interface ParsedArgs {
help: boolean;
verbose: boolean;
_: (string | number)[];
}
// === Constants ===
const VERSION = "1.0.0";
const SCRIPT_NAME = "generate-from-template";
// XML Namespaces
const NS = {
w: "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
r: "http://schemas.openxmlformats.org/officeDocument/2006/relationships",
rel: "http://schemas.openxmlformats.org/package/2006/relationships",
wp: "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing",
a: "http://schemas.openxmlformats.org/drawingml/2006/main",
pic: "http://schemas.openxmlformats.org/drawingml/2006/picture",
};
// Placeholder patterns: {{PLACEHOLDER}} or ${placeholder}
const PLACEHOLDER_PATTERNS = [
/\{\{([^}]+)\}\}/g, // {{TAG}}
/\$\{([^}]+)\}/g, // ${tag}
];
// === Help Text ===
function printHelp(): void {
console.log(`
${SCRIPT_NAME} v${VERSION} - Generate DOCX from existing templates
Usage:
deno run --allow-read --allow-write scripts/${SCRIPT_NAME}.ts <template.docx> <spec.json> <output.docx>
Arguments:
<template.docx> Path to the template Word document
<spec.json> Path to JSON specification for replacements
<output.docx> Path for output Word document
Options:
-h, --help Show this help message
-v, --verbose Enable verbose output
Specification Format:
{
"textReplacements": [
{ "tag": "{{TITLE}}", "value": "Quarterly Report" },
{ "tag": "{{DATE}}", "value": "December 2024" },
{ "tag": "\${author}", "value": "John Smith" }
],
"includeHeaders": true,
"includeFooters": true
}
Examples:
# Replace text in template
deno run --allow-read --allow-write scripts/${SCRIPT_NAME}.ts \\
template.docx replacements.json output.docx
# With verbose output
deno run --allow-read --allow-write scripts/${SCRIPT_NAME}.ts \\
template.docx replacements.json output.docx -v
`);
}
// === Utility Functions ===
// deno-lint-ignore no-explicit-any
function getElementsByTagNameNS(parent: any, ns: string, localName: string): any[] {
const elements = parent.getElementsByTagNameNS(ns, localName);
// deno-lint-ignore no-explicit-any
return Array.from(elements) as any[];
}
function escapeRegExp(string: string): string {
return string.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}
// === Text Replacement ===
interface ReplacementResult {
modified: boolean;
replacementCount: number;
}
function replaceTextInXml(
xmlContent: string,
replacements: TextReplacement[],
verbose: boolean = false
): { content: string; result: ReplacementResult } {
const parser = new DOMParser();
const doc = parser.parseFromString(xmlContent, "text/xml");
let totalReplacements = 0;
let modified = false;
// Find all text elements
const textElements = getElementsByTagNameNS(doc, NS.w, "t");
// deno-lint-ignore no-explicit-any
for (const textEl of textElements as any[]) {
let text = textEl.textContent || "";
const originalText = text;
for (const replacement of replacements) {
// The tag should match exactly as provided
const tag = replacement.tag;
if (text.includes(tag)) {
const regex = new RegExp(escapeRegExp(tag), "g");
const matches = text.match(regex);
if (matches) {
totalReplacements += matches.length;
}
// Use a replacer function so "$&", "$$", etc. in the value are inserted literally
text = text.replace(regex, () => replacement.value);
}
}
if (text !== originalText) {
textEl.textContent = text;
modified = true;
}
}
const serializer = new XMLSerializer();
return {
content: serializer.serializeToString(doc),
result: { modified, replacementCount: totalReplacements },
};
}
/**
* Handle split placeholders across multiple text runs.
* Word often splits text like "{{TITLE}}" into multiple runs like "<w:t>{{</w:t><w:t>TITLE</w:t><w:t>}}</w:t>"
* This function consolidates runs within a paragraph before replacement.
*/
function consolidateTextRuns(xmlContent: string): string {
const parser = new DOMParser();
const doc = parser.parseFromString(xmlContent, "text/xml");
// Process each paragraph
const paragraphs = getElementsByTagNameNS(doc, NS.w, "p");
// deno-lint-ignore no-explicit-any
for (const para of paragraphs as any[]) {
const runs = getElementsByTagNameNS(para, NS.w, "r");
// Collect all text content and check for split placeholders
let fullText = "";
// deno-lint-ignore no-explicit-any
const textNodes: any[] = [];
// deno-lint-ignore no-explicit-any
for (const run of runs as any[]) {
const tElements = getElementsByTagNameNS(run, NS.w, "t");
for (const t of tElements) {
fullText += t.textContent || "";
textNodes.push(t);
}
}
// Check if there are placeholders in the combined text that might be split
const hasSplitPlaceholder = (
(fullText.includes("{{") && fullText.includes("}}")) ||
(fullText.includes("${") && fullText.includes("}"))
);
// Only consolidate if we have split placeholders and multiple text nodes
if (hasSplitPlaceholder && textNodes.length > 1) {
// Put all text in the first node and clear the others
textNodes[0].textContent = fullText;
// Keep leading/trailing spaces that came from the merged runs
textNodes[0].setAttribute("xml:space", "preserve");
for (let i = 1; i < textNodes.length; i++) {
textNodes[i].textContent = "";
}
}
}
const serializer = new XMLSerializer();
return serializer.serializeToString(doc);
}
// === Core Logic ===
export async function generateFromTemplate(
templatePath: string,
spec: TemplateSpec,
outputPath: string,
options: { verbose?: boolean; specDir?: string } = {}
): Promise<void> {
  const { verbose = false } = options;

  // Read template
  const templateData = await Deno.readFile(templatePath);
  const zip = await JSZip.loadAsync(templateData);
  if (verbose) {
    console.error(`Loaded template: ${basename(templatePath)}`);
  }

  const replacements = spec.textReplacements || [];
  const includeHeaders = spec.includeHeaders !== false;
  const includeFooters = spec.includeFooters !== false;
  if (verbose) {
    console.error(`Processing ${replacements.length} text replacements`);
    console.error(`Include headers: ${includeHeaders}, Include footers: ${includeFooters}`);
  }

  let totalReplacements = 0;

  // Process main document
  const documentFile = zip.file("word/document.xml");
  if (documentFile) {
    let documentXml = await documentFile.async("string");
    // Consolidate split text runs
    documentXml = consolidateTextRuns(documentXml);
    // Apply replacements
    const { content, result } = replaceTextInXml(documentXml, replacements, verbose);
    zip.file("word/document.xml", content);
    totalReplacements += result.replacementCount;
    if (verbose && result.modified) {
      console.error(`Document: ${result.replacementCount} replacements`);
    }
  }

  // Process headers
  if (includeHeaders) {
    const headerFiles = ["word/header1.xml", "word/header2.xml", "word/header3.xml"];
    for (const headerPath of headerFiles) {
      const headerFile = zip.file(headerPath);
      if (headerFile) {
        let headerXml = await headerFile.async("string");
        headerXml = consolidateTextRuns(headerXml);
        const { content, result } = replaceTextInXml(headerXml, replacements, verbose);
        zip.file(headerPath, content);
        totalReplacements += result.replacementCount;
        if (verbose && result.modified) {
          console.error(`${headerPath}: ${result.replacementCount} replacements`);
        }
      }
    }
  }

  // Process footers
  if (includeFooters) {
    const footerFiles = ["word/footer1.xml", "word/footer2.xml", "word/footer3.xml"];
    for (const footerPath of footerFiles) {
      const footerFile = zip.file(footerPath);
      if (footerFile) {
        let footerXml = await footerFile.async("string");
        footerXml = consolidateTextRuns(footerXml);
        const { content, result } = replaceTextInXml(footerXml, replacements, verbose);
        zip.file(footerPath, content);
        totalReplacements += result.replacementCount;
        if (verbose && result.modified) {
          console.error(`${footerPath}: ${result.replacementCount} replacements`);
        }
      }
    }
  }

  if (verbose) {
    console.error(`Total replacements: ${totalReplacements}`);
  }

  // Write output file
  const outputData = await zip.generateAsync({
    type: "uint8array",
    compression: "DEFLATE",
    compressionOptions: { level: 6 },
  });
  await Deno.writeFile(outputPath, outputData);
  if (verbose) {
    console.error(`Wrote ${outputPath}`);
  }
}

// === Main CLI Handler ===
async function main(args: string[]): Promise<void> {
  const parsed = parseArgs(args, {
    boolean: ["help", "verbose"],
    alias: { help: "h", verbose: "v" },
    default: { verbose: false },
  }) as ParsedArgs;

  if (parsed.help) {
    printHelp();
    Deno.exit(0);
  }

  const positionalArgs = parsed._.map(String);
  if (positionalArgs.length < 3) {
    console.error(
      "Error: template.docx, spec.json, and output.docx are required\n"
    );
    printHelp();
    Deno.exit(1);
  }

  const templatePath = positionalArgs[0];
  const specPath = positionalArgs[1];
  const outputPath = positionalArgs[2];

  try {
    // Read specification
    const specText = await Deno.readTextFile(specPath);
    const spec = JSON.parse(specText) as TemplateSpec;
    const specDir = dirname(resolve(specPath));
    await generateFromTemplate(templatePath, spec, outputPath, {
      verbose: parsed.verbose,
      specDir,
    });
    console.log(`Created: ${outputPath}`);
  } catch (error) {
    console.error(
      "Error:",
      error instanceof Error ? error.message : String(error)
    );
    Deno.exit(1);
  }
}

// === Entry Point ===
if (import.meta.main) {
  main(Deno.args);
}
```
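
The template script above looks only at a fixed set of part names (`word/header1.xml` through `word/header3.xml`, and the matching footers). A template may carry more header or footer parts than that. The following is a minimal sketch, not part of the original script, of how the part names could be discovered from the archive's entry list instead. The helper name `findHeaderFooterParts` and the `HeaderFooterParts` shape are hypothetical.

```typescript
// Hypothetical helper (not in the original script): picks header and footer
// parts out of a list of archive entry names, ordered by their part number.
interface HeaderFooterParts {
  headers: string[];
  footers: string[];
}

const HEADER_PART = /^word\/header(\d+)\.xml$/;
const FOOTER_PART = /^word\/footer(\d+)\.xml$/;

// Builds a comparator that sorts matching names by the captured number,
// so that header2.xml comes before header10.xml.
function byPartNumber(pattern: RegExp) {
  return (a: string, b: string): number =>
    Number(a.match(pattern)![1]) - Number(b.match(pattern)![1]);
}

function findHeaderFooterParts(entryNames: string[]): HeaderFooterParts {
  const headers = entryNames
    .filter((name) => HEADER_PART.test(name))
    .sort(byPartNumber(HEADER_PART));
  const footers = entryNames
    .filter((name) => FOOTER_PART.test(name))
    .sort(byPartNumber(FOOTER_PART));
  return { headers, footers };
}
```

With JSZip, the entry names could be taken from `Object.keys(zip.files)`, and the returned arrays could stand in for the hard-coded `headerFiles` and `footerFiles` lists.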
### scripts/generate-scratch.ts
```typescript
#!/usr/bin/env -S deno run --allow-read --allow-write

/**
 * generate-scratch.ts - Create DOCX from scratch using JSON specification
 *
 * Creates Word documents programmatically from a JSON specification
 * using the docx library. Supports paragraphs, tables, images, headers, and footers.
 *
 * Usage:
 *   deno run --allow-read --allow-write scripts/generate-scratch.ts <spec.json> <output.docx>
 *
 * Options:
 *   -h, --help       Show help
 *   -v, --verbose    Enable verbose output
 *
 * Permissions:
 *   --allow-read: Read specification file and image assets
 *   --allow-write: Write output DOCX file
 */

// Version pins are omitted here; pin the versions you have tested against.
import { parseArgs } from "jsr:@std/cli/parse-args";
import { dirname, resolve } from "jsr:@std/path";
import {
  Document,
  Packer,
  Paragraph,
  TextRun,
  Table,
  TableRow,
  TableCell,
  WidthType,
  AlignmentType,
  HeadingLevel,
  BorderStyle,
  Header,
  Footer,
  PageBreak,
  ImageRun,
  ExternalHyperlink,
} from "npm:docx";

// === Types ===
export interface TextRunSpec {
  text: string;
  bold?: boolean;
  italic?: boolean;
  underline?: boolean;
  strike?: boolean;
  fontSize?: number; // points
  font?: string;
  color?: string; // hex without #
  highlight?: string;
  superScript?: boolean;
  subScript?: boolean;
}

export interface HyperlinkSpec {
  text: string;
  url: string;
  bold?: boolean;
  italic?: boolean;
}

export interface ImageSpec {
  path: string; // resolved relative to the spec file's directory
  width: number; // pixels
  height: number; // pixels
  altText?: string;
}

export interface ParagraphSpec {
  text?: string;
  runs?: (TextRunSpec | HyperlinkSpec | ImageSpec)[];
  heading?: 1 | 2 | 3 | 4 | 5 | 6;
  alignment?: "left" | "center" | "right" | "justify";
  bullet?: boolean;
  numbering?: boolean;
  // Passed through to docx unchanged
  spacing?: {
    before?: number; // twips (1/20 pt)
    after?: number; // twips (1/20 pt)
    line?: number; // 240ths of a line (default auto line rule)
  };
  // Passed through to docx unchanged
  indent?: {
    left?: number; // twips (1/20 pt)
    right?: number; // twips (1/20 pt)
    firstLine?: number; // twips (1/20 pt)
  };
  pageBreakBefore?: boolean;
}

export interface TableCellSpec {
  content: ParagraphSpec[];
  width?: number; // percentage (always applied as WidthType.PERCENTAGE)
  rowSpan?: number;
  colSpan?: number;
  shading?: string; // hex color
  verticalAlign?: "top" | "center" | "bottom";
}

export interface TableRowSpec {
  cells: TableCellSpec[];
  isHeader?: boolean;
}

export interface TableSpec {
  rows: TableRowSpec[];
  width?: number; // percentage
  borders?: boolean;
}

export interface HeaderFooterSpec {
  paragraphs: ParagraphSpec[];
}

export interface SectionSpec {
  content: (ParagraphSpec | TableSpec | { pageBreak: true })[];
  header?: HeaderFooterSpec;
  footer?: HeaderFooterSpec;
}

export interface DocumentSpec {
  title?: string;
  subject?: string;
  creator?: string;
  description?: string;
  styles?: {
    defaultFont?: string;
    defaultFontSize?: number; // points
  };
  sections: SectionSpec[];
}

interface ParsedArgs {
  help: boolean;
  verbose: boolean;
  _: (string | number)[];
}

// === Constants ===
const VERSION = "1.0.0";
const SCRIPT_NAME = "generate-scratch";

// === Help Text ===
function printHelp(): void {
  console.log(`
${SCRIPT_NAME} v${VERSION} - Create DOCX from scratch using JSON specification

Usage:
  deno run --allow-read --allow-write scripts/${SCRIPT_NAME}.ts <spec.json> <output.docx>

Arguments:
  <spec.json>      Path to JSON specification file
  <output.docx>    Path for output Word document

Options:
  -h, --help       Show this help message
  -v, --verbose    Enable verbose output

Specification Format:
  {
    "title": "Document Title",
    "creator": "Author Name",
    "sections": [
      {
        "content": [
          {
            "text": "Hello World",
            "heading": 1
          },
          {
            "runs": [
              { "text": "Bold text", "bold": true },
              { "text": " and ", "italic": true },
              { "text": "normal text" }
            ]
          }
        ]
      }
    ]
  }

Examples:
  # Generate document
  deno run --allow-read --allow-write scripts/${SCRIPT_NAME}.ts spec.json output.docx

  # With verbose output
  deno run --allow-read --allow-write scripts/${SCRIPT_NAME}.ts spec.json output.docx -v
`);
}

// === Conversion Helpers ===
function getAlignment(align?: string): typeof AlignmentType[keyof typeof AlignmentType] | undefined {
  switch (align) {
    case "left": return AlignmentType.LEFT;
    case "center": return AlignmentType.CENTER;
    case "right": return AlignmentType.RIGHT;
    case "justify": return AlignmentType.JUSTIFIED;
    default: return undefined;
  }
}

function getHeadingLevel(level?: number): typeof HeadingLevel[keyof typeof HeadingLevel] | undefined {
  switch (level) {
    case 1: return HeadingLevel.HEADING_1;
    case 2: return HeadingLevel.HEADING_2;
    case 3: return HeadingLevel.HEADING_3;
    case 4: return HeadingLevel.HEADING_4;
    case 5: return HeadingLevel.HEADING_5;
    case 6: return HeadingLevel.HEADING_6;
    default: return undefined;
  }
}

// === Element Builders ===
// Return types name the instance type (TextRun), not the constructor type (typeof TextRun).
function buildTextRun(spec: TextRunSpec): TextRun {
  return new TextRun({
    text: spec.text,
    bold: spec.bold,
    italics: spec.italic,
    underline: spec.underline ? {} : undefined,
    strike: spec.strike,
    size: spec.fontSize ? spec.fontSize * 2 : undefined, // points to half-points
    font: spec.font,
    color: spec.color,
    highlight: spec.highlight,
    superScript: spec.superScript,
    subScript: spec.subScript,
  });
}

async function buildImageRun(spec: ImageSpec, specDir: string): Promise<ImageRun> {
  // Image paths are resolved relative to the spec file's directory
  const imagePath = resolve(specDir, spec.path);
  const imageData = await Deno.readFile(imagePath);
  return new ImageRun({
    data: imageData,
    transformation: {
      width: spec.width,
      height: spec.height,
    },
    altText: spec.altText ? { title: spec.altText, description: spec.altText } : undefined,
  });
}

function buildHyperlink(spec: HyperlinkSpec): ExternalHyperlink {
  return new ExternalHyperlink({
    link: spec.url,
    children: [
      new TextRun({
        text: spec.text,
        bold: spec.bold,
        italics: spec.italic,
        style: "Hyperlink",
      }),
    ],
  });
}

// Type guards: a run is a hyperlink if it has a url, an image if it has a path and a width
function isHyperlink(spec: TextRunSpec | HyperlinkSpec | ImageSpec): spec is HyperlinkSpec {
  return "url" in spec;
}

function isImage(spec: TextRunSpec | HyperlinkSpec | ImageSpec): spec is ImageSpec {
  return "path" in spec && "width" in spec;
}

// deno-lint-ignore no-explicit-any
async function buildParagraph(spec: ParagraphSpec, specDir: string): Promise<any> {
  // deno-lint-ignore no-explicit-any
  const children: any[] = [];

  if (spec.pageBreakBefore) {
    children.push(new PageBreak());
  }

  if (spec.text) {
    children.push(new TextRun(spec.text));
  }

  if (spec.runs) {
    for (const run of spec.runs) {
      if (isImage(run)) {
        children.push(await buildImageRun(run, specDir));
      } else if (isHyperlink(run)) {
        children.push(buildHyperlink(run));
      } else {
        children.push(buildTextRun(run));
      }
    }
  }

  return new Paragraph({
    children,
    heading: getHeadingLevel(spec.heading),
    alignment: getAlignment(spec.alignment),
    bullet: spec.bullet ? { level: 0 } : undefined,
    numbering: spec.numbering ? { reference: "default-numbering", level: 0 } : undefined,
    spacing: spec.spacing ? {
      before: spec.spacing.before,
      after: spec.spacing.after,
      line: spec.spacing.line,
    } : undefined,
    indent: spec.indent ? {
      left: spec.indent.left,
      right: spec.indent.right,
      firstLine: spec.indent.firstLine,
    } : undefined,
  });
}

// deno-lint-ignore no-explicit-any
async function buildTableCell(spec: TableCellSpec, specDir: string): Promise<any> {
  const paragraphs = await Promise.all(
    spec.content.map((p) => buildParagraph(p, specDir))
  );
  return new TableCell({
    children: paragraphs,
    width: spec.width ? { size: spec.width, type: WidthType.PERCENTAGE } : undefined,
    rowSpan: spec.rowSpan,
    columnSpan: spec.colSpan,
    shading: spec.shading ? { fill: spec.shading } : undefined,
    // "top" is left undefined so the library default applies
    verticalAlign: spec.verticalAlign === "center" ? "center" :
      spec.verticalAlign === "bottom" ? "bottom" : undefined,
  });
}

// deno-lint-ignore no-explicit-any
async function buildTable(spec: TableSpec, specDir: string): Promise<any> {
  const rows = await Promise.all(
    spec.rows.map(async (rowSpec) => {
      const cells = await Promise.all(
        rowSpec.cells.map((c) => buildTableCell(c, specDir))
      );
      return new TableRow({
        children: cells,
        tableHeader: rowSpec.isHeader,
      });
    })
  );
  return new Table({
    rows,
    width: spec.width ? { size: spec.width, type: WidthType.PERCENTAGE } : undefined,
    // Borders are only overridden when explicitly disabled
    borders: spec.borders === false ? {
      top: { style: BorderStyle.NONE },
      bottom: { style: BorderStyle.NONE },
      left: { style: BorderStyle.NONE },
      right: { style: BorderStyle.NONE },
      insideHorizontal: { style: BorderStyle.NONE },
      insideVertical: { style: BorderStyle.NONE },
    } : undefined,
  });
}

function isTable(item: ParagraphSpec | TableSpec | { pageBreak: true }): item is TableSpec {
  return "rows" in item;
}

function isPageBreak(item: ParagraphSpec | TableSpec | { pageBreak: true }): item is { pageBreak: true } {
  return "pageBreak" in item;
}

// deno-lint-ignore no-explicit-any
async function buildHeaderFooter(spec: HeaderFooterSpec, specDir: string): Promise<any[]> {
  return Promise.all(spec.paragraphs.map((p) => buildParagraph(p, specDir)));
}

// === Core Logic ===
export async function generateFromSpec(
  spec: DocumentSpec,
  outputPath: string,
  options: { verbose?: boolean; specDir?: string } = {}
): Promise<void> {
  const { verbose = false, specDir = "." } = options;
  if (verbose) {
    console.error(`Creating document with ${spec.sections.length} section(s)`);
  }

  // Build sections
  // deno-lint-ignore no-explicit-any
  const sections: any[] = [];
  for (let i = 0; i < spec.sections.length; i++) {
    const sectionSpec = spec.sections[i];
    // deno-lint-ignore no-explicit-any
    const children: any[] = [];
    if (verbose) {
      console.error(`Processing section ${i + 1}: ${sectionSpec.content.length} items`);
    }

    for (const item of sectionSpec.content) {
      if (isPageBreak(item)) {
        children.push(new Paragraph({ children: [new PageBreak()] }));
      } else if (isTable(item)) {
        children.push(await buildTable(item, specDir));
      } else {
        children.push(await buildParagraph(item, specDir));
      }
    }

    // deno-lint-ignore no-explicit-any
    const section: any = { children };

    // Add header
    if (sectionSpec.header) {
      section.headers = {
        default: new Header({
          children: await buildHeaderFooter(sectionSpec.header, specDir),
        }),
      };
    }

    // Add footer
    if (sectionSpec.footer) {
      section.footers = {
        default: new Footer({
          children: await buildHeaderFooter(sectionSpec.footer, specDir),
        }),
      };
    }

    sections.push(section);
  }

  // Create document
  const doc = new Document({
    title: spec.title,
    subject: spec.subject,
    creator: spec.creator,
    description: spec.description,
    // Defines the "default-numbering" reference that buildParagraph uses for
    // numbered paragraphs; without it the reference points at nothing.
    numbering: {
      config: [
        {
          reference: "default-numbering",
          levels: [
            {
              level: 0,
              format: "decimal",
              text: "%1.",
              alignment: AlignmentType.START,
              style: { paragraph: { indent: { left: 720, hanging: 360 } } },
            },
          ],
        },
      ],
    },
    styles: spec.styles ? {
      default: {
        document: {
          run: {
            font: spec.styles.defaultFont,
            // points to half-points
            size: spec.styles.defaultFontSize ? spec.styles.defaultFontSize * 2 : undefined,
          },
        },
      },
    } : undefined,
    sections,
  });

  // Write output
  const buffer = await Packer.toBuffer(doc);
  await Deno.writeFile(outputPath, new Uint8Array(buffer));
  if (verbose) {
    console.error(`Wrote ${outputPath}`);
  }
}

// === Main CLI Handler ===
async function main(args: string[]): Promise<void> {
  const parsed = parseArgs(args, {
    boolean: ["help", "verbose"],
    alias: { help: "h", verbose: "v" },
    default: { verbose: false },
  }) as ParsedArgs;

  if (parsed.help) {
    printHelp();
    Deno.exit(0);
  }

  const positionalArgs = parsed._.map(String);
  if (positionalArgs.length < 2) {
    console.error("Error: Both spec.json and output.docx are required\n");
    printHelp();
    Deno.exit(1);
  }

  const specPath = positionalArgs[0];
  const outputPath = positionalArgs[1];

  try {
    const specText = await Deno.readTextFile(specPath);
    const spec = JSON.parse(specText) as DocumentSpec;
    // Image paths in the spec are resolved relative to the spec file
    const specDir = dirname(resolve(specPath));
    await generateFromSpec(spec, outputPath, {
      verbose: parsed.verbose,
      specDir,
    });
    console.log(`Created: ${outputPath}`);
  } catch (error) {
    console.error(
      "Error:",
      error instanceof Error ? error.message : String(error)
    );
    Deno.exit(1);
  }
}

// === Entry Point ===
if (import.meta.main) {
  main(Deno.args);
}
```
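
The script's `main` casts the result of `JSON.parse` to `DocumentSpec` without checking its shape, so a malformed spec only fails later, inside the builders. The following is a minimal sketch of a pre-flight check that could run before `generateFromSpec`. The names `validateSpec` and `isRecord` are hypothetical and not part of the original script, and the check covers only the outer structure (`sections`, `content`, and `heading` levels), not every field.

```typescript
// Hypothetical pre-flight check (not in the original script). Returns a list
// of human-readable problems; an empty list means the outer shape looks usable.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function validateSpec(value: unknown): string[] {
  if (!isRecord(value)) return ["spec must be a JSON object"];

  const sections = value.sections;
  if (!Array.isArray(sections)) return ["spec.sections must be an array"];

  const problems: string[] = [];
  sections.forEach((section: unknown, i: number) => {
    const content = isRecord(section) ? section.content : undefined;
    if (!Array.isArray(content)) {
      problems.push(`sections[${i}].content must be an array`);
      return;
    }
    content.forEach((item: unknown, j: number) => {
      const where = `sections[${i}].content[${j}]`;
      if (!isRecord(item)) {
        problems.push(`${where} must be an object`);
      } else if (
        "heading" in item &&
        ![1, 2, 3, 4, 5, 6].includes(item.heading as number)
      ) {
        problems.push(`${where}.heading must be an integer from 1 to 6`);
      }
    });
  });
  return problems;
}
```

One way to use it would be to call `validateSpec(JSON.parse(specText))` in `main` and exit with the collected messages when the list is non-empty, so that spec errors are reported together rather than one at a time.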