Migrating tech.queenofsandiego.com to the 5-Layer Model Workspace Protocol: A Token-Efficiency Pilot
What Was Done
We audited the tech.queenofsandiego.com blog workflow and planned its migration from a monolithic context structure to the 5-Layer Model Workspace Protocol. This pilot is a proof of concept for token optimization on our engineering blog platform before we generalize the pattern to mission-critical workflows.
The baseline measurement revealed a critical inefficiency: initializing a single blog post required ~10,925 tokens on every session, with approximately 90% waste. The culprit was a single giant CLAUDE.md file that conflated Layer 0 (architectural map) with Layer 3 (business rules around pricing, deployment safety, and competitive positioning). By restructuring into discrete layers with surgical context loading, we target a 30× reduction to ~370 tokens per post start.
Baseline Analysis and Root Cause
The original structure loaded three heavy context sources every session:
- Repos-level CLAUDE.md (~942 tokens): Maps all properties across the monorepo, relevant only during rare repo-wide decisions.
- Queen of San Diego site CLAUDE.md (~3,420 tokens): Auto-loaded by the generator whenever any QoS subdomain was accessed. Included pricing rules, deployment guardrails, and competitor analysis—almost none of which applies to a tech blog post.
- tech_blog_generator.py (~6,563 tokens): The actual workflow runner, which embedded its own voice guide, context rules, and process documentation inline rather than referencing a structured layer.
The handoff measurement included complete file-tree scans across:
- /Users/cb/icloud-repos/sites/queenofsandiego.com/ (main site context)
- /Users/cb/icloud-repos/workspaces/tech-blog/ (baseline: empty, target: 5-layer scaffold)
- /Users/cb/icloud-jada-ops/ (operations and ticket-runner state)
5-Layer Architecture Design
We designed a new directory tree under workspaces/tech-blog/ that respects layer isolation:
workspaces/tech-blog/
├── CLAUDE.md # Layer 0: Map (80 tokens)
│ # Routing to Layers 1–4, nothing else
├── CONTEXT.md # Layer 1: Universal router
│ # Dispatches to stage-specific contexts
├── reference/
│ └── voice.md # Shared voice guide (editorial tone, brand rules)
├── 01-topic/
│ └── CONTEXT.md # Layer 2: Topic discovery rules
├── 02-research/
│ └── CONTEXT.md # Layer 2: Research methodology
├── 03-draft/
│ └── CONTEXT.md # Layer 2: Draft composition rules
└── 04-publish/
  └── CONTEXT.md # Layer 2: Review, SEO, publishing checklist
Layer 0 (Mapping): The root CLAUDE.md contains only the workflow map and routing logic—approximately 80 tokens. It explicitly states: "If working in `/01-topic/`, load `01-topic/CONTEXT.md`. If in `/02-research/`, load `02-research/CONTEXT.md`." No business rules, no pricing, no deployment facts.
Layer 1 (Routing): The root CONTEXT.md defines how the generator selects which stage context to activate based on file paths and environment variables. It also establishes the universal voice guide and any cross-stage constraints (e.g., "all drafts must cite sources in APA format").
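The routing rule described above can be sketched in a few lines. This is a minimal illustration, not the generator's actual code: the function name `resolve_stage_context` and the environment variable `TECH_BLOG_STAGE` are hypothetical, and the fallback to the root CONTEXT.md is an assumption about how an unmatched path should behave.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Stage directory names taken from the workspace tree.
STAGES = ("01-topic", "02-research", "03-draft", "04-publish")


def resolve_stage_context(
    workspace: Path, cwd: Path, env: Optional[Mapping[str, str]] = None
) -> Path:
    """Pick the CONTEXT.md to load for the current session.

    Priority: explicit environment override, then the first stage
    directory found in the working path, then the root router file.
    """
    env = os.environ if env is None else env

    # Hypothetical override variable; the real generator may use another name.
    override = env.get("TECH_BLOG_STAGE")
    if override in STAGES:
        return workspace / override / "CONTEXT.md"

    try:
        relative = cwd.relative_to(workspace)
    except ValueError:
        # cwd is outside the workspace: fall back to the Layer 1 router.
        return workspace / "CONTEXT.md"

    for part in relative.parts:
        if part in STAGES:
            return workspace / part / "CONTEXT.md"
    return workspace / "CONTEXT.md"
```

Called with a working directory such as `workspaces/tech-blog/03-draft/notes`, the sketch returns the path to `03-draft/CONTEXT.md`, so only that stage's rules need to be read.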
Layer 2 (Stage-Specific Rules): Each of the four subdirectories holds its own CONTEXT.md with stage-appropriate prompting:
- 01-topic/CONTEXT.md: How to brainstorm, validate topic relevance, check internal archives for prior coverage.
- 02-research/CONTEXT.md: Search strategies, source credibility checks, competitor blog scanning methodology.
- 03-draft/CONTEXT.md: Composition rules, structure templates, technical depth calibration for the audience (Sergio et al.).
- 04-publish/CONTEXT.md: SEO meta-tag generation, internal link suggestions, final editorial checklist.
Layer 3 & 4 (Artifacts & External Constraints): These stay in the main QoS site context. The blog generator will reference them only when explicitly needed, for example when checking the pricing page structure before writing about cost comparisons, or reviewing /reference/deploy-safety.md before publishing infrastructure advice.
Implementation Strategy
We created the scaffold structure by:
- Backing up the original tech_blog_generator.py to /Users/cb/icloud-jada-ops/TECH-BLOG-ICM-MIGRATION-2026-06-03.md for an audit trail.
- Creating the four-stage directory tree with placeholder CONTEXT.md files in each layer.
- Measuring token count for the new structure: approximately 370 tokens total when the generator is invoked in any stage directory.
- Documenting the routing logic in the Layer 1 CONTEXT.md to ensure the generator knows which context to load based on current working directory.
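For readers who want to reproduce the scaffold step, the following is a rough sketch of how the placeholder files could be created. The function name `create_scaffold` and the placeholder text are illustrative assumptions; only the directory and file names come from the tree shown earlier.

```python
from pathlib import Path
from typing import List

# Stage directory names taken from the workspace tree.
STAGES = ("01-topic", "02-research", "03-draft", "04-publish")


def create_scaffold(root: Path) -> List[Path]:
    """Create the workspace tree with placeholder files.

    Existing files are left untouched, so the function is safe to re-run.
    Returns the list of files that were newly created.
    """
    files = [
        root / "CLAUDE.md",             # Layer 0: map
        root / "CONTEXT.md",            # Layer 1: router
        root / "reference" / "voice.md",
    ]
    files += [root / stage / "CONTEXT.md" for stage in STAGES]  # Layer 2

    created = []
    for path in files:
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            # Placeholder content is an assumption, not the real file text.
            label = path.relative_to(root).as_posix()
            path.write_text(f"Placeholder for {label}\n", encoding="utf-8")
            created.append(path)
    return created
```

Skipping files that already exist means a second run cannot overwrite context that has since been filled in.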
Key Decisions and Rationale
Why separate QoS business context from tech-blog context? The queenofsandiego.com site context includes deployment safety rules, pricing pages, and brand guidelines, which are critical for the main site but dead weight for a developer-focused engineering blog. By separating the two, we load QoS context only when explicitly publishing (Layer 4) or checking competitor positioning (Layer 2).
Why a Layer 1 router instead of embedding logic in the generator? The generator (tech_blog_generator.py) should be a thin orchestrator. Prompting rules belong in CONTEXT.md files so they're version-controlled, auditable, and easy to tweak without touching production code.
Why a shared reference/voice.md? Editorial voice and audience calibration apply across all four stages. Centralizing it avoids duplication and ensures consistency if we update tone guidelines.
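To make the "thin orchestrator" idea concrete, here is one way the generator could assemble its prompt context from the layered files. It is a sketch under stated assumptions: `assemble_context` is a hypothetical helper, and the load order (map, router, stage rules, voice guide) is a guess at a sensible default rather than documented behavior of tech_blog_generator.py.

```python
from pathlib import Path
from typing import List


def assemble_context(workspace: Path, stage: str) -> str:
    """Concatenate the files needed for one stage, skipping any that are missing."""
    layers: List[Path] = [
        workspace / "CLAUDE.md",                # Layer 0: map
        workspace / "CONTEXT.md",               # Layer 1: router
        workspace / stage / "CONTEXT.md",       # Layer 2: one stage only
        workspace / "reference" / "voice.md",   # shared voice guide
    ]
    sections = []
    for path in layers:
        if path.is_file():
            sections.append(path.read_text(encoding="utf-8").strip())
    # Blank line between sections keeps the layers visually distinct.
    return "\n\n".join(sections)
```

Because the rules live in files, changing the editorial tone means editing reference/voice.md, with no change to the orchestrating code.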
Measurements and Token Budget
Baseline (before migration):
- Repos CLAUDE.md: ~942 tokens
- QoS CLAUDE.md: ~3,420 tokens
- Generator + inline docs: ~6,563 tokens
- Total per session: ~10,925 tokens
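The baseline total and the headline reduction factor can be checked with a few lines of arithmetic. The figures are the ones reported in this post; the variable names are illustrative, and the ~370-token target is the post-migration estimate, not a separately verified measurement.

```python
# Baseline token counts per context source, as measured before migration.
baseline = {
    "repos CLAUDE.md": 942,
    "QoS CLAUDE.md": 3420,
    "generator + inline docs": 6563,
}
target_total = 370  # estimated tokens per post start after migration

baseline_total = sum(baseline.values())
reduction_factor = baseline_total / target_total
savings_fraction = 1 - target_total / baseline_total

print(baseline_total)                  # 10925
print(round(reduction_factor, 1))      # 29.5
print(round(savings_fraction * 100, 1))  # 96.6
```

The three sources do sum to the stated ~10,925 tokens, and the ratio to the target comes out at about 29.5, which is the "30×" figure quoted earlier.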
Target (post-migration):
- Layer 0 (root CLAUDE.md): ~80 tokens
- Layer 1 (router): ~120 tokens
- Layer 2 (stage context, one loaded): ~100–150 tokens
- reference/voice