Migrating tech.queenofsandiego.com to the 5-Layer Model Workspace Protocol: A Token-Efficiency Pilot

What Was Done

We audited the tech.queenofsandiego.com blog workflow and drew up a plan to migrate it from a monolithic context structure to the 5-Layer Model Workspace Protocol. This pilot is a proof of concept for token optimization on our engineering blog platform before we generalize the pattern to mission-critical workflows.

The baseline measurement revealed a critical inefficiency: starting a single blog post loaded ~10,925 tokens of context in every session, approximately 90% of it wasted. The culprit was a single giant CLAUDE.md file that conflated Layer 0 (architectural map) with Layer 3 (business rules around pricing, deployment safety, and competitive positioning). By restructuring into discrete layers and loading only the context each stage needs, we target a roughly 30× reduction, to ~370 tokens per post start.

Baseline Analysis and Root Cause

The original structure loaded three heavy context sources every session:

Repos-level CLAUDE.md (~942 tokens): Maps all properties across the monorepo, relevant only during rare repo-wide decisions.
Queen of San Diego site CLAUDE.md (~3,420 tokens): Auto-loaded by the generator whenever any QoS subdomain was accessed. Included pricing rules, deployment guardrails, and competitor analysis—almost none of which applies to a tech blog post.
tech_blog_generator.py (~6,563 tokens): The actual workflow runner, which embedded its own voice guide, context rules, and process documentation inline rather than referencing a structured layer.

The handoff measurement included complete file-tree scans of:

/Users/cb/icloud-repos/sites/queenofsandiego.com/ (main site context)
/Users/cb/icloud-repos/workspaces/tech-blog/ (baseline: empty, target: 5-layer scaffold)
/Users/cb/icloud-jada-ops/ (operations and ticket-runner state)
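
This post does not record how the token counts were obtained. As a rough sketch only, a per-file estimate could use a characters-per-token heuristic. The ratio of 4 characters per token and both function names are assumptions for illustration; a real tokenizer will give different numbers.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def estimate_file_tokens(paths: list[Path]) -> dict[str, int]:
    """Map each existing file to its estimated token count; skip missing files."""
    return {
        str(p): estimate_tokens(p.read_text(encoding="utf-8"))
        for p in paths
        if p.is_file()
    }
```

Running `estimate_file_tokens` over the context files a session loads gives a quick before/after comparison, even if the absolute numbers are only approximate.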

5-Layer Architecture Design

We designed a new directory tree under workspaces/tech-blog/ that respects layer isolation:

workspaces/tech-blog/
├── CLAUDE.md                   # Layer 0: Map (80 tokens)
│                               # Routing to Layers 1–4, nothing else
├── CONTEXT.md                  # Layer 1: Universal router
│                               # Dispatches to stage-specific contexts
├── reference/
│   └── voice.md                # Shared voice guide (editorial tone, brand rules)
├── 01-topic/
│   └── CONTEXT.md              # Layer 2: Topic discovery rules
├── 02-research/
│   └── CONTEXT.md              # Layer 2: Research methodology
├── 03-draft/
│   └── CONTEXT.md              # Layer 2: Draft composition rules
└── 04-publish/
    └── CONTEXT.md              # Layer 2: Review, SEO, publishing checklist

Layer 0 (Mapping): The root CLAUDE.md contains only the workflow map and routing logic—approximately 80 tokens. It explicitly states: "If working in `/01-topic/`, load `01-topic/CONTEXT.md`. If in `/02-research/`, load `02-research/CONTEXT.md`." No business rules, no pricing, no deployment facts.

Layer 1 (Routing): The root CONTEXT.md defines how the generator selects which stage context to activate based on file paths and environment variables. It also points to the shared voice guide (reference/voice.md) and sets any cross-stage constraints (e.g., "all drafts must cite sources in APA format").
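
A minimal sketch of that routing decision, assuming the stage is inferred from the first path component under the workspace and that an environment variable can override it. `TECH_BLOG_STAGE` and `select_stage_context` are hypothetical names, not taken from the actual generator.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

STAGES = ("01-topic", "02-research", "03-draft", "04-publish")
STAGE_ENV_VAR = "TECH_BLOG_STAGE"  # hypothetical variable name


def select_stage_context(
    workspace: Path,
    cwd: Path,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Path]:
    """Return the Layer 2 CONTEXT.md to load, or None if no stage applies."""
    env = os.environ if env is None else env

    # An explicit override wins over the working directory.
    override = env.get(STAGE_ENV_VAR)
    if override in STAGES:
        return workspace / override / "CONTEXT.md"

    # Otherwise infer the stage from the first directory below the workspace.
    try:
        relative = cwd.resolve().relative_to(workspace.resolve())
    except ValueError:
        return None  # cwd is outside the workspace
    if relative.parts and relative.parts[0] in STAGES:
        return workspace / relative.parts[0] / "CONTEXT.md"
    return None
```

Returning `None` instead of a default keeps the failure explicit: if no stage matches, nothing beyond Layers 0 and 1 is loaded.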

Layer 2 (Stage-Specific Rules): Each of the four subdirectories holds its own CONTEXT.md with stage-appropriate prompting:

01-topic/CONTEXT.md: How to brainstorm, validate topic relevance, check internal archives for prior coverage.
02-research/CONTEXT.md: Search strategies, source credibility checks, competitor blog scanning methodology.
03-draft/CONTEXT.md: Composition rules, structure templates, technical depth calibration for the audience (Sergio et al.).
04-publish/CONTEXT.md: SEO meta-tag generation, internal link suggestions, final editorial checklist.

Layers 3 & 4 (Artifacts & External Constraints): These remain in the main QoS site context. The blog generator references them only when explicitly needed, e.g., checking the pricing page structure before writing about cost comparisons, or reviewing /reference/deploy-safety.md before publishing infrastructure advice.

Implementation Strategy

We created the scaffold structure by:

Backing up the original tech_blog_generator.py to /Users/cb/icloud-jada-ops/TECH-BLOG-ICM-MIGRATION-2026-06-03.md as an audit trail.
Creating the four-stage directory tree with a placeholder CONTEXT.md file in each stage directory.
Measuring the token count for the new structure: approximately 370 tokens total when the generator is invoked in any stage directory.
Documenting the routing logic in the Layer 1 CONTEXT.md so the generator knows which context to load based on the current working directory.
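
The scaffold step can be sketched as a small idempotent script. This is an illustration under assumptions, not the script we ran: `create_scaffold` and the placeholder text are hypothetical, and the file list simply mirrors the directory tree shown earlier.

```python
from pathlib import Path

STAGES = ("01-topic", "02-research", "03-draft", "04-publish")


def create_scaffold(root: Path) -> list[Path]:
    """Create the workspace tree with placeholder files; never overwrite existing ones."""
    files = [root / "CLAUDE.md", root / "CONTEXT.md", root / "reference" / "voice.md"]
    files += [root / stage / "CONTEXT.md" for stage in STAGES]

    created = []
    for path in files:
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            # Placeholder content only; real rules are written per stage later.
            path.write_text(f"# TODO: fill in {path.relative_to(root)}\n", encoding="utf-8")
            created.append(path)
    return created
```

Because existing files are skipped, rerunning the function after editing a CONTEXT.md leaves the edits intact and returns an empty list.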



Key Decisions and Rationale

Why separate QoS business context from tech-blog context? The queenofsandiego.com site context includes deployment safety rules, pricing pages, and brand guidelines: critical for the main site but dead weight for a developer-focused engineering blog. By segregating the two, we load QoS context only when explicitly publishing (Layer 4) or checking competitor positioning (Layer 2).

Why a Layer 1 router instead of embedding logic in the generator? The generator (tech_blog_generator.py) should be a thin orchestrator. Prompting rules belong in CONTEXT.md files so they're version-controlled, auditable, and easy to tweak without touching production code.
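
To make "thin orchestrator" concrete, here is a hedged sketch of what the context-loading side of the generator could shrink to. `load_context` is a hypothetical function, and the load order (map, router, one stage, voice guide) is our assumption based on the layout described in this post.

```python
from pathlib import Path


def load_context(workspace: Path, stage: str) -> str:
    """Concatenate the layer files for one stage, in load order."""
    layers = [
        workspace / "CLAUDE.md",               # Layer 0: map
        workspace / "CONTEXT.md",              # Layer 1: router
        workspace / stage / "CONTEXT.md",      # Layer 2: one stage only
        workspace / "reference" / "voice.md",  # shared voice guide
    ]
    missing = [p for p in layers if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing context files: {[str(p) for p in missing]}")
    return "\n\n".join(p.read_text(encoding="utf-8").strip() for p in layers)
```

All prompting rules live in the files; the code only decides which files to read, so a tone or process change is a text edit rather than a code change.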


Why a shared reference/voice.md? Editorial voice and audience calibration apply across all four stages. Centralizing it avoids duplication and ensures consistency if we update tone guidelines.

Measurements and Token Budget

Baseline (before migration):

  Repos CLAUDE.md: ~942 tokens
  QoS CLAUDE.md: ~3,420 tokens
  Generator + inline docs: ~6,563 tokens
  Total per session: ~10,925 tokens
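
As a quick check on the arithmetic, the baseline figures above can be summed and compared with the ~370-token target quoted earlier in this post. The variable names are ours; the numbers are the ones reported here.

```python
BASELINE_TOKENS = {
    "repos CLAUDE.md": 942,
    "QoS CLAUDE.md": 3420,
    "generator + inline docs": 6563,
}
TARGET_TOKENS = 370  # post-migration target quoted in this post

baseline_total = sum(BASELINE_TOKENS.values())
reduction_factor = baseline_total / TARGET_TOKENS
savings_pct = 100 * (1 - TARGET_TOKENS / baseline_total)

print(f"baseline: {baseline_total} tokens")    # baseline: 10925 tokens
print(f"reduction: {reduction_factor:.1f}x")   # reduction: 29.5x
print(f"savings: {savings_pct:.1f}%")          # savings: 96.6%
```

The 29.5× result is what we round to "30×" elsewhere in this post.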


Target (post-migration):

  Layer 0 (root CLAUDE.md): ~80 tokens
  Layer 1 (router): ~120 tokens
  Layer 2 (stage context, one loaded): ~100–150 tokens
  reference/voice