Migrating tech.queenofsandiego.com to a 5-Layer Model Workspace Protocol: Token Efficiency Through Structural Isolation

What Was Done

We began a pilot migration of the tech.queenofsandiego.com blog workflow from a monolithic, always-on context architecture to a 5-layer Model Workspace Protocol structure. The goal is to cut per-session token overhead from ~10,925 tokens (most of it redundant context) to ~370 tokens, roughly a 30× reduction, while maintaining full feature parity and establishing a replicable migration pattern for other workflows.

This pilot was deliberately scoped to the least mission-critical workflow to prove the token-efficiency wins before generalizing the pattern to higher-risk systems like the ticket-runner or deposit/booking infrastructure.

The Problem: Context Bloat and Layer Mixing

The original workflow loaded:

~942 tokens from scattered CLAUDE.md files across the repos structure
~3,420 tokens from the monolithic Queen of San Diego site CLAUDE.md (auto-loaded every session, regardless of context)
~6,563 tokens from tech_blog_generator.py, the 2,100+ line Python generator script
Roughly 90% of this was waste in any given session: Layer-0 structural maps were mixed with Layer-3 pricing rules, deployment safety checklists, and competitor analysis, none of it relevant to drafting a post about database indexing.

Every session—whether for topic selection, research gathering, draft review, or publication—paid the full 10,925-token cost upfront, then discarded 90% of it.
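A breakdown like the one above can be approximated with a short audit script. The sketch below is illustrative only: it assumes a rough four-characters-per-token heuristic rather than a real tokenizer, and the names `estimate_tokens` and `audit_context` are hypothetical, not part of the original workflow or the method used to produce the figures in this post.

```python
from pathlib import Path

# Assumption: ~4 characters per token. This is a rough rule of thumb,
# not a real tokenizer, so treat the results as order-of-magnitude only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def audit_context(paths: list[Path]) -> dict[str, int]:
    """Return an estimated token cost per existing file, plus a TOTAL entry."""
    costs = {
        str(p): estimate_tokens(p.read_text(encoding="utf-8"))
        for p in paths
        if p.is_file()  # silently skip paths that do not exist
    }
    costs["TOTAL"] = sum(costs.values())
    return costs
```

Calling `audit_context` with the files a session auto-loads gives a quick sense of which file dominates the upfront cost before any restructuring.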

Technical Architecture: The 5-Layer Model Workspace Protocol

The migration restructures context into five discrete layers, each loaded only when needed:

Layer 0 (Map): A tiny CLAUDE.md in /workspaces/tech-blog/ (~80 tokens) describing the entire workflow structure and routing logic. No rules, no pricing, no deploy checklists—just "go to Layer 1 for topic work, Layer 2 for research," etc.
Layer 1 (Router & Defaults): A CONTEXT.md file that bridges Layer 0 to stages. Loads the voice guide (reference/voice.md) and stage metadata. ~120 tokens.
Layer 2 (Stage-Specific Context): Four directories—01-topic/, 02-research/, 03-draft/, 04-publish/—each with its own CONTEXT.md (~80–120 tokens per stage). Topic stage loads post-idea templates; research stage loads source-citation rules; draft stage loads the style guide; publish stage loads the deployment checklist.
Layer 3 (Domain Rules): Extracted into reference/ subdirectories. voice.md for tone and audience, deploy-safety.md for publication guardrails, pricing.md for any monetization context (if needed in future), business.md for competitive/strategic context. None auto-loaded; pulled in by Layer-2 CONTEXT.md only when the stage requires it.
Layer 4 (Artifacts): Per-run state, drafts, research notes, and generated outputs. Kept in stage directories to maintain lineage and avoid accumulating context debt across sessions.
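The layered loading described above can be sketched as a small routing function. This is a minimal illustration, not the workflow's actual implementation: `files_for_stage` is a hypothetical helper, and the stage-to-reference mapping is an assumption made for the example (in the protocol itself, each stage's CONTEXT.md decides which Layer-3 files to pull in).

```python
from pathlib import Path

# Entry path for every session: the Layer 0 map, then the Layer 1 router.
# Paths are relative to the workspace root.
BASE_LAYERS = ["CLAUDE.md", "CONTEXT.md"]

# Assumed Layer-3 references per stage, for illustration only.
STAGE_REFERENCES = {
    "01-topic": [],
    "02-research": [],
    "03-draft": ["reference/voice.md"],
    "04-publish": ["reference/deploy-safety.md"],
}


def files_for_stage(stage: str) -> list[Path]:
    """List the context files one session should load, in layer order."""
    if stage not in STAGE_REFERENCES:
        raise ValueError(f"unknown stage: {stage!r}")
    layers = BASE_LAYERS + [f"{stage}/CONTEXT.md"] + STAGE_REFERENCES[stage]
    return [Path(p) for p in layers]
```

A drafting session would then load four small files and nothing else. Pricing and business context stay on disk unless a stage explicitly lists them.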

Directory Structure and File Organization


/Users/cb/Library/Mobile Documents/com~apple~CloudDocs/repos/workspaces/tech-blog/
├── CLAUDE.md                          # Layer 0: 80 tok, map only
├── CONTEXT.md                         # Layer 1: 120 tok, router + voice
├── reference/
│   ├── voice.md                       # Tone, audience, post structure
│   ├── deploy-safety.md               # Publication checklist & safety rules
│   ├── pricing.md                     # Monetization context (stage-gated)
│   └── business.md                    # Competitive analysis (stage-gated)
├── 01-topic/
│   ├── CONTEXT.md                     # Idea refinement rules, templates
│   └── [artifacts: post-ideas, outlines, feedback]
├── 02-research/
│   ├── CONTEXT.md                     # Source citation rules, fact-check gates
│   └── [artifacts: sources, notes, evidence chains]
├── 03-draft/
│   ├── CONTEXT.md                     # Writing style, editing rules, review gates
│   └── [artifacts: drafts, revision history, editor feedback]
└── 04-publish/
    ├── CONTEXT.md                     # Final checks, deploy sequence, rollback rules
    └── [artifacts: final post, deploy logs, live URLs]
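The tree above could be scaffolded with a few lines of standard-library Python. This is a sketch under the assumption that empty placeholder files are an acceptable starting point. The `scaffold` function is hypothetical, and the root directory is passed in by the caller rather than hard-coded.

```python
from pathlib import Path

STAGES = ["01-topic", "02-research", "03-draft", "04-publish"]
REFERENCES = ["voice.md", "deploy-safety.md", "pricing.md", "business.md"]


def scaffold(root: Path) -> list[Path]:
    """Create the workspace skeleton under root; return files newly created."""
    targets = [root / "CLAUDE.md", root / "CONTEXT.md"]
    targets += [root / "reference" / name for name in REFERENCES]
    targets += [root / stage / "CONTEXT.md" for stage in STAGES]
    created = []
    for path in targets:
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():  # never overwrite existing context files
            path.touch()
            created.append(path)
    return created
```

Because existing files are skipped, the function is safe to re-run on a partially populated workspace.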

Token Accounting: Before and After

Before (monolithic):

Repos CLAUDE.md: ~942 tokens
QOS site CLAUDE.md: ~3,420 tokens
tech_blog_generator.py: ~6,563 tokens
Total per session: ~10,925 tokens (all loaded, most unused)

After (5-layer pilot):

Layer 0 (map): ~80 tokens
Layer 1 (router): ~120 tokens
Layer 2 (stage, e.g., 03-draft): ~100 tokens
Layer 3 (referenced, e.g., voice.md + deploy-safety.md): ~70 tokens
Total per session: ~370 tokens (only relevant context loaded)

The generator script is replaced by stage-specific CONTEXT.md files that state their rules inline, so the 2,100-line Python script no longer needs to be loaded into context.
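The totals and the reduction factor follow directly from the per-file estimates listed above. The short check below only uses those figures; the dictionary keys are labels chosen for readability.

```python
# Per-session token estimates taken from the before/after lists above.
before = {
    "repos CLAUDE.md": 942,
    "QOS site CLAUDE.md": 3420,
    "tech_blog_generator.py": 6563,
}
after = {
    "layer 0 (map)": 80,
    "layer 1 (router)": 120,
    "layer 2 (stage)": 100,
    "layer 3 (references)": 70,
}

total_before = sum(before.values())
total_after = sum(after.values())
reduction = total_before / total_after      # how many times smaller
saved = 1 - total_after / total_before      # fraction of tokens no longer loaded

print(f"before: {total_before}, after: {total_after}")  # before: 10925, after: 370
print(f"reduction: {reduction:.1f}x, saved: {saved:.1%}")  # reduction: 29.5x, saved: 96.6%
```

The ratio comes out at about 29.5, which is where the rounded 30× figure comes from.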

Key Decisions and Rationale

Why a single, always-loaded Layer 0 map? The map is tiny, always needed, and serves as a cognitive index. It's the only file auto-loaded; everything else is on-demand. This breaks the assumption that "CLAUDE.md = all context."
Why stage directories (01-, 02-, etc.)? Sequential naming makes workflow progression explicit and prevents accidental jumps. Each stage has its own state, so artifacts don't leak into unrelated sessions. The stage number also makes file references unambiguous across teams.
Why reference/ as a separate tree? Domain rules (voice, safety, pricing, business) are orthogonal to workflow stages. Isolating them allows rules to be updated without touching stage logic, and stages can optionally include them without bloating Layer 2.
Why CONTEXT.md files instead of a Python generator? Markdown context files are searchable, versionable, and don't require runtime dependencies. They're also self-documenting: a developer can read 03-draft/CONTEXT.md and immediately understand what rules apply, whereas the rules embedded in Python code have to be traced through the script or observed by running it.
Why start with tech-blog, not higher-impact workflows? Tech-blog is read/write-only, isolated from critical systems,