Migrating tech.queenofsandiego.com to a 5-Layer Model Workspace Protocol: Token Efficiency Through Structured Context

What Was Done

We began a pilot migration of the tech.queenofsandiego.com blog workflow from a monolithic context model to a structured 5-layer workspace protocol. This post covers the baseline analysis, architecture redesign, and the reasoning behind moving from a ~10,925-token always-on context to a target ~370-token lightweight design.

The tech blog was chosen as the pilot because it is important enough to validate real-world workflow patterns, but isolated enough that mistakes don't cascade into production outages. The baseline measurement showed that approximately 90% of the loaded context was waste: a single sprawling CLAUDE.md file mixed Layer 0 material (architectural maps) with Layer 3 material (pricing rules, deployment safety guardrails, and competitor analysis).

Baseline Context Analysis

Before redesign, the tech blog workflow loaded approximately:

~942 tokens: Repository metadata and global CLAUDE.md files from /repos/ root
~3,420 tokens: The queenofsandiego.com site-level CLAUDE.md, auto-loaded every session from /repos/sites/queenofsandiego.com/CLAUDE.md
~6,563 tokens: The blog generator tool itself (tech_blog_generator.py), including embedded documentation, configuration examples, and safety rules

Total: ~10,925 tokens per session start, with no selectivity. A writer starting a new blog post would pay this tax regardless of whether they needed deployment safety rules or competitor pricing analysis that session.
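As a quick sanity check on the arithmetic, the breakdown above can be summed and expressed as shares of the total. This is a minimal sketch: the three token counts come from the list above, while the dictionary keys and function names are illustrative labels we made up for the example.

```python
# Token counts copied from the baseline breakdown above.
# The keys are illustrative labels, not real configuration names.
BASELINE_TOKENS = {
    "repo_metadata_and_global_claude_md": 942,
    "site_level_claude_md": 3_420,
    "tech_blog_generator_py": 6_563,
}


def baseline_total(parts):
    """Sum the always-on token cost across all sources."""
    return sum(parts.values())


def share_by_source(parts):
    """Return each source's share of the total, as a percentage rounded to 0.1."""
    total = baseline_total(parts)
    return {name: round(100 * tokens / total, 1) for name, tokens in parts.items()}


if __name__ == "__main__":
    print("total:", baseline_total(BASELINE_TOKENS))
    for name, pct in share_by_source(BASELINE_TOKENS).items():
        print(f"{name}: {pct}%")
```

Seen this way, the generator script alone accounts for roughly 60% of the always-on cost, which is why it is the first thing to move out of the default context.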

5-Layer Workspace Architecture

The redesign follows this structure under /repos/workspaces/tech-blog/:

workspaces/tech-blog/
├── CLAUDE.md                    # Layer 0: Tiny router (~80 tokens)
├── CONTEXT.md                   # Layer 1: Always-on session context
├── reference/
│   └── voice.md                 # Layer 2: Brand voice + style guide
├── 01-topic/
│   ├── CONTEXT.md               # Layer 2: Topic selection & research prep
│   └── [artifacts]/
├── 02-research/
│   ├── CONTEXT.md               # Layer 2: Source indexing & outlines
│   └── [artifacts]/
├── 03-draft/
│   ├── CONTEXT.md               # Layer 2: Draft rules & review gates
│   └── [artifacts]/
└── 04-publish/
    ├── CONTEXT.md               # Layer 2: Publishing & SEO rules
    └── [artifacts]/
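For anyone who wants to reproduce the layout, the tree above could be scaffolded with a few lines of Python. This is a sketch under assumptions: the `scaffold` function is hypothetical, it creates empty placeholder files only, and it skips the `[artifacts]/` folders because the post does not say what they contain.

```python
from pathlib import Path

# Stage directory names taken from the tree above.
STAGES = ("01-topic", "02-research", "03-draft", "04-publish")


def scaffold(workspace: Path) -> list:
    """Create empty context files for the workspace layout.

    Existing files are left untouched. Returns the files that were created.
    """
    files = [
        workspace / "CLAUDE.md",             # Layer 0: router
        workspace / "CONTEXT.md",            # Layer 1: session context
        workspace / "reference" / "voice.md",  # Layer 2: brand voice
    ]
    # One Layer 2 CONTEXT.md per stage directory.
    files += [workspace / stage / "CONTEXT.md" for stage in STAGES]

    created = []
    for path in files:
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(path)
    return created
```

Running it twice is safe: the second call creates nothing and returns an empty list, so it will not overwrite context files that have already been written.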

Layer 0 (CLAUDE.md): A 10-line router file that says "you're in the tech blog workspace; which stage are you in?" This stays tiny and stable.

Layer 1 (CONTEXT.md at root): Session-wide context—tool definitions, workspace rules, file structure, and how to navigate between stages. Loaded once per session.

Layer 2 (reference/voice.md and stage-specific CONTEXT.md files): Domain knowledge and rules scoped to what you're doing right now. A writer in stage 03-draft never pays for stage 02-research's source indexing rules.



Layers 3 & 4 (reserved): Deployment safety, pricing, and guardrails stay in the root /repos/sites/queenofsandiego.com/reference/ tree, loaded *only* when publishing (stage 04).
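The layer rules above can be summarized as a small lookup: given the current stage, which files should be in context? The sketch below is illustrative only. The paths come from this post, but `files_to_load` is a hypothetical helper, and it returns the site-level reference directory as a whole because the individual Layer 3 and 4 files are not named here.

```python
from pathlib import PurePosixPath
from typing import List, Optional

# Paths as described in this post.
WORKSPACE = PurePosixPath("/repos/workspaces/tech-blog")
SITE_REFERENCE = PurePosixPath("/repos/sites/queenofsandiego.com/reference")
STAGES = ("01-topic", "02-research", "03-draft", "04-publish")


def files_to_load(stage: Optional[str] = None) -> List[PurePosixPath]:
    """Return the context paths for a session, optionally scoped to a stage."""
    # Layer 0, Layer 1, and the shared voice guide load at session start.
    files = [
        WORKSPACE / "CLAUDE.md",
        WORKSPACE / "CONTEXT.md",
        WORKSPACE / "reference" / "voice.md",
    ]
    if stage is None:
        return files
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")

    # Layer 2: only the active stage's rules are added.
    files.append(WORKSPACE / stage / "CONTEXT.md")

    # Layers 3 and 4 are pulled in only when publishing.
    if stage == "04-publish":
        files.append(SITE_REFERENCE)
    return files
```

For example, `files_to_load("03-draft")` returns four paths and none of them sit under 02-research, which is the property the staged design is after.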

Token Win: Why This Matters

The 30× reduction (10,925 → ~370 tokens) isn't just about saving money. It's about clarity:


  Less irrelevant context: Claude reads fewer unrelated rules when drafting. Because guardrails load per stage, the rules in context are the ones that apply to the task at hand.
  Easier handoffs: A new writer can start in stage 01 without understanding the entire deployment pipeline.
  Reduced hallucination risk: When pricing rules and competitor data aren't constantly in-context, the model is less likely to blend them into blog posts by accident.
  Scalability: Once this pattern is proven, it generalizes to all 12+ active workflows across queenofsandiego.com, sailjada.com, and dangerouscentaur properties.


Key Design Decisions

Why separate stage CONTEXT files instead of one giant docs/ folder? Workspace tooling can load context based on which file or stage is active. Nesting a CONTEXT.md in each stage directory ties the rules to the location of the work: a user opening 03-draft/my-post.md gets the draft-specific rules, not the research rules. The goal is for this to happen implicitly rather than through manual selection.
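To make the stage-scoped lookup concrete, here is one way a helper could find the context file for whatever document is open: walk up from the file until a CONTEXT.md turns up, and stop at the workspace root. This is a sketch, not a description of how any particular tool resolves context, and `nearest_context` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def nearest_context(file_path: Path, workspace_root: Path) -> Optional[Path]:
    """Return the closest CONTEXT.md at or above file_path's directory.

    The search never leaves workspace_root. Returns None if the file is
    outside the workspace or no CONTEXT.md is found.
    """
    root = workspace_root.resolve()
    current = file_path.resolve().parent

    while True:
        # Bail out if we are not inside the workspace.
        if current != root and root not in current.parents:
            return None
        candidate = current / "CONTEXT.md"
        if candidate.is_file():
            return candidate
        if current == root:
            return None
        current = current.parent
```

With this shape, a file under 03-draft/ resolves to 03-draft/CONTEXT.md, while a file in a stage directory that has no CONTEXT.md yet falls back to the root-level one.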

Why keep reference/voice.md in the root? Brand voice is consistent across all stages. Keeping it at Layer 2 (not buried in stage directories) signals that it's universal, and a single copy reduces merge conflicts when style evolves.

Why not merge Layer 0 and Layer 1? Layer 0 is for the system: routing and file-structure metadata. Layer 1 is for humans: session context, tool descriptions, workflows. Keeping them separate means CLAUDE.md can stay at ~80 tokens (stable, rarely updated) while session context evolves without touching the router.

Measurement & Validation

Files measured in the redesigned workspace at /repos/workspaces/tech-blog/:


  CLAUDE.md: ~80 tokens (router only)
  CONTEXT.md: ~240 tokens (session-wide rules, tool defs, stage navigation)
  reference/voice.md: ~50 tokens (brand voice guide)
  Per-stage overhead: ~0 tokens (stage CONTEXT.md files exist but are lazy-loaded only when needed)


Total on session start: ~370 tokens (Layer 0 + 1 + selective Layer 2).
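If you want a rough number for your own files before reaching for a real tokenizer, a character-count heuristic gets you in the right ballpark. The sketch below assumes about four characters per token, which is a common rule of thumb for English text and not the method used for the figures in this post; actual counts depend on the model's tokenizer. Both function names are hypothetical.

```python
from pathlib import Path
from typing import Dict, Iterable

# Rule-of-thumb assumption for English prose; real tokenizers will differ.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a string, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def estimate_files(paths: Iterable[Path]) -> Dict[str, int]:
    """Estimate tokens for each file, keyed by its path as a string."""
    return {
        str(path): estimate_tokens(Path(path).read_text(encoding="utf-8"))
        for path in paths
    }
```

Treat the result as an order-of-magnitude check: it is good enough to tell an 80-token router from a 3,000-token rules file, but not to reproduce the exact figures above.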

When a user navigates to 04-publish/, publishing-specific context loads (~200 tokens), bringing the session-active context to ~570 tokens at peak. Still a 95% reduction compared to the monolithic baseline.
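The reduction figures follow directly from the numbers above. Here is the arithmetic as a short sketch; the constants are the approximate values quoted in this post, and `reduction` is just an illustrative helper.

```python
# Approximate values quoted in this post.
BASELINE = 10_925                 # monolithic always-on context
SESSION_START = 80 + 240 + 50     # CLAUDE.md + CONTEXT.md + voice.md
PUBLISH_EXTRA = 200               # 04-publish context, loaded on demand


def reduction(baseline: int, new: int):
    """Return (ratio, percent_saved) when moving from baseline to new."""
    ratio = baseline / new
    percent_saved = 100 * (1 - new / baseline)
    return ratio, percent_saved


if __name__ == "__main__":
    for label, tokens in (
        ("session start", SESSION_START),
        ("publishing peak", SESSION_START + PUBLISH_EXTRA),
    ):
        ratio, saved = reduction(BASELINE, tokens)
        print(f"{label}: {tokens} tokens, {ratio:.1f}x smaller, {saved:.1f}% saved")
```

At session start that works out to about 29.5× smaller (roughly 96.6% saved), and at the publishing peak about 19.2× smaller (roughly 94.8% saved), which is where the rounded 30× and 95% figures come from.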

What's Next

The next phase involves:


  Populate stage CONTEXT.md files with rules extracted from the old monolithic CLAUDE.md, broken into stages.
  Validate end-to-end workflow: A test writer takes a blog post from topic selection through publication, measuring wall-clock time and noting any friction points.
  Breakage log: Document any patterns the old workflow supported that this new one doesn't, and fix them.
  Generalize the playbook: Once tech-blog passes validation, apply the same 5-layer pattern to deposit/booking flows, ticket-runner workflows, and other always-on contexts.
  Automate context routing: Build a helper