Preventing Deployment Regressions: Hard Rules for S3-Backed Static Sites
Over the past 3 hours, a deployment incident on queenofsandiego.com wiped three working features by pushing a stale local index.html over a newer S3 production version. The hero JADA → BOOK NOW crossfade and the Stripe embedded checkout disappeared, and the earlier removal of the "Ranch & Coast readers" headline was undone. This post documents the root cause, the architectural safeguards we implemented, and the specific hard rules now enforced for all S3-backed deployments.
What Went Wrong
The incident had two compounding failures:
- Stale local state: The local index.html was checked out at an earlier commit than what was live in S3 production. No diff was run before editing or deploying.
- Multi-target deploy: A single command pushed both staging and prod S3 prefixes simultaneously, violating the staging-first rule. Once prod was overwritten, rolling back required manual S3 restoration.
Neither failure triggered an alert because the deployment pipeline had no enforcement mechanism — only suggestions in prior session notes that were overlooked.
Technical Details: The Deployment Architecture
Queenofsandiego.com uses a three-layer S3 + CloudFront pattern:
- S3 bucket: queenofsandiego.com (us-west-2)
- S3 prefixes (virtual directories): staging/ and / (root = production)
- CloudFront distribution: E1A2B3C4D5E6F (cache invalidation required post-deploy)
- Local source: /Users/cb/Documents/repos/sites/queenofsandiego.com/
The index.html is 3,650 lines and contains inlined CSS, JavaScript, and multiple hero sections. Each hero section is a distinct feature toggle. Because everything lives in one file, overwriting it with an old version regresses every feature added since that version, all at once.
Root Cause Analysis
The deployment command looked roughly like:
aws s3 cp index.html s3://queenofsandiego.com/staging/index.html
aws s3 cp index.html s3://queenofsandiego.com/index.html
aws cloudfront create-invalidation --distribution-id E1A2B3C4D5E6F --paths "/*"
Problems:
- No aws s3 sync or aws s3 cp s3://... . to pull the current prod version first.
- No diff between local and S3 prod before overwriting.
- Both staging and prod deployed in one session, so a mistake affected both.
- No snapshot of prod S3 state before the cp (S3 versioning is not enabled on this bucket).
- No printed proof block in chat showing the exact bytes being deployed.
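The first two problems come down to a missing comparison step. A minimal sketch of such a gate is shown here, assuming the production copy has already been pulled to a local file; check_against_prod is a hypothetical helper name, not something that exists in the site repository.

```shell
#!/bin/sh
# Sketch of a pre-deploy gate: stop when the local file and a freshly pulled
# copy of production differ, so a human reviews the diff before any upload.
# check_against_prod is a hypothetical helper, not part of the site repo.

check_against_prod() {
    local_file=$1   # working copy about to be deployed
    prod_copy=$2    # copy of production pulled moments ago

    # cmp -s is silent and exits 0 only when the files are byte-identical.
    if cmp -s "$local_file" "$prod_copy"; then
        echo "identical"
        return 0
    fi

    # Show the difference so a human can decide which side is newer.
    diff -u "$prod_copy" "$local_file"
    echo "differs"
    return 1
}

# Intended use (needs AWS credentials, so it is left commented out):
#   aws s3 cp s3://queenofsandiego.com/index.html ./index.html.prod
#   check_against_prod index.html index.html.prod || exit 1
```

The function only reports that the files differ. It cannot tell which side is newer, so a non-zero exit is a prompt for a human to read the diff, not a verdict.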
The Fix: Eight Hard Rules (D1–D8)
We encoded these rules into /Users/cb/Documents/repos/sites/queenofsandiego.com/CLAUDE.md, which auto-loads for every session touching this site:
- D1: Pull S3 and diff first. Before any local edit on a file that lives in S3, run:
    aws s3 cp s3://queenofsandiego.com/index.html ./index.html.prod
    diff -u index.html.prod index.html
  Compare local against production. If S3 is ahead, abort and escalate to CB.
- D2: Staging-only deploys, one target at a time. Never deploy to both staging and prod in one command sequence. Deploy to staging, verify, then manually promote to prod only after CB approval.
- D3: One logical change per deployment. If you're adding the Stripe checkout flow, deploy only that. Don't bundle an unrelated hero text change. Separate commits, separate deploys.
- D4: Obey session-summary warnings. If a prior session ended with "stale local files detected" or "S3 ahead of git," run the pull-and-diff check before proceeding.
- D5: Snapshot prod before overwriting. Since S3 versioning is disabled on this bucket, create a dated backup before any cp:
    aws s3 cp s3://queenofsandiego.com/index.html s3://queenofsandiego.com/backups/index.html.$(date +%Y%m%d_%H%M%S)
  This is not a replacement for versioning, but a circuit breaker for emergency rollback.
- D6: Print a six-line proof block before any cp. Before pushing to S3, output:
    # Deploying index.html to s3://queenofsandiego.com/staging/
    # MD5: [hash of local file]
    # Size: [bytes]
    # Features included: JADA_BOOK_NOW_FADE, STRIPE_CHECKOUT, [list others]
    # Target: staging/ only
    # Proceeding in 5 seconds...
  This creates an auditable record in the chat and forces a pause to double-check.
- D7: Maintain a feature-token registry. In index.html, mark each feature with a comment token containing FEATURE: and the feature name. Before deploying, grep the local file and the current S3 version:
    grep "FEATURE:" index.html
    aws s3 cp s3://queenofsandiego.com/index.html - | grep "FEATURE:"
  If a token present in S3 is missing from the local file, halt the deployment.
- D8: Escalate to CB if S3 is ahead of local. If the pull-and-diff shows S3 has changes not in local, stop and message CB before touching anything. This catches stale git checkouts and merge conflicts.
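Rules D6 and D7 lend themselves to small local helpers. The sketch below is one possible shape, not the site's actual tooling. It assumes each token is an HTML comment of the form `<!-- FEATURE: NAME -->` with at most one token per line, which is a guess at the syntax. The function names list_tokens, missing_tokens and proof_block are hypothetical.

```shell
#!/bin/sh
# Sketch of D6 (proof block) and D7 (token check) as local helpers.
# Assumption: tokens look like "FEATURE: NAME", at most one per line.

# List the feature names found in a file, one per line, sorted and unique.
list_tokens() {
    sed -n 's/.*FEATURE: *\([A-Z0-9_][A-Z0-9_]*\).*/\1/p' "$1" | sort -u
}

# Print tokens present in the production copy but absent from the local file.
# Exit status 0 means nothing is missing; 1 means at least one token vanished.
missing_tokens() {
    local_file=$1
    prod_copy=$2
    tmp_local=$(mktemp)
    tmp_prod=$(mktemp)
    list_tokens "$local_file" > "$tmp_local"
    list_tokens "$prod_copy" > "$tmp_prod"
    # comm -13 keeps lines that appear only in the second (production) list.
    missing=$(comm -13 "$tmp_local" "$tmp_prod")
    rm -f "$tmp_local" "$tmp_prod"
    if [ -n "$missing" ]; then
        echo "$missing"
        return 1
    fi
    return 0
}

# Print the six-line proof block for a file, bucket URL and target prefix.
proof_block() {
    file=$1
    bucket_url=$2
    prefix=$3
    size=$(wc -c < "$file" | tr -d ' ')
    # md5sum is GNU coreutils; macOS ships `md5 -q` instead.
    if command -v md5sum >/dev/null 2>&1; then
        digest=$(md5sum "$file" | cut -d' ' -f1)
    else
        digest=$(md5 -q "$file")
    fi
    features=$(list_tokens "$file" | paste -sd, - | sed 's/,/, /g')
    echo "# Deploying $(basename "$file") to $bucket_url/$prefix"
    echo "# MD5: $digest"
    echo "# Size: $size"
    echo "# Features included: $features"
    echo "# Target: $prefix only"
    echo "# Proceeding in 5 seconds..."
}
```

A deploy wrapper could call `missing_tokens index.html index.html.prod || exit 1` after the D1 pull, then `proof_block index.html s3://queenofsandiego.com staging/` just before the copy to staging.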
Infrastructure Changes: S3 Bucket Policy & CloudFront
No infrastructure changes were needed; these are deployment discipline rules. However, we recommend:
- Enable S3 versioning on the queenofsandiego.com bucket (one-time, ~$0.03/month for 100 versions). This gives a one-click rollback if rules are violated.
- Create a CloudFront origin access identity (OAI) and restrict the S3 bucket policy to the OAI only. This prevents accidental public overwrites via direct S3 API calls.
- Add a CloudFront cache behavior rule for /backups/* with TTL = 0 (never cache backups), so rollbacks are instant.
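The versioning recommendation can be sketched with the AWS CLI's s3api subcommands. The put-bucket-versioning, get-bucket-versioning and list-object-versions calls are standard CLI operations. The run wrapper and the DRY_RUN switch are hypothetical conveniences added here so the commands can be previewed without touching the bucket.

```shell
#!/bin/sh
# Sketch: turn on bucket versioning and list stored versions of a key.
# run() and DRY_RUN are hypothetical; with DRY_RUN=1 commands are only printed.
BUCKET=${BUCKET:-queenofsandiego.com}
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Enable versioning, then read the setting back to confirm it took effect.
enable_versioning() {
    run aws s3api put-bucket-versioning \
        --bucket "$BUCKET" \
        --versioning-configuration Status=Enabled
    run aws s3api get-bucket-versioning --bucket "$BUCKET"
}

# List every stored version of keys under the given prefix, e.g. index.html.
list_versions() {
    run aws s3api list-object-versions --bucket "$BUCKET" --prefix "$1"
}
```

Calling `enable_versioning` with the default DRY_RUN=1 prints the two commands. Setting DRY_RUN=0 executes them and requires credentials that are allowed to change the bucket's versioning configuration.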
Key Decisions
Why staging-first?