Preventing S3 Deployment Regressions: A Case Study in Multi-Environment Safety
Last week, a deployment incident on queenofsandiego.com wiped three working features by syncing a stale local index.html over production S3, bypassing staging and ignoring prior session warnings. This post documents the root cause, the hard rules we've now codified to prevent recurrence, and the infrastructure patterns that should have caught it.
The Incident: What Happened
A session deployed to two S3 buckets, staging and production, using a single command. The local index.html (last pulled more than 24 hours earlier) overwrote the production version at s3://queenofsandiego.com/index.html, undoing three pieces of work:
- The hero section JADA → BOOK NOW crossfade animation (CSS + JavaScript state machine)
- The Stripe embedded checkout integration (iframe init + sessionToken handling)
- The deliberate removal of the "For Ranch & Coast readers..." hero variant (the stale file resurrected it from an earlier commit)
The deployment also violated an existing rule already documented in session memory: staging-only deploys, with manual promotion to production after review.
Root Cause: Missing Safeguards in Deployment Flow
The failure chain:
- No pre-deploy diff: The session did not pull current S3 state and compare before overwriting.
- Multi-target single command: aws s3 cp targeted both staging.sailjada.com and queenofsandiego.com in one operation, violating the single-responsibility principle for deployments.
- Ignored prior warnings: The session summary from the previous interaction explicitly noted that local files were stale relative to S3.
- No feature registry: There was no canonical list of active features to grep the current S3 bundle against before deploy.
- No snapshot before overwrite: S3 versioning was not enabled, so once the file was overwritten the previous production version was gone.
Technical Safeguards Implemented
We've now added eight hard rules (D1–D8) to /Users/cb/Documents/repos/sites/queenofsandiego.com/CLAUDE.md, auto-loaded at session start:
D1: Pull and Diff Before Edit
aws s3 cp s3://queenofsandiego.com/index.html ./index.html.prod
diff -u index.html.prod index.html
Every session must fetch the current production version and display a unified diff before making changes. This surfaces stale local copies immediately.
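One way to make this check hard to skip is to wrap it in a function that stops the session when the two files differ. This is a minimal sketch, not the exact tooling in use: the function name require_fresh_local is hypothetical, and it assumes the production copy has already been fetched to index.html.prod with the aws s3 cp command above.

```shell
# Hypothetical helper (not part of CLAUDE.md): refuse to continue when the
# local file differs from the production copy fetched in the previous step.
# Usage: require_fresh_local <prod-copy> <local-file>
require_fresh_local() {
  # cmp -s is silent and exits 0 only when the files are byte-identical.
  if cmp -s "$1" "$2"; then
    echo "OK: $2 matches production"
    return 0
  fi
  # Show what a deploy of the local file would change in production.
  diff -u "$1" "$2"
  echo "STOP: $2 differs from production; review the diff before editing" >&2
  return 1
}
```

Called as require_fresh_local index.html.prod index.html at session start, a nonzero exit means the local copy is stale (or S3 has moved) and editing should not begin.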
D2: Staging-Only Single-Target Deploys
No command may deploy to both staging and production in one invocation. Syntax:
aws s3 cp ./index.html s3://staging.sailjada.com/index.html # STAGING ONLY
# After manual review:
aws s3 cp ./index.html s3://queenofsandiego.com/index.html # PROD ONLY, separate command
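A small guard can enforce the one-target rule mechanically before any copy runs. The sketch below is illustrative only: check_deploy_target and the --prod-approved flag are hypothetical names, not existing tooling, and the bucket URIs are the ones used in this post.

```shell
# Hypothetical guard: accept exactly one target per invocation, and require
# an explicit flag before a production target is allowed.
# Usage: check_deploy_target <s3-uri> [--prod-approved]
PROD_BUCKET="s3://queenofsandiego.com"
STAGING_BUCKET="s3://staging.sailjada.com"

check_deploy_target() {
  # Valid shapes: one URI, or one URI followed by --prod-approved.
  # Anything else (including two URIs) is rejected.
  if [ "$#" -eq 1 ]; then
    :
  elif [ "$#" -eq 2 ] && [ "$2" = "--prod-approved" ]; then
    :
  else
    echo "STOP: pass exactly one target per command" >&2
    return 1
  fi
  case "$1" in
    "$STAGING_BUCKET"/*)
      echo "staging" ;;
    "$PROD_BUCKET"/*)
      if [ "$#" -ne 2 ]; then
        echo "STOP: production deploy needs --prod-approved" >&2
        return 1
      fi
      echo "production" ;;
    *)
      echo "STOP: unknown target $1" >&2
      return 1 ;;
  esac
}
```

A deploy script could run check_deploy_target "$target" and proceed to aws s3 cp only when it succeeds, which keeps promotion to production a separate, deliberate step.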
D3: One File Per Logical Change
Each deploy commit touches exactly one user-facing file (e.g., index.html) or one backend resource (e.g., a single GAS project). Multi-file deploys are escalated to CB.
D4: Honor Prior Session Warnings
If a prior session summary flagged stale local state, the current session must pull fresh from S3 before proceeding. This is non-negotiable.
D5: Snapshot Production Before Overwrite
Before any cp to production S3, save the current version locally:
aws s3 cp s3://queenofsandiego.com/index.html ./backups/index.html.$(date +%s)
# Only then proceed with new deploy
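Because the snapshot is the only rollback path, it helps to verify that it actually landed before the deploy continues. A minimal sketch, assuming the same aws s3 cp call as above; the wrapper name snapshot_prod is hypothetical.

```shell
# Hypothetical wrapper around the snapshot step: fetch the production file
# into a backup directory under a timestamped name, then verify the copy
# exists and is non-empty before reporting its path.
# Usage: snapshot_prod <s3-uri> [backup-dir]
snapshot_prod() {
  local src dir dest
  src="$1"
  dir="${2:-./backups}"
  dest="$dir/$(basename "$src").$(date +%s)"
  mkdir -p "$dir" || return 1
  if ! aws s3 cp "$src" "$dest" >/dev/null; then
    echo "STOP: snapshot of $src failed" >&2
    return 1
  fi
  # -s is true only for a file that exists and has a size greater than zero.
  if [ ! -s "$dest" ]; then
    echo "STOP: snapshot $dest is missing or empty" >&2
    return 1
  fi
  echo "$dest"
}
```

The printed path can be pasted into the session log so that a rollback is a single aws s3 cp in the opposite direction.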
D6: Proof Block Before Deploy
Print a six-line proof block in chat before any cp command:
FILE: index.html
TARGET: s3://queenofsandiego.com/ (PRODUCTION)
SIZE: 3,650 bytes
CONTAINS: [hero crossfade, Stripe checkout, BOOK NOW CTA]
EXCLUDES: [Ranch & Coast hero, stale ref params]
USER APPROVAL: [awaiting explicit go-ahead]
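Generating the block from the file itself, rather than typing it by hand, keeps the SIZE and TARGET lines honest. The following is a sketch under assumptions: proof_block is a hypothetical helper, the CONTAINS and EXCLUDES lists are supplied by the operator rather than derived, and the size is printed without a thousands separator.

```shell
# Hypothetical helper: print the six-line proof block for a file and target.
# Usage: proof_block <file> <s3-target> <contains-list> <excludes-list>
proof_block() {
  local size label
  # wc -c pads its output on some systems, so strip whitespace.
  size=$(wc -c < "$1" | tr -d '[:space:]')
  case "$2" in
    s3://queenofsandiego.com/*)  label="PRODUCTION" ;;
    s3://staging.sailjada.com/*) label="STAGING" ;;
    *)                           label="UNKNOWN" ;;
  esac
  printf 'FILE: %s\n' "$(basename "$1")"
  printf 'TARGET: %s (%s)\n' "$2" "$label"
  printf 'SIZE: %s bytes\n' "$size"
  printf 'CONTAINS: [%s]\n' "$3"
  printf 'EXCLUDES: [%s]\n' "$4"
  printf 'USER APPROVAL: [awaiting explicit go-ahead]\n'
}
```

For example, proof_block ./index.html s3://queenofsandiego.com/ "hero crossfade, Stripe checkout, BOOK NOW CTA" "Ranch & Coast hero, stale ref params" reproduces the layout shown above with the real byte count.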
D7: Feature Token Registry
Maintain a FEATURES.registry file (currently in memory) listing all active user-facing features by token string:
FEATURE_JADA_HERO_FADE=true
FEATURE_STRIPE_CHECKOUT=true
FEATURE_RANCH_COAST_HERO=false # intentionally removed
FEATURE_BOOKING_REFERRAL=pending
Before deployment, grep the bundle-to-deploy against this registry and report any missing features as FAIL.
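The grep step could look roughly like the sketch below. It rests on an assumption the registry format alone does not guarantee: that each feature leaves its token string verbatim in the bundle (for instance in an HTML comment such as <!-- FEATURE_STRIPE_CHECKOUT -->). The function name check_features is hypothetical.

```shell
# Hypothetical check: every feature marked true must appear in the bundle,
# and every feature marked false must not. Other states (e.g. pending) are
# skipped. ASSUMPTION: the bundle contains each token verbatim as a marker.
# Usage: check_features <registry-file> <bundle-file>
check_features() {
  local token rest state present status
  status=0
  while IFS='=' read -r token rest || [ -n "$token" ]; do
    # Skip blank lines and full-line comments.
    case "$token" in ''|'#'*) continue ;; esac
    # Drop a trailing "# comment" and any whitespace from the value.
    state=$(printf '%s' "${rest%%#*}" | tr -d '[:space:]')
    # -w avoids matching a token that is a prefix of a longer token.
    if grep -qwF "$token" "$2"; then present=yes; else present=no; fi
    case "$state:$present" in
      true:no)   echo "FAIL: $token is active but missing from $2"; status=1 ;;
      false:yes) echo "FAIL: $token was removed but is present in $2"; status=1 ;;
    esac
  done < "$1"
  return "$status"
}
```

Run against the incident's stale file, a check like this would have reported both the missing Stripe checkout and the resurrected Ranch & Coast hero before anything reached S3.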
D8: Escalate on S3-Ahead State
If S3 is ahead of local (i.e., the bucket holds changes that are not in the working tree), escalate to CB before proceeding. This is a sign of parallel work or a missed pull.
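Telling "S3 moved" apart from "local was edited" needs a baseline. One possible approach, sketched here with hypothetical helper names, is to record a checksum of the production file at each pull and compare the next fetch against it. cksum is a CRC, which is adequate for spotting change but is not a security measure.

```shell
# Hypothetical helpers: remember what production looked like at the last
# pull, then detect whether a fresh production copy differs from that record.

# Usage: record_baseline <prod-copy> <baseline-file>
record_baseline() {
  # Reading from stdin makes cksum print "checksum size" with no filename.
  cksum < "$1" > "$2"
}

# Usage: s3_is_ahead <fresh-prod-copy> <baseline-file>
# Exits 0 when production changed since the baseline was recorded.
s3_is_ahead() {
  [ "$(cksum < "$1")" != "$(cat "$2")" ]
}
```

If s3_is_ahead succeeds, someone else has deployed since the last pull, and the session should stop and escalate rather than merge by hand.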
Infrastructure: Deployment Targets and Tooling
Current S3 buckets and CloudFront distributions:
- s3://queenofsandiego.com (production) → CloudFront distribution D*PROD* (origin: S3 bucket root)
- s3://staging.sailjada.com (staging) → CloudFront distribution D*STAGING* (separate origin)
- s3://sailjada.com (sailJADA production) → CloudFront distribution D*JADA*
Invalidation on deploy:
aws cloudfront create-invalidation --distribution-id D*PROD* --paths "/*"
The Route53 zone queenofsandiego.com points both the www.queenofsandiego.com CNAME and the root alias record at the CloudFront distribution. Versioning is not currently enabled on the S3 buckets, so the local snapshots from D5 are our only rollback layer for now.
Key Decision: Why Not Lambda or CI/CD?
Why not automate this with AWS CodePipeline or GitHub Actions? The answer comes down to scope and risk:
- Human review is essential: These are small, high-value properties (booking sites, event platforms). A broken deploy directly loses revenue.
- Complexity is low: Single-file HTML pushes don't justify pipeline overhead.
- Rules are cheaper: Codifying deployment discipline into memory + a checklist is faster to deploy and debug than CI/CD plumbing.
As volume grows, this may change, but for now human-gated S3 deploys with hard rules are the right tradeoff.