
Building a Comprehensive Infrastructure Snapshot: Lessons from a Full-Stack Rollback Recovery

When unexpected infrastructure changes roll back weeks of work, the response needs to be systematic and complete. This post documents the technical approach used to create a comprehensive v1.0 snapshot of a distributed three-site platform spanning AWS S3, CloudFront, Lambda, Route53, Google Apps Script, and local tooling infrastructure.

The Problem: Scope of Loss

The original incident affected three interconnected sites:

  • queenofsandiego.com — primary site with event pages, brand styling, navigation
  • sailjada.com — product catalog and listing pages
  • salejada.com — secondary product domain

Work was lost across multiple systems simultaneously: S3 bucket contents, CloudFront cache states, Google Apps Script project versions, and local development files. A point-in-time snapshot across all systems became critical.

Infrastructure Inventory: What Had to Be Captured

Before snapshotting, we enumerated the full infrastructure footprint:

  • S3 Buckets: 46 total across production, staging, and archive tiers
    • Primary: queenofsandiego.com, sailjada.com, salejada.com
    • Staging mirrors: queenofsandiego.com-staging, sailjada-staging, salejada-staging
    • Archive/backup buckets following naming convention *-backup, *-archive
    • Lambda deployment packages and build artifacts
  • CloudFront Distributions: 66 total
    • Primary distributions for each domain
    • Staging distribution origins pointing to -staging buckets
    • Legacy distributions (Bob Dylan catalog, Manager Candy support)
    • All origin configurations, cache behaviors, invalidation patterns
  • Lambda Functions: 21 total
    • Edge functions for header injection, authentication, request routing
    • Origin request/response handlers
    • Viewer request/response processors
    • Environment variables, IAM execution roles, VPC configurations
  • Route53 Hosted Zones: 16 total
    • DNS records, alias configurations, health checks
    • CNAME routing to CloudFront distributions
    • MX records for SES integration
  • Google Apps Script Projects: 4 total
    • Main JADA project (clasp ID tracking)
    • Rady Shell replacement automation
    • Rady Shell legacy version
    • EYD (Events and Yachts Database) project
  • Supporting AWS Services
    • DynamoDB tables (14 identified) for data persistence
    • SES configuration for email delivery
    • API Gateway endpoints
    • ACM certificates
    • IAM roles and policies
  • Local Development Infrastructure
    • /Users/cb/Documents/repos/ — primary repository root
    • tools/ directory containing update_dashboard.py, release.py
    • LaunchAgent configurations for background automation
    • Handoff documentation and runbooks
    • Environment configuration files (.env files tracked securely)

Snapshot Strategy: Parallel Execution

Given the volume of data and number of API calls required, we implemented four parallel agents to avoid timeout issues and maximize throughput:

# Agent 1: S3 Bucket Synchronization
# Sync all 46 S3 buckets to local snapshot directory
aws s3 sync s3://[bucket-name]/ /snapshot/v1.0/s3/[bucket-name]/ \
  --region us-west-2 \
  --no-progress \
  > /tmp/s3-sync-[bucket-name].log 2>&1 &

This agent synced all 46 buckets in batches of 8-10 concurrent operations, capturing approximately 68 MB of static assets: HTML, CSS, JavaScript, and images. File counts were verified after each batch to ensure no objects were skipped.
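The batching logic can be sketched as a small driver. The bucket names and snapshot root below are placeholders, and DRY_RUN defaults to printing the plan rather than issuing real aws calls:

```shell
#!/bin/sh
# Hypothetical sketch of the batched sync driver; bucket names and
# SNAPSHOT_ROOT are placeholders, not the real inventory.
SNAPSHOT_ROOT="${SNAPSHOT_ROOT:-/tmp/snapshot/v1.0/s3}"
BATCH_SIZE=8
DRY_RUN="${DRY_RUN:-1}"   # set to 0 to issue real aws calls

sync_bucket() {
  bucket="$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "would sync s3://$bucket/ -> $SNAPSHOT_ROOT/$bucket/"
  else
    aws s3 sync "s3://$bucket/" "$SNAPSHOT_ROOT/$bucket/" \
      --region us-west-2 --no-progress \
      > "/tmp/s3-sync-$bucket.log" 2>&1
  fi
}

i=0
for bucket in queenofsandiego.com sailjada.com salejada.com; do
  sync_bucket "$bucket" &
  i=$((i + 1))
  if [ $((i % BATCH_SIZE)) -eq 0 ]; then
    wait   # drain the current batch before launching more
  fi
done
wait
```

Backgrounding each sync and calling wait every BATCH_SIZE jobs caps concurrency without needing an external job queue.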

# Agent 2: Lambda Function Export
# Export code, configuration, and environment for all 21 Lambda functions
aws lambda get-function \
  --function-name [function-name] \
  --region us-west-2 \
  > /snapshot/v1.0/lambda/[function-name]/config.json

aws lambda get-function-code-signing-config \
  --function-name arn:aws:lambda:us-west-2:[account-id]:function:[function-name] \
  > /snapshot/v1.0/lambda/[function-name]/signing-config.json

For each function, we captured: deployment package ZIP, environment variables, VPC configuration, execution role, layers, tags, and code signing configuration.
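The deployment package itself is not in the JSON output: get-function returns a short-lived presigned URL under Code.Location, so the ZIP has to be fetched promptly. A hedged sketch of a per-function exporter (the function name passed in is a placeholder):

```shell
#!/bin/sh
# Hypothetical sketch: export one function's config and pull its
# deployment package via the presigned URL that get-function returns
# under Code.Location (the URL expires after a few minutes).
export_function() {
  fn="$1"
  out="/snapshot/v1.0/lambda/$fn"
  mkdir -p "$out"
  aws lambda get-function --function-name "$fn" --region us-west-2 \
    > "$out/config.json"
  url=$(aws lambda get-function --function-name "$fn" --region us-west-2 \
          --query 'Code.Location' --output text)
  curl -sf -o "$out/package.zip" "$url"
}
```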

# Agent 3: AWS Infrastructure Export
# CloudFront distributions with all cache behaviors and origins
aws cloudfront list-distributions --query 'DistributionList.Items' \
  > /snapshot/v1.0/cloudfront/distributions-list.json

aws cloudfront get-distribution-config \
  --id [distribution-id] \
  > /snapshot/v1.0/cloudfront/[distribution-id]-config.json

# Route53 zones and record sets
aws route53 list-hosted-zones \
  > /snapshot/v1.0/route53/hosted-zones.json

aws route53 list-resource-record-sets \
  --hosted-zone-id [zone-id] \
  > /snapshot/v1.0/route53/[zone-id]-records.json

This agent also captured DynamoDB table schemas, SES domain configurations, API Gateway endpoints, and ACM certificate metadata.
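The DynamoDB portion follows the same list-then-describe pattern as CloudFront and Route53. A minimal sketch, assuming the tables live in us-west-2:

```shell
#!/bin/sh
# Hypothetical sketch: dump each DynamoDB table's schema (key
# definitions, indexes, throughput) next to the other infra exports.
export_table_schemas() {
  mkdir -p /snapshot/v1.0/dynamodb
  aws dynamodb list-tables --region us-west-2 \
      --query 'TableNames[]' --output text | tr '\t' '\n' |
  while read -r table; do
    aws dynamodb describe-table --table-name "$table" --region us-west-2 \
      > "/snapshot/v1.0/dynamodb/$table-schema.json"
  done
}
```

describe-table captures schema and settings only; the 14 tables' row data would need a separate export or on-demand backup.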

# Agent 4: Google Apps Script and Local Files
# Pull latest versions from all GAS projects via clasp
# clasp pull reads the script ID from each project's .clasp.json,
# so run it from inside each checked-out project directory
(cd [jada-main-dir] && clasp pull)
(cd [rady-replacement-dir] && clasp pull)
(cd [rady-legacy-dir] && clasp pull)
(cd [eyd-dir] && clasp pull)

# Copy to snapshot with source tracking
cp -r /path/to/gas/projects /snapshot/v1.0/gas/

Local development files were captured from /Users/cb/Documents/repos/, including all source code, tooling scripts, and configuration templates (with secrets redacted).
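The redaction step can be as simple as blanking everything after the `=` in each .env line, so key names survive as a rebuildable template. A sketch (the REDACTED placeholder is a convention of this example, not of the original tooling):

```shell
#!/bin/sh
# Hypothetical sketch: emit a copy of a .env file with values blanked,
# keeping only the key names.
redact_env() {
  # replace everything after the first '=' with a placeholder;
  # comments and blank lines pass through untouched
  sed 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1=REDACTED/' "$1"
}
```

Because the pattern anchors on a KEY= prefix, comment lines and blank lines are left unchanged.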

Lightsail Instance Snapshot

In parallel with the data exports, we initiated an AWS Lightsail instance snapshot named jada-agent-v1.0-20260509. This captures the full filesystem, running processes, and system state of any Lightsail-based infrastructure, providing a recovery point independent of S3 and