Building a Comprehensive Infrastructure Snapshot: Lessons from a Full-Stack Rollback Recovery
When unexpected infrastructure changes roll back weeks of work, the response needs to be systematic and complete. This post documents the technical approach used to create a comprehensive v1.0 snapshot of a distributed three-site platform spanning AWS S3, CloudFront, Lambda, Route53, Google Apps Script, and local tooling infrastructure.
The Problem: Scope of Loss
The original incident affected three interconnected sites:
- queenofsandiego.com — primary site with event pages, brand styling, navigation
- sailjada.com — product catalog and listing pages
- salejada.com — secondary product domain
Work was lost across multiple systems simultaneously: S3 bucket contents, CloudFront cache states, Google Apps Script project versions, and local development files. A point-in-time snapshot across all systems became critical.
Infrastructure Inventory: What Had to Be Captured
Before snapshotting, we enumerated the full infrastructure footprint (a counting sketch follows this list):
- S3 Buckets: 46 total across production, staging, and archive tiers
  - Primary: queenofsandiego.com, sailjada.com, salejada.com
  - Staging mirrors: queenofsandiego.com-staging, sailjada-staging, salejada-staging
  - Archive/backup buckets following the *-backup, *-archive naming convention
  - Lambda deployment packages and build artifacts
- CloudFront Distributions: 66 total
  - Primary distributions for each domain
  - Staging distribution origins pointing to -staging buckets
  - Legacy distributions (Bob Dylan catalog, Manager Candy support)
  - All origin configurations, cache behaviors, invalidation patterns
- Lambda Functions: 21 total
  - Edge functions for header injection, authentication, request routing
  - Origin request/response handlers
  - Viewer request/response processors
  - Environment variables, IAM execution roles, VPC configurations
- Route53 Hosted Zones: 16 total
  - DNS records, alias configurations, health checks
  - CNAME routing to CloudFront distributions
  - MX records for SES integration
- Google Apps Script Projects: 4 total
  - Main JADA project (clasp ID tracking)
  - Rady Shell replacement automation
  - Rady Shell legacy version
  - EYD (Events and Yachts Database) project
- Supporting AWS Services
  - DynamoDB tables (14 identified) for data persistence
  - SES configuration for email delivery
  - API Gateway endpoints
  - ACM certificates
  - IAM roles and policies
- Local Development Infrastructure
  - /Users/cb/Documents/repos/ (primary repository root)
  - tools/ directory containing update_dashboard.py and release.py
  - LaunchAgent configurations for background automation
  - Handoff documentation and runbooks
  - Environment configuration files (.env files tracked securely)
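A minimal counting sketch for that enumeration, assuming default AWS credentials and the us-west-2 region used throughout (the JMESPath length() queries return totals only):
# Count resources per service before capturing any state
aws s3api list-buckets --query 'length(Buckets)'
aws cloudfront list-distributions --query 'length(DistributionList.Items)'
aws lambda list-functions --region us-west-2 --query 'length(Functions)'
aws route53 list-hosted-zones --query 'length(HostedZones)'
aws dynamodb list-tables --region us-west-2 --query 'length(TableNames)'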
Snapshot Strategy: Parallel Execution
Given the volume of data and number of API calls required, we implemented four parallel agents to avoid timeout issues and maximize throughput:
# Agent 1: S3 Bucket Synchronization
# Sync all 46 S3 buckets to local snapshot directory
aws s3 sync s3://[bucket-name]/ /snapshot/v1.0/s3/[bucket-name]/ \
--region us-west-2 \
--no-progress \
> /tmp/s3-sync-[bucket-name].log 2>&1 &
This agent synced all buckets in batches of 8-10 concurrent operations, totaling approximately 68MB of static assets, HTML files, CSS, JavaScript, and images. File count verification was performed after each batch to ensure no objects were skipped.
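A minimal sketch of that batch loop, assuming a buckets.txt file listing one bucket per line (the file name and the concurrency cap of 8 are illustrative; wait -n requires bash 4.3+):
# Sync every bucket, keeping at most 8 jobs in flight
while read -r bucket; do
  aws s3 sync "s3://${bucket}/" "/snapshot/v1.0/s3/${bucket}/" \
    --region us-west-2 --no-progress \
    > "/tmp/s3-sync-${bucket}.log" 2>&1 &
  while [ "$(jobs -rp | wc -l)" -ge 8 ]; do wait -n; done  # throttle
done < buckets.txt
wait  # block until the final batch drains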
# Agent 2: Lambda Function Export
# Export code, configuration, and environment for all 21 Lambda functions
aws lambda get-function \
--function-name [function-name] \
--region us-west-2 \
> /snapshot/v1.0/lambda/[function-name]/config.json
aws lambda get-function-code-signing-config \
  --function-name arn:aws:lambda:us-west-2:[account-id]:function:[function-name] \
  > /snapshot/v1.0/lambda/[function-name]/signing-config.json
For each function, we captured: deployment package ZIP, environment variables, VPC configuration, execution role, layers, tags, and code signing configuration.
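Note that get-function returns a presigned URL for the deployment package rather than the package bytes themselves, so the ZIP has to be fetched separately; a short sketch, assuming curl is available:
# Extract the presigned Code.Location URL, then download the package
url=$(aws lambda get-function \
  --function-name [function-name] \
  --region us-west-2 \
  --query 'Code.Location' --output text)
curl -sSL "$url" -o /snapshot/v1.0/lambda/[function-name]/package.zip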
# Agent 3: AWS Infrastructure Export
# CloudFront distributions with all cache behaviors and origins
aws cloudfront list-distributions --query 'DistributionList.Items' \
> /snapshot/v1.0/cloudfront/distributions-list.json
aws cloudfront get-distribution-config \
--id [distribution-id] \
> /snapshot/v1.0/cloudfront/[distribution-id]-config.json
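# To avoid hand-feeding 66 IDs, the per-distribution export can be driven
# from the list call above (a sketch; IDs are read via a JMESPath query)
for id in $(aws cloudfront list-distributions \
    --query 'DistributionList.Items[].Id' --output text); do
  aws cloudfront get-distribution-config --id "$id" \
    > "/snapshot/v1.0/cloudfront/${id}-config.json"
done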
# Route53 zones and record sets
aws route53 list-hosted-zones \
> /snapshot/v1.0/route53/hosted-zones.json
aws route53 list-resource-record-sets \
--hosted-zone-id [zone-id] \
> /snapshot/v1.0/route53/[zone-id]-records.json
This agent also captured DynamoDB table schemas, SES domain configurations, API Gateway endpoints, and ACM certificate metadata.
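The equivalent calls for those supporting services, sketched with illustrative output paths (note that ACM certificates used by CloudFront live in us-east-1):
# Table schemas, SES identities, API Gateway inventory, certificate metadata
aws dynamodb describe-table --table-name [table-name] --region us-west-2 \
  > /snapshot/v1.0/dynamodb/[table-name]-schema.json
aws ses list-identities --region us-west-2 \
  > /snapshot/v1.0/ses/identities.json
aws apigateway get-rest-apis --region us-west-2 \
  > /snapshot/v1.0/apigateway/rest-apis.json
aws acm list-certificates --region us-east-1 \
  > /snapshot/v1.0/acm/certificates.json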
# Agent 4: Google Apps Script and Local Files
# Pull the latest version of each GAS project via clasp
# (clasp clone fetches a project by script ID into the given root directory)
clasp clone [JADA-main-project-id] --rootDir jada-main
clasp clone [rady-replacement-project-id] --rootDir rady-replacement
clasp clone [rady-legacy-project-id] --rootDir rady-legacy
clasp clone [eyd-project-id] --rootDir eyd
# Copy to snapshot with source tracking
cp -r /path/to/gas/projects /snapshot/v1.0/gas/
Local development files were captured from /Users/cb/Documents/repos/, including all source code, tooling scripts, and configuration templates (with secrets redacted).
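A sketch of that local capture, with illustrative exclude patterns standing in for the actual redaction list:
# Mirror the repo root into the snapshot, leaving secret material behind
rsync -a \
  --exclude='.env' --exclude='*.pem' --exclude='node_modules/' \
  /Users/cb/Documents/repos/ /snapshot/v1.0/local/repos/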
Lightsail Instance Snapshot
In parallel with the data exports, we initiated an AWS Lightsail instance snapshot named jada-agent-v1.0-20260509. This captures the full filesystem and on-disk system state of any Lightsail-based infrastructure (in-memory state and running processes are not preserved), providing a recovery point independent of the S3 and API-level exports above.
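The snapshot call itself, with the instance name left as a placeholder:
# Disk-level recovery point for the Lightsail instance
aws lightsail create-instance-snapshot \
  --region us-west-2 \
  --instance-name [instance-name] \
  --instance-snapshot-name jada-agent-v1.0-20260509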