Building a Comprehensive Infrastructure Snapshot: Lessons from a Multi-Service Disaster Recovery Exercise
When working with complex, distributed systems spanning multiple AWS services, Google Apps Script, and static hosting infrastructure, a single mistake can cascade across your entire deployment. This post details how we executed a complete v1.0 infrastructure snapshot across three production sites (queenofsandiego.com, sailjada.com, salejada.com) and their associated AWS ecosystem—and why this snapshot strategy should be part of your standard ops workflow.
The Problem: Why Snapshots Matter
Our JADA ecosystem involves:
- 45 S3 buckets (production, staging, backups, archives)
- 66 CloudFront distributions (serving various properties and subdomains)
- 21 Lambda functions (event handlers, processing pipelines, integrations)
- 16 Route53 hosted zones (DNS infrastructure for all properties)
- 14 DynamoDB tables (application state, analytics, caching)
- 4 Google Apps Script projects (workflow automation, data synchronization)
- 1 Lightsail instance (edge compute and failover)
Without a snapshot strategy, infrastructure changes become impossible to audit, roll back, or reason about. We needed a comprehensive snapshot to establish a known-good v1.0 baseline across all of these layers.
Technical Architecture: Four-Agent Parallel Snapshot Strategy
Rather than sequentially exporting each AWS service (which would take hours), we implemented a four-agent parallel architecture:
Agent 1: S3 Sync (all 45 buckets)
Agent 2: Lambda Export (code + config + environment variables)
Agent 3: AWS Service Configs (CloudFront, Route53, DynamoDB, ACM, API Gateway)
Agent 4: Local File Sync (GAS projects, tools, documentation, secrets manifests)
Each agent runs independently with specific responsibilities, allowing maximum throughput and fault isolation.
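A minimal orchestrator for this pattern might look like the sketch below. The agent script paths under ./agents/ and the function names are illustrative placeholders, not our actual agent implementations.

```shell
# Hypothetical orchestrator: launch all four agents as background jobs,
# each with its own log file for fault isolation, then wait for all of them.
run_agent() {
  local name="$1"; shift
  if "$@" > "logs/${name}.log" 2>&1; then
    echo "agent ${name}: ok"
  else
    echo "agent ${name}: FAILED"
  fi
}

launch_all() {
  mkdir -p logs
  run_agent s3-sync        ./agents/s3_sync.sh        &
  run_agent lambda-export  ./agents/lambda_export.sh  &
  run_agent service-config ./agents/service_config.sh &
  run_agent local-sync     ./agents/local_sync.sh     &
  wait   # returns once all four background agents have finished
}
# launch_all
```

Per-agent log files mean one agent's failure output never obscures another's, which is most of what "fault isolation" buys you here.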
Component-by-Component Snapshot Details
S3 Bucket Synchronization
We synced all 45 buckets to local snapshots following this pattern:
# Batch A: Sync high-priority production buckets
# (aws s3 sync is recursive by default; it does not take a --recursive flag)
aws s3 sync s3://queenofsandiego-prod ./snapshot/v1.0/s3/queenofsandiego-prod
aws s3 sync s3://sailjada-prod ./snapshot/v1.0/s3/sailjada-prod
aws s3 sync s3://salejada-prod ./snapshot/v1.0/s3/salejada-prod
# Batch B: Sync infrastructure buckets (staging, archives, configurations)
aws s3 sync s3://queenofsandiego-staging ./snapshot/v1.0/s3/queenofsandiego-staging
aws s3 sync s3://sailjada-staging ./snapshot/v1.0/s3/sailjada-staging
# Continue through all remaining 40 buckets in parallel
Result: 30/45 buckets synced in the initial pass (68 MB); remaining batches completed asynchronously. Batching this way avoids rate limiting and spreads I/O across multiple connections.
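To push the remaining buckets through without hand-writing 45 commands, a bounded-parallelism wrapper helps. This is a sketch under assumptions of ours: a buckets.txt file with one bucket name per line, and a default concurrency of 4; neither is part of the original runbook.

```shell
# Sketch: sync every bucket listed in a file, at most $jobs syncs at a time.
sync_all_buckets() {
  local list="$1" dest_root="$2" jobs="${3:-4}"
  # xargs -P keeps up to $jobs `aws s3 sync` processes running concurrently,
  # spreading I/O across connections without hammering the S3 API.
  xargs -P"$jobs" -I{} aws s3 sync "s3://{}" "${dest_root}/{}" < "$list"
}
# sync_all_buckets buckets.txt snapshot/v1.0/s3 4
```

Keeping the parallelism bounded (rather than backgrounding all 45 syncs at once) is what keeps you under S3 request-rate limits.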
CloudFront Distribution Export
All 66 CloudFront distributions were exported with their full configuration:
aws cloudfront list-distributions --output json > snapshot/v1.0/cloudfront/distributions.json
aws cloudfront get-distribution-config --id <DIST_ID> > snapshot/v1.0/cloudfront/<DIST_ID>.json
For each distribution, we captured:
- Origin configurations (S3 origins, custom origins, API Gateway endpoints)
- Behavior rules and cache policies
- SSL/TLS certificate bindings
- Origin access identities (OAI) for S3 private access
- Custom headers and security headers
- Geo-restriction policies
Status: 66/66 distributions completed.
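Rather than running get-distribution-config by hand for every <DIST_ID>, the two commands above can be wrapped in a loop. A sketch (function names are ours; it uses the CLI's JMESPath --query support to pull the ID list):

```shell
# Sketch: export the full config of one distribution to <out_dir>/<id>.json.
export_cf_config() {
  local dist_id="$1" out_dir="$2"
  aws cloudfront get-distribution-config --id "$dist_id" \
    > "${out_dir}/${dist_id}.json"
}

# Enumerate all distribution IDs, then export each one.
export_all_cf_configs() {
  local out_dir="$1"
  mkdir -p "$out_dir"
  aws cloudfront list-distributions \
      --query 'DistributionList.Items[].Id' --output text |
    tr '\t' '\n' |
    while read -r id; do export_cf_config "$id" "$out_dir"; done
}
# export_all_cf_configs snapshot/v1.0/cloudfront
```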
Route53 DNS Configuration
All 16 hosted zones were exported with complete record sets:
aws route53 list-hosted-zones --output json > snapshot/v1.0/route53/hosted-zones.json
# For each zone, capture all record sets:
aws route53 list-resource-record-sets --hosted-zone-id <ZONE_ID> \
--output json > snapshot/v1.0/route53/zone-<ZONE_ID>-records.json
Captured records include: A records (IPv4), AAAA records (IPv6), CNAME aliases, MX records (email routing), TXT records (DKIM/SPF), and Route53 alias records pointing to CloudFront distributions and S3 endpoints.
Status: 16/16 zones completed.
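The per-zone export loops the same way. One wrinkle worth noting: Route53 returns zone IDs as "/hostedzone/Z123...", so the prefix needs stripping before the ID is usable in a filename. A sketch (function names are ours):

```shell
# Sketch: dump the full record set of one hosted zone.
export_zone_records() {
  local zone_id="$1" out_dir="$2"
  aws route53 list-resource-record-sets --hosted-zone-id "$zone_id" \
    > "${out_dir}/zone-${zone_id}-records.json"
}

# Enumerate zones, strip the "/hostedzone/" prefix, export each record set.
export_all_zones() {
  local out_dir="$1"
  mkdir -p "$out_dir"
  aws route53 list-hosted-zones --query 'HostedZones[].Id' --output text |
    tr '\t' '\n' | sed 's|^/hostedzone/||' |
    while read -r zid; do export_zone_records "$zid" "$out_dir"; done
}
# export_all_zones snapshot/v1.0/route53
```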
Lambda Function Code and Configuration
All 21 Lambda functions were exported with code, configuration, and environment variables:
aws lambda list-functions --region us-east-1 --output json > snapshot/v1.0/lambda/functions-list.json
# For each function:
aws lambda get-function --function-name <FUNCTION_NAME> \
--output json > snapshot/v1.0/lambda/<FUNCTION_NAME>-config.json
# Download function code:
aws lambda get-function --function-name <FUNCTION_NAME> \
--query 'Code.Location' --output text | xargs curl -o snapshot/v1.0/lambda/<FUNCTION_NAME>.zip
Status: 10/21 exported initially; remaining in progress via Agent 2.
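The per-function steps above lend themselves to the same loop treatment. A sketch under our own assumptions (function names are ours; it makes two get-function calls per Lambda, one for the config JSON and one for the presigned Code.Location URL that curl downloads):

```shell
# Sketch: save one function's config, then fetch its deployment package.
export_lambda() {
  local fn="$1" out_dir="$2"
  aws lambda get-function --function-name "$fn" > "${out_dir}/${fn}-config.json"
  # Code.Location is a short-lived presigned S3 URL; download it immediately.
  aws lambda get-function --function-name "$fn" \
      --query 'Code.Location' --output text |
    xargs curl -sfL -o "${out_dir}/${fn}.zip"
}

# Enumerate every function in the region and export each one.
export_all_lambdas() {
  local region="$1" out_dir="$2"
  mkdir -p "$out_dir"
  aws lambda list-functions --region "$region" \
      --query 'Functions[].FunctionName' --output text |
    tr '\t' '\n' |
    while read -r fn; do export_lambda "$fn" "$out_dir"; done
}
# export_all_lambdas us-east-1 snapshot/v1.0/lambda
```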
Google Apps Script Projects
All 4 GAS projects were pulled using the clasp CLI:
cd snapshot/v1.0/gas/main-jada && clasp pull
cd snapshot/v1.0/gas/rady-shell-replacement && clasp pull
cd snapshot/v1.0/gas/rady-shell-old && clasp pull
cd snapshot/v1.0/gas/eyd && clasp pull
This captures:
- appsscript.json (project metadata and scopes)
- All .gs files (server-side scripts)
- HTML templates (.html files)
- Manifest versions and dependencies
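The four clasp pull invocations above can be collapsed into one loop. A sketch (the wrapper function is ours; the subshell around each pull keeps the working directory stable between projects):

```shell
# Sketch: run `clasp pull` in each project directory under a common root.
pull_all_gas() {
  local root="$1"; shift
  local proj
  for proj in "$@"; do
    # Subshell: the cd is scoped to this one pull and doesn't leak out.
    ( cd "${root}/${proj}" && clasp pull )
  done
}
# pull_all_gas snapshot/v1.0/gas main-jada rady-shell-replacement rady-shell-old eyd
```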
Lightsail Instance Snapshot
A Lightsail instance snapshot was initiated asynchronously (approximately 15 minutes to complete) to capture the jada-agent-v1.0-20260509 instance state, providing a full disk-level backup of any local configuration, caches, or edge data.
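The equivalent CLI steps look roughly like this sketch. The instance name jada-agent is our assumption (the text only names the snapshot); the 30-second poll interval is also ours.

```shell
# Sketch: kick off a Lightsail instance snapshot, then poll until it reaches
# the "available" state (the snapshot takes roughly 15 minutes to complete).
snapshot_lightsail() {
  local instance="$1" snap_name="$2"
  aws lightsail create-instance-snapshot \
    --instance-name "$instance" \
    --instance-snapshot-name "$snap_name"
  until aws lightsail get-instance-snapshot \
          --instance-snapshot-name "$snap_name" \
          --query 'instanceSnapshot.state' --output text | grep -qx available; do
    sleep 30
  done
}
# snapshot_lightsail jada-agent jada-agent-v1.0-20260509
```

Because create-instance-snapshot returns immediately, the poll loop is what makes "asynchronous" safe to script against.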
DynamoDB Tables
All 14 DynamoDB tables were scanned and exported:
aws dynamodb scan --table-name <TABLE_NAME> --output json > snapshot/v1.0/dynamodb/<TABLE_NAME>.json
Key tables: user sessions, event metadata, transaction logs, caching tables, and analytics indexes.
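A single scan request returns at most 1 MB of data per page; the AWS CLI auto-paginates and merges pages by default, but on large tables it is worth capping per-request reads so the export does not spike consumed capacity. A sketch (function name and --page-size value are our assumptions):

```shell
# Sketch: scan-export every table in the account to <out_dir>/<table>.json.
export_all_tables() {
  local out_dir="$1"
  mkdir -p "$out_dir"
  aws dynamodb list-tables --query 'TableNames[]' --output text |
    tr '\t' '\n' |
    while read -r t; do
      # --page-size limits items fetched per request; the CLI still merges
      # all pages into one JSON document on stdout.
      aws dynamodb scan --table-name "$t" --page-size 500 \
        > "${out_dir}/${t}.json"
    done
}
# export_all_tables snapshot/v1.0/dynamodb
```

Note that scan-based exports are point-in-time only in the loose sense; writes landing mid-scan may or may not appear.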
Local File and Documentation Snapshot
Beyond AWS resources,