Building a Comprehensive Infrastructure Snapshot: Lessons from a Multi-Service Disaster Recovery Exercise
When working with complex, distributed systems spanning multiple AWS services, Google Apps Script, and static hosting infrastructure, a single mistake can cascade across your entire deployment. This post details how we executed a complete v1.0 infrastructure snapshot across three production sites (queenofsandiego.com, sailjada.com, salejada.com) and their associated AWS ecosystem—and why this snapshot strategy should be part of your standard ops workflow.
The Problem: Why Snapshots Matter
Our JADA ecosystem involves:
- 45 S3 buckets (production, staging, backups, archives)
- 66 CloudFront distributions (serving various properties and subdomains)
- 21 Lambda functions (event handlers, processing pipelines, integrations)
- 16 Route53 hosted zones (DNS infrastructure for all properties)
- 14 DynamoDB tables (application state, analytics, caching)
- 4 Google Apps Script projects (workflow automation, data synchronization)
- 1 Lightsail instance (edge compute and failover)
Without a snapshot strategy, infrastructure changes become impossible to audit, roll back, or reason about. We needed a comprehensive snapshot to establish a known-good v1.0 baseline across all of these layers.
Technical Architecture: Four-Agent Parallel Snapshot Strategy
Rather than sequentially exporting each AWS service (which would take hours), we implemented a four-agent parallel architecture:
Agent 1: S3 Sync (all 45 buckets)
Agent 2: Lambda Export (code + config + environment variables)
Agent 3: AWS Service Configs (CloudFront, Route53, DynamoDB, ACM, API Gateway)
Agent 4: Local File Sync (GAS projects, tools, documentation, secrets manifests)
Each agent runs independently with specific responsibilities, allowing maximum throughput and fault isolation.
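A minimal orchestrator for this pattern might look like the sketch below. The agent script paths under ./agents/ and the function names are illustrative placeholders, not our actual agent implementations.

```shell
# Hypothetical orchestrator: launch all four agents as background jobs,
# each with its own log file for fault isolation, then wait for all of them.
run_agent() {
  local name="$1"; shift
  if "$@" > "logs/${name}.log" 2>&1; then
    echo "agent ${name}: ok"
  else
    echo "agent ${name}: FAILED"
  fi
}

launch_all() {
  mkdir -p logs
  run_agent s3-sync        ./agents/s3_sync.sh        &
  run_agent lambda-export  ./agents/lambda_export.sh  &
  run_agent service-config ./agents/service_config.sh &
  run_agent local-sync     ./agents/local_sync.sh     &
  wait   # returns once all four background agents have finished
}
# launch_all
```

Per-agent log files mean one agent's failure output never obscures another's, which is most of what "fault isolation" buys you here.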
Component-by-Component Snapshot Details
S3 Bucket Synchronization
We synced all 45 buckets to local snapshots following this pattern:
# Batch A: Sync high-priority production buckets
# (aws s3 sync is recursive by default; it does not take a --recursive flag)
aws s3 sync s3://queenofsandiego-prod ./snapshot/v1.0/s3/queenofsandiego-prod
aws s3 sync s3://sailjada-prod ./snapshot/v1.0/s3/sailjada-prod
aws s3 sync s3://salejada-prod ./snapshot/v1.0/s3/salejada-prod
# Batch B: Sync infrastructure buckets (staging, archives, configurations)
aws s3 sync s3://queenofsandiego-staging ./snapshot/v1.0/s3/queenofsandiego-staging
aws s3 sync s3://sailjada-staging ./snapshot/v1.0/s3/sailjada-staging
# Continue through all remaining 40 buckets in parallel
Result: 30/45 buckets synced in the initial pass (68 MB); remaining batches completed asynchronously. Batching this way avoids rate limiting and spreads I/O across multiple connections.
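To push the remaining buckets through without hand-writing 45 commands, a bounded-parallelism wrapper helps. This is a sketch under assumptions of ours: a buckets.txt file with one bucket name per line, and a default concurrency of 4; neither is part of the original runbook.

```shell
# Sketch: sync every bucket listed in a file, at most $jobs syncs at a time.
sync_all_buckets() {
  local list="$1" dest_root="$2" jobs="${3:-4}"
  # xargs -P keeps up to $jobs `aws s3 sync` processes running concurrently,
  # spreading I/O across connections without hammering the S3 API.
  xargs -P"$jobs" -I{} aws s3 sync "s3://{}" "${dest_root}/{}" < "$list"
}
# sync_all_buckets buckets.txt snapshot/v1.0/s3 4
```

Keeping the parallelism bounded (rather than backgrounding all 45 syncs at once) is what keeps you under S3 request-rate limits.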
CloudFront Distribution Export
All 66 CloudFront distributions were exported with their full configuration:
aws cloudfront list-distributions --output json > snapshot/v1.0/cloudfront/distributions.json
aws cloudfront get-distribution-config --id <DIST_ID> > snapshot/v1.0/cloudfront/<DIST_ID>.json
For each distribution, we captured:
- Origin configurations (S3 origins, custom origins, API Gateway endpoints)
- Behavior rules and cache policies
- SSL/TLS certificate bindings
- Origin access identities (OAI) for S3 private access
- Custom headers and security headers
- Geo-restriction policies
Status: 66/66 distributions completed.
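Rather than running get-distribution-config by hand for every <DIST_ID>, the two commands above can be wrapped in a loop. A sketch (function names are ours; it uses the CLI's JMESPath --query support to pull the ID list):

```shell
# Sketch: export the full config of one distribution to <out_dir>/<id>.json.
export_cf_config() {
  local dist_id="$1" out_dir="$2"
  aws cloudfront get-distribution-config --id "$dist_id" \
    > "${out_dir}/${dist_id}.json"
}

# Enumerate all distribution IDs, then export each one.
export_all_cf_configs() {
  local out_dir="$1"
  mkdir -p "$out_dir"
  aws cloudfront list-distributions \
      --query 'DistributionList.Items[].Id' --output text |
    tr '\t' '\n' |
    while read -r id; do export_cf_config "$id" "$out_dir"; done
}
# export_all_cf_configs snapshot/v1.0/cloudfront
```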
Route53 DNS Configuration
All 16 hosted zones were exported with complete record sets:
aws route53 list-hosted-zones --output json > snapshot/v1.0/route53/hosted-zones.json
# For each zone, capture all record sets:
aws route53 list-resource-record-sets --hosted-zone-id <ZONE_ID> \
--output json > snapshot/v1.0/route53/zone-<ZONE_ID>-records.json
Captured records include: A records (IPv4), AAAA records (IPv6), CNAME aliases, MX records (email routing), TXT records (DKIM/SPF), and Route53 alias records pointing to CloudFront distributions and S3 endpoints.
Status: 16/16 zones completed.
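The per-zone export loops the same way. One wrinkle worth noting: Route53 returns zone IDs as "/hostedzone/Z123...", so the prefix needs stripping before the ID is usable in a filename. A sketch (function names are ours):

```shell
# Sketch: dump the full record set of one hosted zone.
export_zone_records() {
  local zone_id="$1" out_dir="$2"
  aws route53 list-resource-record-sets --hosted-zone-id "$zone_id" \
    > "${out_dir}/zone-${zone_id}-records.json"
}

# Enumerate zones, strip the "/hostedzone/" prefix, export each record set.
export_all_zones() {
  local out_dir="$1"
  mkdir -p "$out_dir"
  aws route53 list-hosted-zones --query 'HostedZones[].Id' --output text |
    tr '\t' '\n' | sed 's|^/hostedzone/||' |
    while read -r zid; do export_zone_records "$zid" "$out_dir"; done
}
# export_all_zones snapshot/v1.0/route53
```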
Lambda Function Code and Configuration
All 21 Lambda functions were exported with code, configuration, and environment variables:
aws lambda list-functions --region us-east-1 --output json > snapshot/v1.0/lambda/functions-list.json
# For each function:
aws lambda get-function --function-name <FUNCTION_NAME> \
--output json > snapshot/v1.0/lambda/<FUNCTION_NAME>-config.json
# Download function code:
aws lambda get-function --function-name <FUNCTION_NAME> \
--query 'Code.Location' --output text | xargs curl -o snapshot/v1.0/lambda/<FUNCTION_NAME>.zip
Status: 10/21 exported initially; remaining in progress via Agent 2.
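The per-function steps above lend themselves to the same loop treatment. A sketch under our own assumptions (function names are ours; it makes two get-function calls per Lambda, one for the config JSON and one for the presigned Code.Location URL that curl downloads):

```shell
# Sketch: save one function's config, then fetch its deployment package.
export_lambda() {
  local fn="$1" out_dir="$2"
  aws lambda get-function --function-name "$fn" > "${out_dir}/${fn}-config.json"
  # Code.Location is a short-lived presigned S3 URL; download it immediately.
  aws lambda get-function --function-name "$fn" \
      --query 'Code.Location' --output text |
    xargs curl -sfL -o "${out_dir}/${fn}.zip"
}

# Enumerate every function in the region and export each one.
export_all_lambdas() {
  local region="$1" out_dir="$2"
  mkdir -p "$out_dir"
  aws lambda list-functions --region "$region" \
      --query 'Functions[].FunctionName' --output text |
    tr '\t' '\n' |
    while read -r fn; do export_lambda "$fn" "$out_dir"; done
}
# export_all_lambdas us-east-1 snapshot/v1.0/lambda
```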
Google Apps Script Projects
All 4 GAS projects were pulled using the clasp CLI:
cd snapshot/v1.0/gas/main-jada && clasp pull
cd snapshot/v1.0/gas/rady-shell-replacement && clasp pull
cd snapshot/v1.0/gas/rady-shell-old && clasp pull
cd snapshot/v1.0/gas/eyd && clasp pull
This captures:
- appsscript.json (project metadata and scopes)
- All .gs files (server-side scripts)
- HTML templates (.html files)
- Manifest versions and dependencies
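The four clasp pull invocations above can be collapsed into one loop. A sketch (the wrapper function is ours; the subshell around each pull keeps the working directory stable between projects):

```shell
# Sketch: run `clasp pull` in each project directory under a common root.
pull_all_gas() {
  local root="$1"; shift
  local proj
  for proj in "$@"; do
    # Subshell: the cd is scoped to this one pull and doesn't leak out.
    ( cd "${root}/${proj}" && clasp pull )
  done
}
# pull_all_gas snapshot/v1.0/gas main-jada rady-shell-replacement rady-shell-old eyd
```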
Lightsail Instance Snapshot
A Lightsail instance snapshot was initiated asynchronously (approximately 15 minutes to complete) to capture the jada-agent-v1.0-20260509 instance state, providing a full disk-level backup of any local configuration, caches, or edge data.
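The equivalent CLI steps look roughly like this sketch. The instance name jada-agent is our assumption (the text only names the snapshot); the 30-second poll interval is also ours.

```shell
# Sketch: kick off a Lightsail instance snapshot, then poll until it reaches
# the "available" state (the snapshot takes roughly 15 minutes to complete).
snapshot_lightsail() {
  local instance="$1" snap_name="$2"
  aws lightsail create-instance-snapshot \
    --instance-name "$instance" \
    --instance-snapshot-name "$snap_name"
  until aws lightsail get-instance-snapshot \
          --instance-snapshot-name "$snap_name" \
          --query 'instanceSnapshot.state' --output text | grep -qx available; do
    sleep 30
  done
}
# snapshot_lightsail jada-agent jada-agent-v1.0-20260509
```

Because create-instance-snapshot returns immediately, the poll loop is what makes "asynchronous" safe to script against.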
DynamoDB Tables
All 14 DynamoDB tables were scanned and exported:
aws dynamodb scan --table-name <TABLE_NAME> --output json > snapshot/v1.0/dynamodb/<TABLE_NAME>.json
Key tables: user sessions, event metadata, transaction logs, caching tables, and analytics indexes.
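A single scan request returns at most 1 MB of data per page; the AWS CLI auto-paginates and merges pages by default, but on large tables it is worth capping per-request reads so the export does not spike consumed capacity. A sketch (function name and --page-size value are our assumptions):

```shell
# Sketch: scan-export every table in the account to <out_dir>/<table>.json.
export_all_tables() {
  local out_dir="$1"
  mkdir -p "$out_dir"
  aws dynamodb list-tables --query 'TableNames[]' --output text |
    tr '\t' '\n' |
    while read -r t; do
      # --page-size limits items fetched per request; the CLI still merges
      # all pages into one JSON document on stdout.
      aws dynamodb scan --table-name "$t" --page-size 500 \
        > "${out_dir}/${t}.json"
    done
}
# export_all_tables snapshot/v1.0/dynamodb
```

Note that scan-based exports are point-in-time only in the loose sense; writes landing mid-scan may or may not appear.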
Local File and Documentation Snapshot
Beyond AWS resources,