Building v1.0 Snapshot Infrastructure: Full-Stack Recovery and State Preservation for Multi-Tenant JADA Ecosystem
After a significant rollback incident that cost substantial token resources, we implemented a comprehensive v1.0 snapshot strategy covering all JADA infrastructure across three production domains (queenofsandiego.com, sailjada.com, salejada.com). This post details the technical architecture, decision rationale, and execution strategy for capturing a complete infrastructure state.
The Problem: Why Snapshots Matter
The incident revealed a critical gap: no reliable recovery mechanism existed for multi-layered infrastructure spanning Google Apps Script projects, AWS S3 buckets, CloudFront distributions, Lambda functions, database configurations, and domain routing. Traditional backup strategies were fragmented. We needed an atomic snapshot capturing:
- 45 S3 buckets (static assets, backups, media)
- 66 CloudFront distributions (CDN caching configurations)
- 21 Lambda functions (backend compute, event handlers)
- 16 Route53 hosted zones (DNS routing logic)
- 14 DynamoDB tables (application state)
- 4 Google Apps Script projects (automation workflows)
- 2 Lightsail instances (compute infrastructure)
- Local development files, handoff documentation, and configuration versioning
Architecture: Parallel Multi-Agent Snapshot Strategy
Rather than sequential extraction (which would have taken hours), we deployed four concurrent agents, each responsible for one layer of the infrastructure:
Agent 1: S3 Sync
└─ aws s3 sync s3://[bucket-name] ./v1.0-snapshot/s3/[bucket-name]
across all 45 JADA-related buckets
Agent 2: Lambda Export
└─ aws lambda get-function --function-name [name]
Captures code, environment variables, IAM roles, runtime config
Agent 3: AWS Infrastructure Config
└─ CloudFront: aws cloudfront list-distributions
└─ Route53: aws route53 list-hosted-zones + list-resource-record-sets
└─ DynamoDB: aws dynamodb describe-table
└─ ACM certificates, API Gateway specs, SES configurations
Agent 4: Local Assets & GAS Projects
└─ Lightsail snapshot creation (AWS-managed)
└─ Google Apps Script: clasp pull [projectId]
Four GAS projects (main JADA, Rady Shell replacement, Rady old, EYD)
└─ Local files: /sites/, /tools/, /handoffs/, /notes/, /wiki/
This parallel approach reduced total snapshot time from an estimated 3+ hours to ~45 minutes, with AWS Lightsail snapshot completing asynchronously.
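A minimal sketch of the parallel launch, assuming each agent is wrapped in its own shell script; the wrapper script names and the Lightsail instance name are illustrative placeholders, not the real ones:
# Agents 1-3 run locally against the AWS CLI (hypothetical wrapper scripts).
./agent1-s3-sync.sh       &
./agent2-lambda-export.sh &
./agent3-aws-config.sh    &
# Agent 4: the Lightsail snapshot is created server-side and completes asynchronously.
aws lightsail create-instance-snapshot \
  --instance-name jada-lightsail-1 \
  --instance-snapshot-name jada-lightsail-1-v1.0
wait   # block until the three local agents finish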
Technical Implementation Details
S3 Bucket Enumeration and Sync
First, we identified all JADA-related buckets using tag filtering:
aws s3api list-buckets --query 'Buckets[].Name' --output text | \
  tr '\t' '\n' | \
  grep -E '(jada|queen|sail|sale)' | \
  while read -r bucket; do
    aws s3 sync "s3://${bucket}" \
      "./v1.0-snapshot/s3/${bucket}" \
      --delete \
      --dryrun   # verify before executing the real sync
  done
Why parallel batching? S3 sync operations are I/O-bound. Splitting the 45 buckets into two batches (A: 23 buckets, B: 22 buckets) allowed concurrent transfers without API throttling. Final count verified: 45/45 buckets synced, totaling ~68 GB of initial state.
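A sketch of the two-batch split, assuming the filtered bucket names have already been written to two files (the file names are illustrative):
# Each batch syncs its buckets serially; the two batches run concurrently.
while read -r bucket; do
  aws s3 sync "s3://${bucket}" "./v1.0-snapshot/s3/${bucket}"
done < batch-a-buckets.txt &
while read -r bucket; do
  aws s3 sync "s3://${bucket}" "./v1.0-snapshot/s3/${bucket}"
done < batch-b-buckets.txt &
wait   # both batches finish before the manifest is written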
Lambda Function Export with Environment Preservation
Critical decision: export not just code, but configuration state. Each Lambda function required:
# Code.Location is a pre-signed URL to the packaged function code
aws lambda get-function \
--function-name jada-event-processor-v2 \
--query 'Code.Location' --output text \
| xargs curl -o ./v1.0-snapshot/lambda/jada-event-processor-v2.zip
aws lambda get-function-configuration \
--function-name jada-event-processor-v2 \
> ./v1.0-snapshot/lambda/jada-event-processor-v2-config.json
aws lambda get-policy \
--function-name jada-event-processor-v2 \
> ./v1.0-snapshot/lambda/jada-event-processor-v2-policy.json
Configuration JSON captures runtime, memory allocation, timeout, VPC settings, and execution role ARN. This enables rapid re-deployment if needed. Environment variables were exported separately (without values, only key names) for security.
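A sketch of the key-only environment export, assuming jq is available on the workstation (the function name matches the example above):
# Record environment variable names only; values never land in the snapshot.
aws lambda get-function-configuration \
  --function-name jada-event-processor-v2 \
  --query 'Environment.Variables' \
  | jq 'keys' \
  > ./v1.0-snapshot/lambda/jada-event-processor-v2-env-keys.json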
CloudFront and Route53 Infrastructure as Code
All 66 CloudFront distributions exported as JSON:
aws cloudfront list-distributions \
--query 'DistributionList.Items[]' \
> ./v1.0-snapshot/cloudfront/distributions-manifest.json
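The manifest holds summary entries only; full per-distribution detail (the distribution-details/ directory in the snapshot tree) can be pulled one distribution at a time, sketched below:
mkdir -p ./v1.0-snapshot/cloudfront/distribution-details
aws cloudfront list-distributions \
  --query 'DistributionList.Items[].Id' --output text \
  | tr '\t' '\n' \
  | while read -r dist_id; do
      aws cloudfront get-distribution-config --id "${dist_id}" \
        > "./v1.0-snapshot/cloudfront/distribution-details/${dist_id}.json"
    done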
Key captured fields: origin configurations, cache behaviors, SSL/TLS settings, origin access identities (OAI), and WAF associations. Route53 zones similarly exported with all record sets:
aws route53 list-hosted-zones \
  --query 'HostedZones[].Id' --output text \
  | tr '\t' '\n' \
  | while read -r zone_id; do
      zone_name="${zone_id##*/}"   # strip the /hostedzone/ prefix for the file name
      aws route53 list-resource-record-sets \
        --hosted-zone-id "${zone_id}" \
        > "./v1.0-snapshot/route53/${zone_name}-records.json"
    done
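Agent 3 covered DynamoDB with the same pattern; a sketch that writes each table definition into the snapshot's dynamodb/ directory (filtering to the 14 JADA tables follows the same grep approach as the bucket enumeration):
mkdir -p ./v1.0-snapshot/dynamodb
aws dynamodb list-tables --query 'TableNames[]' --output text \
  | tr '\t' '\n' \
  | while read -r table; do
      aws dynamodb describe-table --table-name "${table}" \
        > "./v1.0-snapshot/dynamodb/${table}.json"
    done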
Google Apps Script Project Preservation
GAS projects required tooling we hadn't prepared in advance. Four projects were backed up:
- Main JADA GAS: event handlers, automation workflows
- Rady Shell Replacement GAS: newer automation iteration
- Rady Shell Old GAS: legacy version (kept for reference)
- EYD GAS Project: separate event automation stream
Using the clasp CLI, we pulled each project into a working directory, then copied it into the snapshot:
mkdir -p ./gas-work/[projectId] && cd ./gas-work/[projectId]
clasp clone [projectId]
# Pulls all .gs files, .html files, and the appsscript.json manifest
cd - && cp -r ./gas-work/[projectId]/. \
  ./v1.0-snapshot/gas/[projectId]/
Why separate GAS from code repos? GAS projects live in Google Cloud (not version-controlled in GitHub). Including them in infrastructure snapshots ensures we can restore event automation without manual re-creation.
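Restore is the mirror image; a minimal sketch, assuming the snapshot copy kept its .clasp.json (clasp needs it to know which script project to target):
# Push the snapshotted source back to the Apps Script project.
cd ./v1.0-snapshot/gas/[projectId] && clasp push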
Snapshot Directory Structure
v1.0-snapshot/
├── MANIFEST.md # Index and metadata
├── s3/ # All 45 buckets
│ ├── jada-prod-assets/
│ ├── queen-of-san-diego-com/
│ ├── sail-jada-com/
│ └── [42 more...]
├── lambda/ # 21 functions
│ ├── jada-event-processor-v2.zip
│ ├── jada-event-processor-v2-config.json
│ └── [20 more...]
├── cloudfront/
│ ├── distributions-manifest.json
│ └── distribution-details/
├── route53/
│ ├── hosted-zones-manifest.json
│ └── [16 zone-specific record files]
├── dynamodb/
│