Building a Production Snapshot System for Multi-Site JADA Infrastructure: v1.0 Strategy

After discovering that critical work on event pages had been reverted, we needed a comprehensive snapshot and recovery strategy. This post documents how we built v1.0—a complete point-in-time capture of the JADA ecosystem across three production sites, multiple cloud services, and local development artifacts.

What We Needed to Capture

The JADA infrastructure spans three primary domains:

  • queenofsandiego.com — main site with events, products, and dynamic content
  • sailjada.com — e-commerce presence
  • salejada.com — secondary sales channel

Each site depends on interconnected AWS resources: 45 S3 buckets, 66 CloudFront distributions, 21 Lambda functions, 16 Route53 hosted zones, DynamoDB tables for content, SES for email, and Google Apps Script projects managing backend logic. On the local development side, the snapshot also needed to cover handoff documents, configuration files, memory files, and CLI tool snapshots.

The challenge: capturing all of these atomically to create a recovery point that could restore the entire system if something broke again.
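For orientation, the resource counts above can be re-checked against the live account with a handful of list calls. A minimal sketch, assuming AWS CLI v2 credentials are configured for the target account (the --query expressions are illustrative):

# Count the major resource types covered by v1.0; the numbers should match the inventory above.
aws s3api list-buckets            --query 'length(Buckets)'
aws lambda list-functions         --query 'length(Functions)'
aws cloudfront list-distributions --query 'DistributionList.Quantity'
aws route53 list-hosted-zones     --query 'length(HostedZones)'
aws dynamodb list-tables          --query 'length(TableNames)'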

The Snapshot Architecture

We used a parallel, distributed approach with four background agents running simultaneously:

Agent 1: S3 Bucket Synchronization

Synced all 45 S3 buckets to local storage using AWS CLI batch operations. This included:

  • Production bucket content for all three sites
  • Staging bucket snapshots for comparison
  • Build artifacts and deployment caches
  • Lambda function code repositories
  • Static assets (images, CSS, JavaScript)

We discovered dedicated staging buckets and verified that production-to-staging syncs had completed before the snapshot, so staging matched production file counts exactly. This check was critical: we had to confirm the sync was current in both directions before proceeding.
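A minimal sketch of the batch sync and the file-count check, assuming the bucket names have been saved one per line to buckets.txt; the <production-bucket> and <staging-bucket> names are placeholders:

# Sync every bucket into the snapshot tree (paths mirror the v1.0 layout described later).
while read -r bucket; do
  aws s3 sync "s3://${bucket}" "v1.0/s3-buckets/${bucket}"
done < buckets.txt

# Spot-check that a staging bucket matches its production counterpart's object count.
aws s3 ls "s3://<production-bucket>/" --recursive | wc -l
aws s3 ls "s3://<staging-bucket>/" --recursive | wc -l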

Agent 2: Lambda Function Export

Extracted all 21 Lambda functions, including:

  • Function code (as ZIP archives)
  • Runtime configurations and memory allocations
  • Environment variable structures (without values, for security)
  • IAM role attachments and permissions
  • VPC configurations and security group associations
  • Trigger configurations (API Gateway, S3, EventBridge)

We captured environment variable names and structures but deliberately excluded values—this allows reconstruction without exposing secrets. Each Lambda was exported with its associated CloudWatch Logs group names and retention policies.
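A minimal sketch of the per-function export, assuming jq is available; the redaction step keeps environment variable names while blanking their values:

# Export every Lambda: deployment package plus configuration with env var values stripped.
for fn in $(aws lambda list-functions --query 'Functions[].FunctionName' --output text); do
  # Code.Location is a short-lived presigned URL for the deployment ZIP.
  url=$(aws lambda get-function --function-name "$fn" --query 'Code.Location' --output text)
  curl -sL "$url" -o "v1.0/lambda-functions/${fn}.zip"

  # Keep the env var names but replace every value, so the config is safe to store.
  aws lambda get-function-configuration --function-name "$fn" \
    | jq 'if .Environment.Variables then .Environment.Variables |= with_entries(.value = "<redacted>") else . end' \
    > "v1.0/lambda-functions/${fn}-config.json"
done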

Agent 3: AWS Service Configurations

Pulled complete configuration exports for:

  • CloudFront: All 66 distributions with origin configurations, cache behaviors, SSL certificates, and custom domain mappings
  • Route53: All 16 hosted zones with complete DNS record sets (A, CNAME, MX, TXT, NS records)
  • DynamoDB: Table schemas, global secondary indexes, and stream configurations (14 tables identified)
  • ACM: SSL/TLS certificate inventory with domain names and renewal status
  • API Gateway: REST API definitions, stages, models, and authorization configurations
  • SES: Verified email identities and sending limits
  • IAM: Role definitions, trust relationships, and policy documents

CloudFront was particularly important—we documented the origin bucket mappings for each distribution, which informed our staging workflow design (e.g., verifying that staging CloudFront origins pointed to the correct staging S3 buckets).
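A minimal sketch of the two largest exports, CloudFront and Route53, with output paths mirroring the v1.0 layout:

# Dump each distribution's full configuration, including its origin-to-bucket mapping.
for id in $(aws cloudfront list-distributions --query 'DistributionList.Items[].Id' --output text); do
  aws cloudfront get-distribution-config --id "$id" > "v1.0/cloudfront-configs/${id}.json"
done

# Dump the complete record set for every hosted zone.
for zone in $(aws route53 list-hosted-zones --query 'HostedZones[].Id' --output text); do
  aws route53 list-resource-record-sets --hosted-zone-id "$zone" \
    > "v1.0/route53-zones/$(basename "$zone").json"
done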

Agent 4: Google Apps Script & Local Artifacts

Used clasp pull to export Google Apps Script projects:

  • Main JADA GAS project (core backend logic)
  • Rady Shell replacement GAS
  • Rady Shell old GAS (legacy, preserved for reference)
  • EYD GAS project

Each GAS project was pulled, copied to the snapshot directory structure, and verified (a sketch of the pull loop follows the list below). We also captured:

  • Local site repositories (/Users/cb/Documents/repos/)
  • Memory and feedback documents tracking decisions
  • CLI tools and utility scripts
  • LaunchAgent configurations for background automation
  • Development notes and architecture diagrams
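A minimal sketch of the GAS pull loop, assuming clasp is already authenticated via clasp login and that each working directory (the gas/ paths are placeholders) contains a .clasp.json pointing at its script ID:

# Pull each Apps Script project from Google, then copy it into the snapshot tree.
for project in main-jada rady-replacement rady-old eyd; do
  (cd "gas/${project}" && clasp pull)
  cp -R "gas/${project}" "v1.0/gas-projects/${project}"
done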

Technical Decisions & Rationale

Why Parallel Agents Over Sequential Export

AWS API rate limits and network I/O made sequential export impractical. Running four agents in parallel reduced total snapshot time from ~2 hours (sequential) to ~45 minutes. Each agent operated independently with its own IAM permissions and output directory.
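In practice this amounts to launching four independent scripts and waiting on all of them; a minimal sketch, with hypothetical script names:

# Run the four snapshot agents concurrently, each logging to its own file.
mkdir -p logs
./agent1-s3-sync.sh        > logs/agent1.log 2>&1 &
./agent2-lambda-export.sh  > logs/agent2.log 2>&1 &
./agent3-aws-configs.sh    > logs/agent3.log 2>&1 &
./agent4-gas-local.sh      > logs/agent4.log 2>&1 &
wait  # block until every agent has finished before writing the manifest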

Why Snapshot Staging Alongside Production

We discovered a critical workflow: QOS had a dedicated staging CloudFront distribution pointing to a _staging subfolder in the production S3 bucket. By capturing both production and staging, we could:

  • Verify staging matched production file counts (quality assurance)
  • Recover either production or staging state independently
  • Track divergence between staging and production for debugging

This revealed a staging workflow pattern: edits go to staging first, are reviewed, then promoted to production via an S3 sync followed by a CloudFront invalidation.
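The promotion step itself is small; a minimal sketch, with the bucket name and distribution ID as placeholders:

# Promote reviewed staging content to production, then invalidate the CDN cache.
aws s3 sync "s3://<qos-production-bucket>/_staging/" "s3://<qos-production-bucket>/"
aws cloudfront create-invalidation --distribution-id <PROD_DISTRIBUTION_ID> --paths "/*"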

Why Include GAS Projects

Google Apps Script contains server-side logic that isn't version-controlled in Git. By pulling all GAS projects directly from Google Drive via clasp, we preserved:

  • Backend form handlers
  • Email automation logic
  • Database query functions
  • Integration code with third-party APIs

GAS is often overlooked in disaster recovery—including it here ensures we can restore business logic, not just static content.

Snapshot Manifest & Structure

The v1.0 snapshot was organized as:

v1.0/
├── s3-buckets/           # All 45 S3 bucket contents
├── lambda-functions/     # 21 Lambda function ZIPs + configs
├── cloudfront-configs/   # 66 distribution configurations (JSON)
├── route53-zones/        # 16 hosted zones (JSON)
├── dynamodb-schemas/     # 14 table definitions
├── gas-projects/         # All GAS code pulls
│   ├── main-jada/
│   ├── rady-replacement/
│   ├── rady-old/
│   └── eyd/
├── local-repos/          # Development repositories
├── env-structures/       # Environment variable names (no values)
├── iam-roles/            # Role definitions
├── acm-certificates/     # Certificate inventory
└── MANIFEST.md           # Complete index with file counts

The MANIFEST.md documented every resource: bucket names, CloudFront distribution IDs, Lambda ARNs, Route53 zone IDs, and file counts for verification.
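The per-directory file counts in the manifest can be regenerated (and later re-verified against a restore) with a short loop; a minimal sketch:

# Append a file-count line per snapshot subdirectory to the manifest for verification.
for dir in v1.0/*/; do
  printf -- "- %s: %s files\n" "${dir%/}" "$(find "$dir" -type f | wc -l | tr -d ' ')"
done >> v1.0/MANIFEST.md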

Why This Matters