Consolidating Per-Site S3 Buckets: The adamcherrycomics Migration to Shared Infrastructure
What Was Done
During a recent infrastructure audit of adamcherrycomics.dangerouscentaur.com, we discovered and safely decommissioned a legacy per-site S3 bucket that had become redundant after the site's migration to shared infrastructure. This post details the consolidation pattern, verification methodology, and cleanup process.
The Problem: Dual-Bucket Pattern
The adamcherrycomics site initially followed a per-site bucket architecture:
- s3://adamcherrycomics.dangerouscentaur.com/ (dedicated site bucket, legacy)
- CloudFront distribution E2Q4UU71SRNTMB: origin pointed to the per-site bucket
- DNS: adamcherrycomics.dangerouscentaur.com CNAME → CloudFront distribution
As the Dangerous Centaur infrastructure grew, this pattern became difficult to manage. Each new site required a new bucket, new CloudFront distribution, and new monitoring.
The Solution: Shared Bucket + Router Function
The infrastructure was refactored to use a single shared bucket with a CloudFront Function for request routing:
- s3://dc-sites/ (shared bucket for all Dangerous Centaur sites)
- Bucket structure: dc-sites/adamcherrycomics.dangerouscentaur.com/index.html, dc-sites/adamcherrycomics.dangerouscentaur.com/about.html, etc.
- CloudFront distribution E2Q4UU71SRNTMB: single origin now points to dc-sites.s3.us-east-1.amazonaws.com
- CloudFront Function dc-sites-router: rewrites incoming requests based on the Host header
The router function intercepts requests and rewrites the S3 key path:
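// Sketch of the CloudFront Function logic (viewer request); a trailing "/" maps to index.html
function handler(event) {
  var request = event.request;
  var host = request.headers.host.value; // header values are wrapped as { value: "..." }
  if (host === "adamcherrycomics.dangerouscentaur.com") {
    request.uri = "/" + host + request.uri.replace(/\/$/, "/index.html");
  }
  return request;
}
// Result: GET / → GET /adamcherrycomics.dangerouscentaur.com/index.html from dc-sites bucket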
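The single-host rule above could be generalized so that one function covers every site folder in the shared bucket. The following is a minimal sketch, not the deployed dc-sites-router code: the `KNOWN_SITES` allow-list, the second host name, and the 404 response for unknown hosts are assumptions made for illustration.

```javascript
// Hypothetical allow-list of hosts that have a folder in the shared bucket.
// Only the first entry comes from this post; the second is a placeholder.
var KNOWN_SITES = [
  "adamcherrycomics.dangerouscentaur.com",
  "example-site.dangerouscentaur.com",
];

// CloudFront Functions call handler(event) on each viewer request and expect
// either the (possibly modified) request or a response object in return.
function handler(event) {
  var request = event.request;
  var host = request.headers.host ? request.headers.host.value.toLowerCase() : "";

  if (KNOWN_SITES.indexOf(host) === -1) {
    // Assumption: unknown hosts get a 404 instead of reaching the bucket.
    return { statusCode: 404, statusDescription: "Not Found" };
  }

  var uri = request.uri;
  if (uri.charAt(uri.length - 1) === "/") {
    uri += "index.html"; // directory-style request
  }
  request.uri = "/" + host + uri;
  return request;
}

// Local check with a hand-built event; no AWS account needed.
var sample = handler({
  request: { uri: "/", headers: { host: { value: "adamcherrycomics.dangerouscentaur.com" } } },
});
console.log(sample.uri); // "/adamcherrycomics.dangerouscentaur.com/index.html"
```

Because the key prefix is always built from the Host header, adding a site under this sketch means adding one allow-list entry and one folder in the bucket.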
Verification: Why the Old Bucket Was Safe to Delete
Before deleting s3://adamcherrycomics.dangerouscentaur.com/, we performed a multi-step verification:
1. Confirm the Bucket Is Empty
The S3 console and API responses both reported Total Objects: 0. We also confirmed there were no noncurrent object versions, no delete markers, and no lifecycle rules still managing data.
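A plain object listing does not show noncurrent versions or delete markers, so the emptiness check is worth running against a version listing as well. The sketch below assumes the JSON printed by `aws s3api list-object-versions` has been parsed into an object. `summarizeVersionListing` is a hypothetical helper name, and the version IDs in the sample are made up.

```javascript
// Input: parsed JSON from
//   aws s3api list-object-versions --bucket <bucket-name>
// "Versions" and "DeleteMarkers" may be absent when there is nothing to list.
function summarizeVersionListing(listing) {
  var data = listing || {};
  var versions = data.Versions || [];
  var deleteMarkers = data.DeleteMarkers || [];
  return {
    versions: versions.length,
    deleteMarkers: deleteMarkers.length,
    empty: versions.length === 0 && deleteMarkers.length === 0,
  };
}

// A bucket with no history at all:
console.log(summarizeVersionListing({}).empty); // true

// A bucket that looks empty in a plain listing but still holds history:
console.log(summarizeVersionListing({
  Versions: [{ Key: "index.html", VersionId: "made-up-version-id", IsLatest: false }],
  DeleteMarkers: [{ Key: "index.html", VersionId: "made-up-marker-id", IsLatest: true }],
}).empty); // false
```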
2. Verify It's Not on the Serving Path
Inspected CloudFront distribution E2Q4UU71SRNTMB:
- Single origin: dc-sites.s3.us-east-1.amazonaws.com (not the per-site bucket)
- No origin fallback configuration pointing to the old bucket
- No cache behaviors using alternative origins
The old bucket was completely bypassed by the serving path.
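The same inspection can be scripted against the distribution's configuration. This is a sketch that assumes the `DistributionConfig` object returned by `aws cloudfront get-distribution-config` has been parsed from JSON. The helper names and the origin ID `dc-sites-origin` are illustrative and do not come from the real distribution.

```javascript
// Input: the "DistributionConfig" object from
//   aws cloudfront get-distribution-config --id <distribution-id>

// IDs of origins whose domain name belongs to the given bucket.
function originsForBucket(config, bucketName) {
  var origins = (config.Origins && config.Origins.Items) || [];
  return origins
    .filter(function (o) { return o.DomainName.indexOf(bucketName + ".s3") === 0; })
    .map(function (o) { return o.Id; });
}

// IDs that cache behaviors target, plus the member origins of any origin group.
function routedOriginIds(config) {
  var ids = [config.DefaultCacheBehavior.TargetOriginId];
  var behaviors = (config.CacheBehaviors && config.CacheBehaviors.Items) || [];
  behaviors.forEach(function (b) { ids.push(b.TargetOriginId); });
  var groups = (config.OriginGroups && config.OriginGroups.Items) || [];
  groups.forEach(function (g) {
    g.Members.Items.forEach(function (m) { ids.push(m.OriginId); });
  });
  return ids;
}

// Trimmed, hand-written config; "dc-sites-origin" is a made-up origin ID.
var sampleConfig = {
  Origins: {
    Quantity: 1,
    Items: [{ Id: "dc-sites-origin", DomainName: "dc-sites.s3.us-east-1.amazonaws.com" }],
  },
  DefaultCacheBehavior: { TargetOriginId: "dc-sites-origin" },
  CacheBehaviors: { Quantity: 0 },
  OriginGroups: { Quantity: 0 },
};

console.log(originsForBucket(sampleConfig, "adamcherrycomics.dangerouscentaur.com")); // []
console.log(routedOriginIds(sampleConfig)); // [ 'dc-sites-origin' ]
```

An empty result from `originsForBucket` for the old bucket name, combined with every routed ID resolving to the shared origin, corresponds to the three observations listed above.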
3. Check DNS Dependencies
Route53 (or external DNS provider) shows:
- adamcherrycomics.dangerouscentaur.com → CNAME to dclu4nl5nln98.cloudfront.net (the CF distribution)
- No CNAME to S3 website endpoint (e.g., adamcherrycomics.dangerouscentaur.com.s3-website-us-east-1.amazonaws.com)
- No A records with S3 alias targets
The old bucket name was never referenced in DNS.
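If the zone is hosted in Route 53, the DNS check can be expressed as a filter over the record sets. The sketch below assumes the `ResourceRecordSets` array from `aws route53 list-resource-record-sets`. The `S3_ENDPOINT` pattern and the helper name are illustrative, and an external DNS provider would need its own export format.

```javascript
// Input: the "ResourceRecordSets" array from
//   aws route53 list-resource-record-sets --hosted-zone-id <zone-id>

// Matches S3 REST and website endpoints such as
//   bucket.s3.us-east-1.amazonaws.com or s3-website-us-east-1.amazonaws.com.
var S3_ENDPOINT = /(^|\.)s3[.-].*amazonaws\.com\.?$/i;

// Returns the record sets whose values or alias target point at S3.
function recordsPointingAtS3(recordSets) {
  return recordSets.filter(function (rs) {
    var targets = (rs.ResourceRecords || []).map(function (r) { return r.Value; });
    if (rs.AliasTarget) targets.push(rs.AliasTarget.DNSName);
    return targets.some(function (t) { return S3_ENDPOINT.test(t); });
  });
}

// Hand-written sample mirroring the record described in this post.
var sampleRecords = [
  {
    Name: "adamcherrycomics.dangerouscentaur.com.",
    Type: "CNAME",
    ResourceRecords: [{ Value: "dclu4nl5nln98.cloudfront.net" }],
  },
];

console.log(recordsPointingAtS3(sampleRecords).length); // 0
```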
4. Confirm No IAM or Lambda Dependencies
Scanned Lambda function policies and CloudFront Function code — no references to the old bucket ARN or name. All references were to dc-sites.
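A scan like this can be approximated by checking each policy document for the bucket's ARN. The sketch below is an assumption about how such a scan might look, not the tooling used in the audit. It only catches literal ARNs, so a wildcard resource such as `arn:aws:s3:::*` would still need a manual look.

```javascript
// Returns the statements in an IAM-style policy document whose Resource
// names the bucket itself or any key inside it.
function statementsReferencingBucket(policy, bucketName) {
  var arn = "arn:aws:s3:::" + bucketName;
  // Statement and Resource may each be a single value or an array.
  var statements = [].concat(policy.Statement || []);
  return statements.filter(function (st) {
    var resources = [].concat(st.Resource || []);
    return resources.some(function (r) {
      return r === arn || r.indexOf(arn + "/") === 0;
    });
  });
}

// Hand-written sample policy that only touches the shared bucket.
var samplePolicy = {
  Version: "2012-10-17",
  Statement: [
    { Effect: "Allow", Action: "s3:GetObject", Resource: "arn:aws:s3:::dc-sites/*" },
  ],
};

console.log(statementsReferencingBucket(samplePolicy, "adamcherrycomics.dangerouscentaur.com").length); // 0
console.log(statementsReferencingBucket(samplePolicy, "dc-sites").length); // 1
```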
Infrastructure Changes
Deletion Command
Once verified, the bucket was deleted:
aws s3 rb s3://adamcherrycomics.dangerouscentaur.com/
The --force flag was not needed: the bucket was already empty, and neither Object Lock nor leftover object versions blocked the deletion.
Post-Deletion Testing
Immediate smoke tests to confirm no impact:
- HTTP GET to adamcherrycomics.dangerouscentaur.com/ → 200, correct HTML served
- HTTP GET to adamcherrycomics.dangerouscentaur.com/about.html → 200, correct page
- CloudFront cache metrics: no spike in 5xx errors
- CloudFront Function logs: no unexpected rewrites or errors
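The two HTTP checks above are easy to script. This is a minimal sketch assuming Node.js 18 or later (for the global `fetch`) and assuming the site is served over HTTPS. It checks status codes only, so confirming that the correct HTML came back remains a manual step.

```javascript
// fetchImpl defaults to the global fetch in Node.js 18+; it is a parameter so
// the check can be exercised with a stub instead of real network calls.
async function smokeTest(urls, fetchImpl) {
  var doFetch = fetchImpl || fetch;
  var failures = [];
  for (var i = 0; i < urls.length; i++) {
    var res = await doFetch(urls[i]);
    if (res.status !== 200) {
      failures.push({ url: urls[i], status: res.status });
    }
  }
  return failures;
}

// Assumption: the site is served over HTTPS.
var SMOKE_URLS = [
  "https://adamcherrycomics.dangerouscentaur.com/",
  "https://adamcherrycomics.dangerouscentaur.com/about.html",
];

// Real run (needs network access):
//   smokeTest(SMOKE_URLS).then(function (failures) { console.log(failures); });
```

An empty `failures` array means every URL returned 200.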
Key Architectural Decisions
Why Shared Bucket + Router Function?
- Operational simplicity: One bucket to monitor, one set of lifecycle policies, one backup strategy instead of N
- Cost reduction: S3 request pricing is per-operation regardless of bucket count, so the savings come from lower per-site overhead once sites share one bucket and one distribution
- Consistency: All sites use the same CORS, encryption, versioning, and logging settings
- Routing at edge: CloudFront Function rewrites are faster and cheaper than Lambda@Edge alternatives
- Multi-tenancy without cross-site access: the function builds the key prefix from the Host header, so a request for one site resolves only inside that site's folder and cannot accidentally reach another site's content
Why Keep the Distribution ID the Same?
The CloudFront distribution ID E2Q4UU71SRNTMB remained constant; only the origin was changed. This avoided:
- DNS propagation delays (CNAME target stays the same)
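- A cold edge cache (a new distribution starts with nothing cached)
- Certificate re-attachment (the certificate and alternate domain name are configured per distribution, so a new distribution would need both set up again)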
- Monitoring/alerting reconfiguration
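The post does not record how the origin was switched, but one way to do it in place is to edit the existing `DistributionConfig` and submit it back with `aws cloudfront update-distribution --id <distribution-id> --distribution-config file://config.json --if-match <ETag>`. The sketch below covers only the edit step. `retargetOrigin` and the origin ID `site-origin` are hypothetical names.

```javascript
// Returns a copy of a DistributionConfig in which the origin with the given ID
// points at a new domain name. The origin ID is left unchanged so that existing
// cache behaviors keep targeting the same origin.
function retargetOrigin(config, originId, newDomainName) {
  var copy = JSON.parse(JSON.stringify(config));
  var origin = copy.Origins.Items.find(function (o) { return o.Id === originId; });
  if (!origin) {
    throw new Error("No origin with Id " + originId);
  }
  origin.DomainName = newDomainName;
  return copy;
}

// Trimmed, hand-written config; "site-origin" is a made-up origin ID.
var before = {
  Origins: {
    Quantity: 1,
    Items: [{
      Id: "site-origin",
      DomainName: "adamcherrycomics.dangerouscentaur.com.s3.us-east-1.amazonaws.com",
    }],
  },
  DefaultCacheBehavior: { TargetOriginId: "site-origin" },
};

var after = retargetOrigin(before, "site-origin", "dc-sites.s3.us-east-1.amazonaws.com");
console.log(after.Origins.Items[0].DomainName); // "dc-sites.s3.us-east-1.amazonaws.com"
```

Working on a copy keeps the fetched configuration available for comparison or rollback.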
What's Next
- Monitor for orphaned buckets: Implement a monthly audit to find unused per-site buckets across all projects
- Document the shared pattern: This consolidation should be the template for new sites in the Dangerous Centaur portfolio
- Consider optional apex domain: If adamcherrycomics.dangerouscentaur.com grows significantly, we can add a bare adamcherrycomics.com apex pointing to the same distribution
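The monthly audit from the first item could start with the same two criteria used in this cleanup: the bucket is empty, and no CloudFront origin points at it. The sketch below is one possible shape for that check. The input format and the object count for dc-sites are made up, and the output is a list of candidates for manual review rather than a deletion list.

```javascript
// buckets: [{ name, objectCount }] gathered however the audit collects them.
// originDomains: DomainName values from every CloudFront distribution's origins.
function orphanCandidates(buckets, originDomains) {
  return buckets
    .filter(function (b) {
      var referenced = originDomains.some(function (d) {
        return d.indexOf(b.name + ".s3") === 0;
      });
      return b.objectCount === 0 && !referenced;
    })
    .map(function (b) { return b.name; });
}

// Hand-written sample; the object count for dc-sites is a placeholder.
var candidates = orphanCandidates(
  [
    { name: "adamcherrycomics.dangerouscentaur.com", objectCount: 0 },
    { name: "dc-sites", objectCount: 42 },
  ],
  ["dc-sites.s3.us-east-1.amazonaws.com"]
);
console.log(candidates); // [ 'adamcherrycomics.dangerouscentaur.com' ]
```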