Deploying a Receipt Management System for quickdumpnow.com: S3, CloudFront, and Custom Error Handling
Overview
This post documents the deployment of a dedicated receipt management page for a trailer rental business operating under quickdumpnow.com. The work involved creating a new content path, working out a dual-key S3 upload strategy, troubleshooting CloudFront's custom error responses, and controlling search-engine crawling through robots.txt configuration.
What Was Done
We deployed a receipt upload and management interface at https://quickdumpnow.com/books by:
- Creating and validating an HTML receipt form at /Users/cb/Documents/repos/sites/quickdumpnow.com/books/index.html
- Uploading the page to S3 using a dual-key strategy for pretty URLs
- Configuring robots.txt to block search engine indexing of the /books path
- Invalidating CloudFront distribution caches to ensure immediate propagation
- Diagnosing and working around CloudFront's custom error response behavior
Technical Details: S3 Deployment Strategy
The quickdumpnow.com infrastructure uses an S3 bucket (name withheld for security) as the origin for a CloudFront distribution. When deploying the books page, we faced a common challenge: supporting "pretty" URLs without filename extensions.
The solution involved uploading the same compiled HTML content to two S3 keys:
- books/index.html: traditional directory structure
- books: bare key for direct path access
This dual-key approach ensures that both /books/ and /books serve the correct content without redirects. The reasoning: when CloudFront fetches from an S3 REST origin, neither service resolves directory index files or normalizes trailing slashes the way a traditional web server would. By providing both keys, we guarantee coverage for user agents that may or may not append a trailing slash.
Upload command pattern used:
aws s3 cp books/index.html s3://[bucket-name]/books/index.html --content-type "text/html"
aws s3 cp books/index.html s3://[bucket-name]/books --content-type "text/html"
Both uploads set an explicit Content-Type: text/html header. Without it, S3 stores the extensionless books key with its default type (binary/octet-stream), and browsers would download the page rather than render it.
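To confirm the metadata landed as intended, a head-object check against both keys is enough; a minimal sketch, using the same [bucket-name] placeholder as above:
aws s3api head-object --bucket [bucket-name] --key books --query ContentType
aws s3api head-object --bucket [bucket-name] --key books/index.html --query ContentType
Both commands should print "text/html".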
Infrastructure: CloudFront Cache Invalidation
After uploading to S3, we invalidated the CloudFront distribution to force edge servers to fetch fresh content. The distribution serving quickdumpnow.com required invalidation of two paths:
- /books/*: wildcard to catch all /books paths
- /robots.txt: exact match for the updated robots configuration
Invalidation pattern:
aws cloudfront create-invalidation --distribution-id [DIST_ID] --paths "/books/*" "/robots.txt"
CloudFront invalidations typically complete within a few minutes, and the status can be monitored via the AWS Console or CLI. We confirmed propagation by checking the ETag and Age headers on the responses.
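For reference, the status check looks like this; [INVALIDATION_ID] stands in for the Id returned by create-invalidation, and the curl line is an illustrative way to spot-check the headers mentioned above:
aws cloudfront get-invalidation --distribution-id [DIST_ID] --id [INVALIDATION_ID] --query 'Invalidation.Status'
curl -sI https://quickdumpnow.com/books | grep -iE 'etag|age|x-cache'
The status reads "Completed" once the edges are flushed; a fresh fetch then shows "Miss from cloudfront" in the x-cache header.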
The CloudFront Custom Error Response Problem
During testing, we discovered that https://quickdumpnow.com/books was returning the homepage instead of the books page. This behavior stems from CloudFront's custom error response configuration.
The distribution was configured with:
- Error code: 404
- Response path: /index.html
- HTTP response code: 200 (masking the error)
This configuration is intended to support single-page applications (SPAs) by rewriting 404s to the homepage. However, it also masks genuinely missing objects: if an S3 key doesn't exist, the origin's 404 triggers the error handler, and the visitor silently receives the homepage with a 200 status.
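In the distribution config JSON returned by get-distribution-config (the command appears below), this behavior is expressed roughly as follows; the ErrorCachingMinTTL value is illustrative, not taken from the live distribution:
"CustomErrorResponses": {
  "Quantity": 1,
  "Items": [
    {
      "ErrorCode": 404,
      "ResponsePagePath": "/index.html",
      "ResponseCode": "200",
      "ErrorCachingMinTTL": 300
    }
  ]
}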
Why this happened: The books page wasn't yet uploaded when the first request came in, so S3 returned 404, CloudFront's error response intercepted it, and the user saw the homepage. Because CloudFront also caches error responses (per the error-caching TTL), the rewritten homepage response lingered at the edge even after the upload, which is why the invalidation step mattered.
Solution: Ensure S3 objects exist before traffic reaches them. We verified both S3 keys (books and books/index.html) were present and had correct Content-Type headers before invalidating CloudFront caches.
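A small guard can automate that ordering so an invalidation is never spent on missing content; a sketch, reusing the placeholders from earlier:
# Fail fast if either key is absent, then invalidate.
for key in books books/index.html; do
  aws s3api head-object --bucket [bucket-name] --key "$key" > /dev/null \
    || { echo "missing: $key" >&2; exit 1; }
done
aws cloudfront create-invalidation --distribution-id [DIST_ID] --paths "/books/*"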
We also checked the distribution origin configuration:
aws cloudfront get-distribution-config --id [DIST_ID]
This confirmed the origin points to the correct S3 bucket and that no path-based routing rules were interfering.
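Because the full config is verbose, a JMESPath filter narrows it to the origin wiring; this variant is a convenience worth noting, not necessarily the exact command run at the time:
aws cloudfront get-distribution-config --id [DIST_ID] --query 'DistributionConfig.Origins.Items[].[Id, DomainName, OriginPath]' --output table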
robots.txt Configuration
We updated /Users/cb/Documents/repos/sites/quickdumpnow.com/robots.txt to block search engine crawlers from indexing the receipt management page:
User-agent: *
Disallow: /books
Rationale: The books/receipts page is internal business documentation, not public-facing content. Preventing indexing reduces noise in search results and avoids potential data exposure if receipts contain sensitive information.
The robots.txt file itself was uploaded to S3 at the root key and invalidated in CloudFront to ensure crawlers fetch the updated version immediately.
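That step follows the same upload-and-invalidate pattern as the books page; a sketch with the same placeholders:
aws s3 cp robots.txt s3://[bucket-name]/robots.txt --content-type "text/plain"
aws cloudfront create-invalidation --distribution-id [DIST_ID] --paths "/robots.txt"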
Architecture Pattern: Static File Delivery at Scale
The infrastructure follows a standard CDN + Origin pattern:
- Origin: S3 bucket (regional, highly available)
- CDN: CloudFront (global edge locations, sub-100ms latency)
- Cache Control: Headers set via S3 metadata or CloudFront behaviors
- Error Handling: Custom 404 responses for SPA support
This pattern scales to millions of requests without additional infrastructure. S3 is designed for 99.99% availability, and CloudFront's globally distributed edge network keeps delivery reliable under load.
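As a concrete instance of the cache-control point above, headers can be attached at upload time; the max-age here is illustrative, not the site's actual policy:
aws s3 cp books/index.html s3://[bucket-name]/books/index.html --content-type "text/html" --cache-control "max-age=300"
With default behavior settings, CloudFront derives its TTL from the origin's Cache-Control header, so this one flag influences both edge and browser caching.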
Key Decisions
- Dual S3 keys instead of redirects: Faster, no extra HTTP roundtrips, simpler CloudFront rules.
- Block via robots.txt, not HTTP auth: simpler to manage and requires no credential rotation. Note that robots.txt only asks well-behaved crawlers not to index the path; it does not restrict access, which is why authentication appears under What's Next.
- Custom error response kept in place: Other routes may depend on SPA-style 404→index.html behavior; removing it could break existing functionality.
- Explicit Content-Type headers: Prevents S3 from guessing MIME types, ensures browsers render HTML correctly.
What's Next
The receipt page is now live at https://quickdumpnow.com/books. Future work includes:
- Implementing backend receipt upload handling (Lambda function or API Gateway + DynamoDB)
- Adding authentication/authorization to restrict access to business operators
- Integrating receipt processing (image recognition, metadata extraction)
- Setting up CloudWatch logs to monitor 404 rates and error response triggers
- Configuring S3 lifecycle policies to archive old receipts to Glacier