Deploying a Receipt Management System for quickdumpnow.com: S3, CloudFront, and Custom Error Handling
Overview
This post documents the deployment of a dedicated receipt management page for a trailer rental business operating under quickdumpnow.com. The work involved creating a new content path, working out a dual-key S3 upload strategy, troubleshooting CloudFront's custom error responses, and controlling search-engine crawling through robots.txt configuration.
What Was Done
We deployed a receipt upload and management interface at https://quickdumpnow.com/books by:
- Creating and validating an HTML receipt form at /Users/cb/Documents/repos/sites/quickdumpnow.com/books/index.html
- Uploading the page to S3 using a dual-key strategy for pretty URLs
- Configuring robots.txt to block search engine indexing of the /books path
- Invalidating CloudFront distribution caches to ensure immediate propagation
- Diagnosing and working around CloudFront's custom error response behavior
Technical Details: S3 Deployment Strategy
The quickdumpnow.com infrastructure uses an S3 bucket (name withheld for security) as the origin for a CloudFront distribution. When deploying the books page, we faced a common challenge: supporting "pretty" URLs without filename extensions.
The solution involved uploading the same compiled HTML content to two S3 keys:
- books/index.html: traditional directory structure
- books: bare key for direct path access
This dual-key approach ensures that both /books/ and /books serve the correct content without redirects. The reasoning: when CloudFront fetches from an S3 REST origin, neither service resolves directory index files or normalizes trailing slashes the way a traditional web server would. By providing both keys, we guarantee coverage for user agents that may or may not append a trailing slash.
Upload command pattern used:
aws s3 cp books/index.html s3://[bucket-name]/books/index.html --content-type "text/html"
aws s3 cp books/index.html s3://[bucket-name]/books --content-type "text/html"
Both uploads set an explicit Content-Type: text/html header. Without it, S3 stores the extensionless books key with its default type (binary/octet-stream), and browsers would download the page rather than render it.
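To confirm the metadata landed as intended, a head-object check against both keys is enough; a minimal sketch, using the same [bucket-name] placeholder as above:
aws s3api head-object --bucket [bucket-name] --key books --query ContentType
aws s3api head-object --bucket [bucket-name] --key books/index.html --query ContentType
Both commands should print "text/html".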
Infrastructure: CloudFront Cache Invalidation
After uploading to S3, we invalidated the CloudFront distribution to force edge servers to fetch fresh content. The distribution serving quickdumpnow.com required invalidation of two paths:
- /books/*: wildcard to catch all /books paths
- /robots.txt: exact match for the updated robots configuration
Invalidation pattern:
aws cloudfront create-invalidation --distribution-id [DIST_ID] --paths "/books/*" "/robots.txt"
CloudFront invalidations typically complete within a few minutes, and the status can be monitored via the AWS Console or CLI. We confirmed propagation by checking the ETag and Age headers on the responses.
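For reference, the status check looks like this; [INVALIDATION_ID] stands in for the Id returned by create-invalidation, and the curl line is an illustrative way to spot-check the headers mentioned above:
aws cloudfront get-invalidation --distribution-id [DIST_ID] --id [INVALIDATION_ID] --query 'Invalidation.Status'
curl -sI https://quickdumpnow.com/books | grep -iE 'etag|age|x-cache'
The status reads "Completed" once the edges are flushed; a fresh fetch then shows "Miss from cloudfront" in the x-cache header.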
The CloudFront Custom Error Response Problem
During testing, we discovered that https://quickdumpnow.com/books was returning the homepage instead of the books page. This behavior stems from CloudFront's custom error response configuration.
The distribution was configured with:
- Error code: 404
- Response path: /index.html
- HTTP response code: 200 (masking the error)
This configuration is intended to support single-page applications (SPAs) by rewriting 404s to the homepage. However, it also masks genuinely missing objects: if an S3 key doesn't exist, the origin's 404 triggers the error handler, and the visitor silently receives the homepage with a 200 status.
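In the distribution config JSON returned by get-distribution-config (the command appears below), this behavior is expressed roughly as follows; the ErrorCachingMinTTL value is illustrative, not taken from the live distribution:
"CustomErrorResponses": {
  "Quantity": 1,
  "Items": [
    {
      "ErrorCode": 404,
      "ResponsePagePath": "/index.html",
      "ResponseCode": "200",
      "ErrorCachingMinTTL": 300
    }
  ]
}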
Why this happened: The books page wasn't yet uploaded when the first request came in, so S3 returned 404, CloudFront's error response intercepted it, and the user saw the homepage. Because CloudFront also caches error responses (per the error-caching TTL), the rewritten homepage response lingered at the edge even after the upload, which is why the invalidation step mattered.
Solution: Ensure S3 objects exist before traffic reaches them. We verified both S3 keys (books and books/index.html) were present and had correct Content-Type headers before invalidating CloudFront caches.
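A small guard can automate that ordering so an invalidation is never spent on missing content; a sketch, reusing the placeholders from earlier:
# Fail fast if either key is absent, then invalidate.
for key in books books/index.html; do
  aws s3api head-object --bucket [bucket-name] --key "$key" > /dev/null \
    || { echo "missing: $key" >&2; exit 1; }
done
aws cloudfront create-invalidation --distribution-id [DIST_ID] --paths "/books/*"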
We also checked the distribution origin configuration:
aws cloudfront get-distribution-config --id [DIST_ID]
This confirmed the origin points to the correct S3 bucket and that no path-based routing rules were interfering.
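Because the full config is verbose, a JMESPath filter narrows it to the origin wiring; this variant is a convenience worth noting, not necessarily the exact command run at the time:
aws cloudfront get-distribution-config --id [DIST_ID] --query 'DistributionConfig.Origins.Items[].[Id, DomainName, OriginPath]' --output table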
robots.txt Configuration
We updated /Users/cb/Documents/repos/sites/quickdumpnow.com/robots.txt to block search engine crawlers from indexing the receipt management page:
User-agent: *
Disallow: /books
Rationale: The books/receipts page is internal business documentation, not public-facing content. Preventing indexing reduces noise in search results and avoids potential data exposure if receipts contain sensitive information.
The robots.txt file itself was uploaded to S3 at the root key and invalidated in CloudFront to ensure crawlers fetch the updated version immediately.
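That step follows the same upload-and-invalidate pattern as the books page; a sketch with the same placeholders:
aws s3 cp robots.txt s3://[bucket-name]/robots.txt --content-type "text/plain"
aws cloudfront create-invalidation --distribution-id [DIST_ID] --paths "/robots.txt"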
Architecture Pattern: Static File Delivery at Scale
The infrastructure follows a standard CDN + Origin pattern:
- Origin: S3 bucket (regional, highly available)
- CDN: CloudFront (global edge locations, sub-100ms latency)
- Cache Control: Headers set via S3 metadata or CloudFront behaviors
- Error Handling: Custom 404 responses for SPA support
This pattern scales to millions of requests without additional infrastructure. S3 is designed for 99.99% availability, and CloudFront's globally distributed edge network keeps delivery reliable under load.
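As a concrete instance of the cache-control point above, headers can be attached at upload time; the max-age here is illustrative, not the site's actual policy:
aws s3 cp books/index.html s3://[bucket-name]/books/index.html --content-type "text/html" --cache-control "max-age=300"
With default behavior settings, CloudFront derives its TTL from the origin's Cache-Control header, so this one flag influences both edge and browser caching.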
Key Decisions
- Dual S3 keys instead of redirects: Faster, no extra HTTP roundtrips, simpler CloudFront rules.
- Block via robots.txt, not HTTP auth: simpler to manage and requires no credential rotation. Note that robots.txt only asks well-behaved crawlers not to index the path; it does not restrict access, which is why authentication appears under What's Next.
- Custom error response kept in place: Other routes may depend on SPA-style 404→index.html behavior; removing it could break existing functionality.
- Explicit Content-Type headers: Prevents S3 from guessing MIME types, ensures browsers render HTML correctly.
What's Next
The receipt page is now live at https://quickdumpnow.com/books. Future work includes:
- Implementing backend receipt upload handling (Lambda function or API Gateway + DynamoDB)
- Adding authentication/authorization to restrict access to business operators
- Integrating receipt processing (image recognition, metadata extraction)
- Setting up CloudWatch logs to monitor 404 rates and error response triggers
- Configuring S3 lifecycle policies to archive old receipts to Glacier