```html

Publishing Charter Documents to S3 and Invalidating CloudFront Cache: A Real-World Case Study

During a weekend charter operations session, we needed to publish dynamically generated charter manifests and trip sheets to a private S3 bucket served publicly through CloudFront, then invalidate the CloudFront cache so the new versions were available immediately. This post walks through the infrastructure decisions, command patterns, and debugging workflow we used to distribute these documents.

What Was Done

We generated two HTML documents for a charter booking (Quinn Male charter):

  • /tmp/quinn-male-manifest.html — Passenger manifest with names, contact info, and manifest details
  • /tmp/quinn-male-trip-sheet.html — Crew-facing trip sheet with operational details

These documents were then:

  1. Published to S3 bucket shipcaptaincrew-docs under the docs/ prefix
  2. Verified live via HTTPS at predictable URLs
  3. Invalidated on CloudFront distribution shipcaptaincrew-cdn so edge caches fetched the new versions
  4. Durably backed up to a local repository path for future reference

Technical Architecture and Infrastructure

Document Storage Structure

The shipcaptaincrew project uses a multi-tier document storage strategy:

  • S3 Bucket: shipcaptaincrew-docs (private, CloudFront-only access)
  • CloudFront Distribution: shipcaptaincrew-cdn (public-facing CDN)
  • Document Prefixes:
    • docs/ — Crew-page associated documents (manifests, trip sheets, crew assignments)
    • charters/ — Legacy charter-specific document storage
  • Local Backup: /Users/cb/Documents/repos/jada-ops/{charter-name}/ (durability/audit trail)

This separation between CloudFront-served documents and local backups provides both performance and long-term auditability.
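
As a rough illustration of this layout, the sketch below derives the three locations of a single document from a charter name and a file name. The function name doc_locations is hypothetical, and the CDN hostname is the one used in the verification step of this post; treat both as assumptions to adapt.

```shell
# Hypothetical helper: print where one charter document lives in each tier.
# Bucket, prefix, CDN host, and backup root are the values named in this post.
BUCKET="shipcaptaincrew-docs"
PREFIX="docs"
CDN_HOST="cdn.shipcaptaincrew.com"
BACKUP_ROOT="/Users/cb/Documents/repos/jada-ops"

doc_locations() {
  local charter="$1"   # e.g. quinn-male
  local file="$2"      # e.g. quinn-male-manifest.html
  printf 's3://%s/%s/%s\n' "$BUCKET" "$PREFIX" "$file"        # S3 object
  printf 'https://%s/%s/%s\n' "$CDN_HOST" "$PREFIX" "$file"   # public URL
  printf '%s/%s/%s\n' "$BACKUP_ROOT" "$charter" "$file"       # local backup
}

# Usage:
#   doc_locations quinn-male quinn-male-manifest.html
```

Keeping the mapping in one place means the upload, verification, and backup steps cannot drift apart on naming.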

Why This Architecture?

Rather than serving documents directly from S3 or from the application origin, we use CloudFront as the distribution layer because:

  • Geographic distribution: Manifest/trip sheet links sent via SMS to captains/crew across multiple locations benefit from edge caching
  • Cost efficiency: Data transfer from S3 to CloudFront is free, and repeat requests served from edge caches never reach S3
  • Decoupling: Application Lambda functions don't need to serve static documents; CloudFront fetches them directly from the S3 origin
  • Security: S3 bucket can be entirely private; no public-read ACLs needed
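
To make the security point concrete, here is a minimal sketch of a bucket policy that grants read access to a single CloudFront Origin Access Identity and nobody else. The function name render_oai_policy and the ID EXAMPLEOAIID are placeholders, not values from our account, and the actual policy on shipcaptaincrew-docs may differ.

```shell
# Hypothetical sketch: print a bucket policy that lets only one CloudFront
# Origin Access Identity read objects. The OAI ID passed in is a placeholder.
render_oai_policy() {
  local bucket="$1" oai_id="$2"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCloudFrontOAIRead",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity ${oai_id}"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}

# Usage (not run here): write the policy to a file, review it, then apply it.
#   render_oai_policy shipcaptaincrew-docs EXAMPLEOAIID > policy.json
#   aws s3api put-bucket-policy --bucket shipcaptaincrew-docs --policy file://policy.json
```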

Publishing Workflow and Commands

Step 1: Authentication and Session Management

Before any S3 operations, we ensured AWS credentials were active:

$ aws sts get-caller-identity
# Returned account ID, user ARN, confirming active session

The AWS session uses environment variables populated from the credentials file at ~/.aws/credentials (not included in version control), so no credentials are hardcoded in scripts.
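
In a script, the same check can act as a guard that stops before any upload is attempted. The following is a minimal sketch; the function name require_aws_session is hypothetical.

```shell
# Hypothetical preflight: stop before any upload if no AWS session is active.
require_aws_session() {
  local account
  # --query and --output trim the JSON response down to the account ID.
  if ! account="$(aws sts get-caller-identity --query Account --output text 2>/dev/null)"; then
    echo "No active AWS session; refresh credentials and retry." >&2
    return 1
  fi
  echo "Using AWS account ${account}"
}

# Usage:
#   require_aws_session || exit 1
```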

Step 2: Uploading Documents to S3

We uploaded the manifest and trip sheet using the AWS CLI with explicit content-type headers:

$ aws s3 cp /tmp/quinn-male-manifest.html \
  s3://shipcaptaincrew-docs/docs/quinn-male-manifest.html \
  --content-type "text/html; charset=utf-8" \
  --acl private

$ aws s3 cp /tmp/quinn-male-trip-sheet.html \
  s3://shipcaptaincrew-docs/docs/quinn-male-trip-sheet.html \
  --content-type "text/html; charset=utf-8" \
  --acl private

Key decisions:

  • --acl private ensures the S3 objects themselves are not public-readable; CloudFront OAI (Origin Access Identity) provides the only access path
  • --content-type text/html prevents browsers from downloading HTML as an attachment
  • Explicit charset avoids encoding mismatches for international charter names/crew
  • docs/ prefix aligns with how the Lambda backend queries documents (see build_event_pages in the SPA)
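
The two uploads differ only in the file name, so they can be wrapped in one function. This is a sketch rather than the script we ran: publish_doc and the DRY_RUN variable are hypothetical names, and the flags are the ones from the commands above.

```shell
# Hypothetical wrapper around the uploads above. With DRY_RUN=1 it only
# prints the command, which helps when checking paths before publishing.
BUCKET="shipcaptaincrew-docs"

publish_doc() {
  local src="$1"
  local key="docs/$(basename "$src")"
  # Build the command once so the dry run and the real run cannot differ.
  set -- aws s3 cp "$src" "s3://${BUCKET}/${key}" \
    --content-type "text/html; charset=utf-8" \
    --acl private
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage:
#   for f in /tmp/quinn-male-manifest.html /tmp/quinn-male-trip-sheet.html; do
#     publish_doc "$f" || break
#   done
```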

Step 3: Verifying Live Content

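After upload, we verified that both documents were reachable through the CloudFront distribution and blocked on direct S3 access. The checks for the manifest are shown; the trip sheet was checked the same way: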

$ curl -I https://shipcaptaincrew-docs.s3.amazonaws.com/docs/quinn-male-manifest.html
# HTTP/1.1 403 Forbidden (expected; direct S3 access blocked)

$ curl -I https://cdn.shipcaptaincrew.com/docs/quinn-male-manifest.html
# HTTP/1.1 200 OK (correct; accessed via CloudFront)

$ curl -s https://cdn.shipcaptaincrew.com/docs/quinn-male-manifest.html | grep "Quinn Male"
# Confirmed passenger names appear in live content

The 403 on direct S3 access is intentional: it confirms that anonymous requests to the bucket are denied, leaving CloudFront as the only access path.
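
The status checks above can be turned into a repeatable assertion. The sketch below uses hypothetical function names (http_status, expect_status) and issues a GET rather than the HEAD request that curl -I sends, so it assumes both return the same status for these URLs.

```shell
# Hypothetical check: compare the HTTP status of a URL with what we expect.
http_status() {
  # -s silences progress, -o discards the body, -w prints only the status code.
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

expect_status() {
  local url="$1" want="$2" got
  got="$(http_status "$url")"
  if [ "$got" = "$want" ]; then
    echo "ok   $want $url"
  else
    echo "FAIL $got (wanted $want) $url" >&2
    return 1
  fi
}

# Usage:
#   expect_status https://cdn.shipcaptaincrew.com/docs/quinn-male-manifest.html 200
#   expect_status https://cdn.shipcaptaincrew.com/docs/quinn-male-trip-sheet.html 200
```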

Step 4: CloudFront Cache Invalidation

Once documents were live, we invalidated the CloudFront cache to ensure immediate propagation:

$ aws cloudfront create-invalidation \
  --distribution-id E1A2B3C4D5E6F7 \
  --paths "/docs/quinn-male-manifest.html" "/docs/quinn-male-trip-sheet.html"

# Returned invalidation ID: I2F3E4D5C6B7A8

Why invalidate rather than wait for TTL expiry?

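  • Charter operations are time-sensitive; a captain needs the current manifest immediately, not after a cached copy expires up to an hour later
  • Invalidation cost is negligible for occasional charter operations (the first 1,000 invalidation paths each month are free)
  • TTL for /docs/* is set to 3600 seconds (1 hour); invalidation removes that wait, because edge caches refetch the files as soon as the invalidation completes, typically within minutes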
  • Wildcards in invalidation paths (e.g., /docs/*) would invalidate all charter documents, defeating cache efficiency; we invalidate only the specific files changed
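
Because an invalidation is asynchronous, a script that sends links right afterwards may want to block until it finishes. The following sketch combines create-invalidation with the CLI's invalidation-completed waiter; the function name invalidate_and_wait is hypothetical, and the distribution ID in the usage line is the example value from above.

```shell
# Hypothetical wrapper: invalidate specific paths, then block until done.
invalidate_and_wait() {
  local dist_id="$1"; shift
  local inv_id
  # --query and --output extract just the invalidation ID from the response.
  inv_id="$(aws cloudfront create-invalidation \
    --distribution-id "$dist_id" \
    --paths "$@" \
    --query 'Invalidation.Id' --output text)" || return 1
  echo "Created invalidation ${inv_id}"
  # The waiter polls until the invalidation status is Completed.
  aws cloudfront wait invalidation-completed \
    --distribution-id "$dist_id" --id "$inv_id"
}

# Usage:
#   invalidate_and_wait E1A2B3C4D5E6F7 \
#     "/docs/quinn-male-manifest.html" "/docs/quinn-male-trip-sheet.html"
```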

Step 5: Local Durability Backup

After confirming live availability, we backed up documents to the local repository:

$ cp /tmp/quinn-male-manifest.html \
  /Users/cb/Documents/repos/jada-ops/quinn-male/quinn-male-manifest.html

$ cp /tmp/quinn-male-trip-sheet.html \
  /Users/cb/Documents/repos/jada-ops/quinn-male/quinn-male-trip-sheet.html
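
Plain cp fails if the charter folder does not exist yet and gives no confirmation that the copy is intact. A slightly more defensive version might look like this; backup_doc is a hypothetical name, and the byte-for-byte comparison is an addition that was not part of the original session.

```shell
# Hypothetical backup helper: copy a file into the charter's folder and
# confirm the copy is byte-identical to the source.
backup_doc() {
  local src="$1" dest_dir="$2"
  local dest="${dest_dir}/$(basename "$src")"
  mkdir -p "$dest_dir" || return 1     # create the charter folder if needed
  cp "$src" "$dest" || return 1
  cmp -s "$src" "$dest" || { echo "Backup mismatch: $dest" >&2; return 1; }
  echo "Backed up to $dest"
}

# Usage:
#   backup_doc /tmp/quinn-male-manifest.html \
#     /Users/cb/Documents/repos/jada-ops/quinn-male
```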