Building a Dynamic Charter Document Pipeline: From Calendar Events to Live S3 Manifests

Over the past development session, I implemented an end-to-end document generation and publishing pipeline for JADA Operations' weekend charter events. The system extracts charter data from the Google Calendar API, generates structured HTML manifests and trip sheets, publishes them to S3, and invalidates CloudFront caches so that the crew management portal always serves the current version.

What Was Done

The primary objective was to automate the creation and publication of charter documents—specifically passenger manifests and trip sheets—that needed to be available on the queenofsandiego.com crew portal within minutes of event confirmation. Previously, this was a manual process prone to delays and formatting inconsistencies.

Fetched weekend charter events from the JADA Google Calendar using OAuth tokens
Parsed calendar event data to extract passenger names, contact information, and charter details
Generated formatted HTML manifests and trip sheets with consistent styling
Published generated documents to multiple S3 locations for redundancy
Invalidated CloudFront distribution caches to ensure immediate live availability
Integrated document provisioning into Lambda functions for event-driven processing

Technical Architecture

Calendar Integration

The system begins by querying the JADA Google Calendar API for events within a specific date range (in this case, the upcoming weekend). The implementation uses OAuth token refresh logic to maintain authentication across session boundaries:

import requests
from google.auth.transport.requests import Request
from google.oauth2.service_account import Credentials

# Calendar API queries retrieve all events with structured metadata
# Event objects contain booking details, passenger lists, and charter specifics
# Token refresh handled automatically by the Google auth library

The calendar serves as the source of truth for charter information. Each event contains structured custom fields with passenger names, contact phone numbers, and charter-specific metadata. This avoids maintaining a separate database and keeps operations teams working in their existing tools.
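
As a rough illustration of the date-range query described above, the sketch below computes the upcoming weekend window and builds the parameters for a Calendar API events.list call. The function names and the Saturday-to-Monday window are assumptions made for this example, not code from the pipeline; timeMin, timeMax, singleEvents, and orderBy are standard events.list parameters.

```python
from datetime import datetime, timedelta, timezone


def upcoming_weekend_window(now):
    """Return (start, end): the next Saturday 00:00 through Monday 00:00.

    Assumption: on a Saturday the current day is used; on a Sunday the
    window moves to the following weekend.
    """
    days_until_saturday = (5 - now.weekday()) % 7  # Monday=0 ... Saturday=5
    saturday = (now + timedelta(days=days_until_saturday)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return saturday, saturday + timedelta(days=2)


def build_events_query(now):
    """Build keyword arguments for a Calendar API events.list request."""
    start, end = upcoming_weekend_window(now)
    return {
        "timeMin": start.isoformat(),   # RFC 3339 timestamp with UTC offset
        "timeMax": end.isoformat(),
        "singleEvents": True,           # expand recurring events into instances
        "orderBy": "startTime",         # only valid when singleEvents is True
    }


params = build_events_query(datetime.now(timezone.utc))
```

With google-api-python-client, these parameters could be passed as service.events().list(calendarId=calendar_id, **params).execute(), where service and calendar_id come from the existing authentication setup.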

Document Generation

Charter manifests are generated as static HTML files with embedded styling. The manifest template (/Users/cb/Documents/repos/jada-ops/quinn-male/quinn-male-manifest.html) includes:

Vessel information and charter date/time
Captain and crew assignments
Complete passenger manifest with names and phone numbers
Trip sheet with location coordinates and timing information
CSS styling for print-friendly formatting

Generation is straightforward. The script extracts calendar event data, populates the HTML templates with passenger information, and writes the completed documents to the local filesystem before publishing:

# charter_provisioner.py handles manifest generation
# Pattern: calendar_event → extract_fields() → render_template() → write_html()

# Example structure from generated manifests:
# - Passenger names extracted from calendar event description
# - Phone numbers parsed from custom event fields
# - Vessel details populated from charter metadata
# - CSS includes print styles for crew operations
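
The extract, render, and write pattern named above could look roughly like the following. This is a minimal sketch, not the contents of charter_provisioner.py: the template, the one "Name, phone" pair per description line format, and the function bodies are all assumptions. The summary and description keys are standard Calendar event fields.

```python
from html import escape
from pathlib import Path
from string import Template

# Hypothetical template; the real manifest has more sections and print CSS.
MANIFEST_TEMPLATE = Template(
    "<html><body><h1>$charter_name</h1>"
    "<table>$passenger_rows</table></body></html>"
)


def extract_fields(event):
    """Pull the charter name and passenger list out of a calendar event dict.

    Assumes one "Name, phone" pair per line of the event description;
    lines without a comma are skipped.
    """
    passengers = []
    for line in event.get("description", "").splitlines():
        if "," in line:
            name, phone = line.split(",", 1)
            passengers.append({"name": name.strip(), "phone": phone.strip()})
    return {"charter_name": event.get("summary", ""), "passengers": passengers}


def render_template(fields):
    """Fill the template, HTML-escaping every value that came from the calendar."""
    rows = "".join(
        f"<tr><td>{escape(p['name'])}</td><td>{escape(p['phone'])}</td></tr>"
        for p in fields["passengers"]
    )
    return MANIFEST_TEMPLATE.substitute(
        charter_name=escape(fields["charter_name"]), passenger_rows=rows
    )


def write_html(html, path):
    """Write the rendered manifest to the local filesystem as UTF-8."""
    Path(path).write_text(html, encoding="utf-8")
```

Escaping matters here because calendar text is free-form: a stray angle bracket or ampersand in a passenger name would otherwise break the generated markup.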

S3 Publishing Strategy

Documents are published under two distinct prefixes in the shipcaptaincrew bucket to support different access patterns:

Primary Location: s3://shipcaptaincrew/docs/crew-page/{event_id}/ — Documents linked from the crew management portal SPA
Secondary Location: s3://shipcaptaincrew/snapshots/print/{charter_name}/ — Backup copies for durability and historical reference

Both locations use content-type headers set to text/html to ensure browsers render documents directly rather than prompting downloads. The S3 bucket policy allows public read access, enabling direct URL sharing with crew members and passengers.

Publishing goes through boto3, the AWS SDK for Python, with credentials refreshed explicitly to avoid session timeouts (the refresh step is omitted from this excerpt):

# send_charter_emails.py demonstrates the publishing pattern
import boto3

s3_client = boto3.client('s3', region_name='us-west-2')

# Publish manifest to crew-page docs
s3_client.put_object(
    Bucket='shipcaptaincrew',
    Key=f'docs/crew-page/{event_id}/manifest.html',
    Body=manifest_html,
    ContentType='text/html'
)

# Publish to backup location
s3_client.put_object(
    Bucket='shipcaptaincrew',
    Key=f'snapshots/print/{charter_name}/manifest.html',
    Body=manifest_html,
    ContentType='text/html'
)
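
Since the same upload is repeated for each prefix and each document, the two calls above could be folded into one helper. The sketch below is one possible shape, assuming a hypothetical publish_documents function and a documents mapping of filename to HTML string; the bucket name and key layout are taken from the calls above. It also adds an explicit charset to the content type, which is a choice made for this example rather than something the original code does.

```python
def publish_documents(s3_client, event_id, charter_name, documents,
                      bucket="shipcaptaincrew"):
    """Upload each document to both prefixes and return the keys written.

    `documents` maps a filename such as "manifest.html" to its HTML string.
    `s3_client` is expected to be a boto3 S3 client (or a test double).
    """
    written = []
    for filename, html in documents.items():
        for key in (
            f"docs/crew-page/{event_id}/{filename}",        # linked from the portal
            f"snapshots/print/{charter_name}/{filename}",   # backup copy
        ):
            s3_client.put_object(
                Bucket=bucket,
                Key=key,
                Body=html.encode("utf-8"),
                ContentType="text/html; charset=utf-8",
            )
            written.append(key)
    return written
```

Passing the client in as an argument keeps the helper easy to exercise with a stub in tests, and the returned keys can feed directly into the invalidation step.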

CloudFront Cache Invalidation

The critical piece that enables live document delivery is CloudFront cache invalidation. The queenofsandiego.com domain is backed by CloudFront distribution E2Y7EXAMPLE (exact ID redacted), which caches all S3 content with a 24-hour default TTL.

Without explicit invalidation, updated manifests would take up to 24 hours to appear live. The solution is to trigger CloudFront invalidation immediately after S3 upload:

import time

import boto3

cloudfront = boto3.client('cloudfront', region_name='us-east-1')

invalidation_response = cloudfront.create_invalidation(
    DistributionId='DISTRIBUTION_ID',
    InvalidationBatch={
        'Paths': {
            'Quantity': 2,
            'Items': [
                f'/docs/crew-page/{event_id}/manifest.html',
                f'/docs/crew-page/{event_id}/trip-sheet.html'
            ]
        },
        'CallerReference': str(time.time())
    }
)

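The invalidation lists specific paths rather than wildcards, so only the two regenerated documents are evicted from the cache; CloudFront also allows far fewer wildcard paths to be in progress at once than individual file paths. CallerReference is set to the current timestamp so that each request is unique, since CloudFront uses this value to detect accidental resubmissions of the same request.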
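
One way to keep Quantity and Items from drifting apart is to derive the batch from a list of paths. The helper below is a sketch under that assumption; build_invalidation_batch is a hypothetical name, and it swaps the timestamp used above for a UUID-based CallerReference, which stays unique even when two invalidations are created within the same clock tick.

```python
import uuid


def build_invalidation_batch(event_id,
                             filenames=("manifest.html", "trip-sheet.html")):
    """Build the InvalidationBatch dict for one event's documents."""
    paths = [f"/docs/crew-page/{event_id}/{name}" for name in filenames]
    return {
        "Paths": {
            "Quantity": len(paths),  # always matches the number of Items
            "Items": paths,
        },
        # Any unique string is accepted; a UUID avoids timestamp collisions.
        "CallerReference": f"{event_id}-{uuid.uuid4()}",
    }
```

The result can be passed straight to the existing call, for example cloudfront.create_invalidation(DistributionId=distribution_id, InvalidationBatch=build_invalidation_batch(event_id)).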

Lambda Integration

The document pipeline is integrated into /Users/cb/Documents/repos/sites/queenofsandiego.com/tools/shipcaptaincrew/lambda_function.py, which serves as the event handler for crew page document requests. When users navigate to a crew event page, the Lambda function:

Queries the calendar API for the specific event
Checks if cached documents exist in S3
Generates manifests on-demand if missing
Returns document URLs to the frontend SPA for display

This lazy-generation approach avoids regenerating documents that already exist in S3 while ensuring that a manifest is available whenever a crew member requests one.
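
The check-then-generate steps above can be sketched as follows. This is an illustration, not the code in lambda_function.py: object_exists, ensure_manifest, and the generate_manifest callback (standing in for the calendar lookup and rendering) are hypothetical names. The existence check uses list_objects_v2 with the key as prefix, which returns an empty listing for a missing key instead of raising an error.

```python
def object_exists(s3_client, bucket, key):
    """Return True if `key` exists in `bucket`.

    S3 lists keys in lexicographic order, so an exact match, if present,
    is the first result for a prefix equal to the key itself.
    """
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=key, MaxKeys=1)
    return any(obj["Key"] == key for obj in response.get("Contents", []))


def ensure_manifest(s3_client, event_id, generate_manifest,
                    bucket="shipcaptaincrew"):
    """Generate and upload the manifest only if it is missing.

    `generate_manifest(event_id)` is assumed to return the manifest HTML
    as a string. Returns the document path for the frontend.
    """
    key = f"docs/crew-page/{event_id}/manifest.html"
    if not object_exists(s3_client, bucket, key):
        s3_client.put_object(
            Bucket=bucket,
            Key=key,
            Body=generate_manifest(event_id).encode("utf-8"),
            ContentType="text/html",
        )
    return f"/{key}"
```

A consequence of this design is that an existing manifest is served as-is, so any change to the calendar event after first generation would need a separate regeneration trigger.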

Key Architectural Decisions

Why Google Calendar as the source of truth? The operations team already maintains charter schedules in Google Calendar with custom fields for passenger details. Using the calendar API eliminates the need for custom database infrastructure and keeps the system aligned with existing workflows.

Why dual S3 locations? The crew-page docs location is directly referenced by the web portal frontend, while the snapshots location provides historical archives and backup copies. This redundancy protects against accidental deletion and supports compliance/record-keeping requirements.

Why explicit CloudFront invalidation? Cache invalidation is more reliable than relying on TTL expiration for time-sensitive operational documents. Crew members need current manifests immediately, not eventual consistency hours later.

Why static HTML over dynamic rendering? Pre-generated manifests load instantly and are trivial to cache. Dynamic rendering would add latency and complexity during high-traffic periods (multiple crews accessing documents simultaneously before charter departure).

Implementation Details

The complete