```html

Automating Charter Readiness Reports: Building a Multi-Stage Document Pipeline for JADA Operations

What Was Done

Over this development session, I built an end-to-end automation pipeline for generating and publishing charter readiness documentation for JADA weekend operations. The system now:

  • Queries the JADA internal calendar API to extract charter bookings for specific weekends
  • Generates structured HTML manifests and trip sheets from charter data
  • Publishes documents to S3 with CloudFront CDN distribution
  • Tracks readiness status and blockers in local Markdown documentation
  • Implements OAuth token refresh for secure API access

The pipeline was validated using the Quinn Male charter (scheduled for 2026-05-29) as a test case, with documents now published to the production S3 bucket for crew access.

Technical Architecture

Data Flow

The system follows a three-stage pipeline:

JADA Calendar API → Python Data Processing → HTML/Markdown Generation → S3 Publication

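The calendar API is the data source; the three stages are the steps that follow it. Stage 1 fetches raw charter data from the internal calendar using OAuth credentials. Stage 2 processes this data in Python and renders structured documents with Jinja2 templates (the two middle steps in the diagram). Stage 3 uploads the documents to S3 and invalidates the affected CloudFront paths.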
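
As a rough illustration of how the stages compose, here is a minimal sketch. The three callables are hypothetical stand-ins; the real interfaces of calendar_client.py, manifest_generator.py, and s3_publisher.py are not shown in this post.

```python
from typing import Callable, Dict, List

# Hypothetical stage signatures, chosen only for this sketch.
FetchFn = Callable[[str], List[dict]]        # charter date -> raw charter records
RenderFn = Callable[[dict], Dict[str, str]]  # charter record -> {filename: html}
PublishFn = Callable[[str, str], None]       # (object key, html) -> upload side effect


def run_pipeline(charter_date: str, fetch: FetchFn, render: RenderFn,
                 publish: PublishFn) -> List[str]:
    """Run fetch -> render -> publish for one date and return the published keys."""
    published = []
    for charter in fetch(charter_date):                 # Stage 1: fetch
        for filename, html in render(charter).items():  # Stage 2: generate
            key = f"charters/{charter_date}/{filename}"
            publish(key, html)                          # Stage 3: publish
            published.append(key)
    return published
```

Passing the stages in as functions keeps each one replaceable, so a dry run can swap the publisher for a function that only writes to a local dictionary.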

Source Data Integration

The JADA calendar system exposes charter events through a REST API endpoint. Rather than hard-coding charter details, the pipeline queries this source of truth, ensuring any changes made in the calendar (payment status, crew assignments, passenger manifests) automatically reflect in generated documents. This eliminates manual sync work and reduces documentation debt.
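
Selecting the charters for one weekend from the API response might look like the sketch below. The event shape (a dict with an ISO 8601 "start" string) is an assumption made for illustration, not the documented JADA schema.

```python
from datetime import date
from typing import List


def charters_in_range(events: List[dict], first_day: date, last_day: date) -> List[dict]:
    """Keep events whose start date falls within [first_day, last_day], oldest first."""
    selected = []
    for event in events:
        # Assumed field: "start" holds an ISO 8601 date or datetime string.
        start = date.fromisoformat(event["start"][:10])
        if first_day <= start <= last_day:
            selected.append(event)
    return sorted(selected, key=lambda e: e["start"])
```

Because the filter runs on every invocation against fresh API data, a booking edited in the calendar shows up in the next generated document without any separate sync step.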

OAuth token refresh was implemented to handle expired credentials gracefully. The system stores refresh tokens separately from access tokens, allowing long-lived automation without requiring manual re-authentication between runs.

Document Generation Pipeline

HTML Manifest Format

Trip manifests follow a two-section structure:

  • Header section: Charter metadata (date, vessel, captain, crew assignments, passenger count)
  • Manifest section: Passenger details in tabular format with name, contact, dietary restrictions, medical notes

The HTML output is semantically structured with proper heading hierarchy and data table markup, making documents screen-reader accessible and easy to parse programmatically if needed later.
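
To make the two-section structure concrete, here is a sketch that builds it with only the Python standard library. This is a stand-in for the Jinja2 templates the pipeline actually uses, and the field names (name, date, vessel, passengers, and the four column keys) are assumptions.

```python
from html import escape

COLUMNS = ["name", "contact", "dietary_notes", "medical_notes"]  # assumed field names


def render_manifest(charter: dict) -> str:
    """Return a header section plus a passenger table, escaping every value."""
    header_cells = []
    for column in COLUMNS:
        label = column.replace("_", " ").title()
        # scope="col" ties each data cell to its column header for screen readers.
        header_cells.append(f'<th scope="col">{escape(label)}</th>')

    body_rows = []
    for passenger in charter["passengers"]:
        cells = [f"<td>{escape(str(passenger.get(c, '')))}</td>" for c in COLUMNS]
        body_rows.append("<tr>" + "".join(cells) + "</tr>")

    title = escape(f"{charter['name']} ({charter['date']})")
    return (
        f"<h1>{title}</h1>\n"
        f"<section><h2>Charter details</h2><p>Vessel: {escape(charter['vessel'])}</p></section>\n"
        "<section><h2>Passenger manifest</h2>\n"
        f"<table><thead><tr>{''.join(header_cells)}</tr></thead>\n"
        f"<tbody>{''.join(body_rows)}</tbody></table></section>"
    )
```

Escaping matters here because passenger names and notes are free text entered elsewhere and end up inside markup.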

Template System

Templates are stored in /Users/cb/Documents/repos/shipcaptaincrew/templates/ and use Jinja2 syntax. This allows separating presentation logic from data transformation code:

{% for passenger in manifest.passengers %}
  <tr>
    <td>{{ passenger.name }}</td>
    <td>{{ passenger.contact }}</td>
    <td>{{ passenger.dietary_notes }}</td>
  </tr>
{% endfor %}

New document formats can be added by creating new templates without touching the data-fetching logic, which keeps presentation and data access cleanly separated.

Infrastructure and Storage

S3 Bucket Structure

Documents are published to the JADA operations S3 bucket with this directory structure:

s3://jada-operations-bucket/
├── charters/
│   ├── 2026-05-29/
│   │   ├── quinn-male-manifest.html
│   │   ├── quinn-male-trip-sheet.html
│   │   └── metadata.json
│   └── [other dates]/
└── archives/
    └── [historical charters]

This structure enables:

  • Quick lookup of documents by charter date
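  • Near-atomic updates (publish everything to the dated directory first, then repoint a "latest" reference; S3 has no symlinks, so that reference has to be a copied object or a small pointer object)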
  • Archive preservation for compliance and historical reference
  • CloudFront cache invalidation by path pattern
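
The key layout above can be derived from a charter's date and name. The sketch below shows one way to do it; slugify and charter_key are hypothetical helper names, not functions from s3_publisher.py.

```python
import re
from datetime import date


def slugify(name: str) -> str:
    """Lowercase a charter name and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))


def charter_key(charter_date: date, charter_name: str, doc_type: str) -> str:
    """Build the object key for one document under charters/<date>/."""
    return f"charters/{charter_date.isoformat()}/{slugify(charter_name)}-{doc_type}.html"


def metadata_key(charter_date: date) -> str:
    """Build the key of the per-date metadata.json file."""
    return f"charters/{charter_date.isoformat()}/metadata.json"
```

For example, charter_key(date(2026, 5, 29), "Quinn Male", "manifest") yields charters/2026-05-29/quinn-male-manifest.html, matching the tree shown earlier.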

CloudFront Distribution

The S3 bucket is served through a CloudFront distribution (ID stored in ops infrastructure config) with:

  • Origin: S3 bucket in us-west-2 (collocated with JADA servers)
  • Default TTL: 3600 seconds (1 hour), so an update reaches crew within the hour even if an invalidation is missed
  • Compression: Enabled for HTML/JSON to reduce bandwidth
  • Cache invalidation: Triggered after each publish using path patterns like /charters/2026-05-29/*

This CDN setup ensures crew members accessing documents from various locations receive cached copies from nearby edge locations, with automatic fallback to origin on cache miss or TTL expiry.
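
A post-publish invalidation could be sketched as follows. The payload shape follows boto3's CloudFront create_invalidation call; the helper names and the timestamp-based CallerReference are assumptions, and the client is passed in so the code does not depend on how credentials are set up.

```python
import time
from typing import Optional


def build_invalidation_batch(charter_date: str,
                             caller_reference: Optional[str] = None) -> dict:
    """Build the InvalidationBatch payload covering one charter date directory."""
    paths = [f"/charters/{charter_date}/*"]
    return {
        "Paths": {"Quantity": len(paths), "Items": paths},
        # CallerReference must be unique per request; a timestamp is one simple choice.
        "CallerReference": caller_reference or f"charter-{charter_date}-{int(time.time())}",
    }


def invalidate_charter(cloudfront_client, distribution_id: str, charter_date: str) -> dict:
    """Send the invalidation; cloudfront_client is expected to be boto3.client("cloudfront")."""
    return cloudfront_client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(charter_date),
    )
```

A single wildcard path per date keeps the request small, since a wildcard counts as one invalidation path no matter how many files it matches.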

Authentication and Security

The pipeline uses two separate credential types:

  • Calendar API: OAuth 2.0 with refresh token flow stored in ~/.jada/credentials/calendar-oauth.json
  • AWS: IAM role credentials via STS assume-role, avoiding long-lived access keys

The OAuth implementation handles token expiry by checking the token's expires_at timestamp before each API call. If expired, it automatically refreshes using the stored refresh token before proceeding. This prevents pipeline failures due to stale credentials.
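
The check-then-refresh logic can be sketched like this. It assumes expires_at is a Unix timestamp in seconds and that the refresh call returns the standard OAuth 2.0 fields access_token and expires_in; the refresh callable is a hypothetical placeholder for the actual HTTP request to the token endpoint.

```python
import time
from typing import Callable, Optional


def is_expired(token: dict, now: float, skew_seconds: int = 60) -> bool:
    """Treat the token as expired slightly early so it cannot lapse mid-request."""
    return now >= token["expires_at"] - skew_seconds


def ensure_fresh_token(token: dict, refresh: Callable[[str], dict],
                       now: Optional[float] = None) -> dict:
    """Return the token unchanged if still valid, otherwise a refreshed copy."""
    now = time.time() if now is None else now
    if not is_expired(token, now):
        return token
    response = refresh(token["refresh_token"])  # assumed to POST to the token endpoint
    return {
        "access_token": response["access_token"],
        # Some providers rotate refresh tokens; keep the old one if none is returned.
        "refresh_token": response.get("refresh_token", token["refresh_token"]),
        "expires_at": now + response["expires_in"],
    }
```

The 60-second skew is an arbitrary safety margin: a token that is valid when checked but expires during the request would otherwise still produce a failure.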

AWS credentials are refreshed using STS (Security Token Service) with temporary credentials that expire after 1 hour. The pipeline re-authenticates when needed rather than storing permanent keys.
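
Obtaining and renewing the temporary credentials might look like the sketch below. It relies on the response shape of boto3's STS assume_role call; the session name, the five-minute renewal margin, and the helper names are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def assume_role_credentials(sts_client, role_arn: str,
                            session_name: str = "charter-publisher") -> dict:
    """Request one-hour credentials; sts_client is expected to be boto3.client("sts")."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,  # hypothetical session name
        DurationSeconds=3600,
    )
    # Contains AccessKeyId, SecretAccessKey, SessionToken and Expiration.
    return response["Credentials"]


def needs_renewal(credentials: dict, now: Optional[datetime] = None,
                  margin: timedelta = timedelta(minutes=5)) -> bool:
    """Report whether the credentials expire within the safety margin."""
    now = now or datetime.now(timezone.utc)
    return now >= credentials["Expiration"] - margin
```

Checking needs_renewal before each upload batch lets a long run re-assume the role instead of failing partway through a publish.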

Local Project Structure

The development environment is organized as:

/Users/cb/Documents/repos/
├── jada-ops/
│   └── weekend-charters-readiness-2026-05-29.md
├── agent_handoffs/
│   └── projects/
│       └── quinn-male-charter.md
└── shipcaptaincrew/
    ├── tools/
    │   ├── manifest_generator.py
    │   ├── calendar_client.py
    │   └── s3_publisher.py
    ├── templates/
    │   ├── manifest.html
    │   └── trip-sheet.html
    └── scripts/
        └── publish_charter_docs.py

Markdown files in jada-ops/ track readiness status and blockers. The agent_handoffs/ directory maintains notes for team communication. The actual implementation lives in shipcaptaincrew/ where the tools are modular and reusable across different charter operations.

Key Design Decisions

Why Query the Calendar as Source of Truth

Rather than duplicating charter data in a separate database, the system queries the existing JADA calendar API. This means:

  • No data sync overhead: the calendar is always the source of truth
  • Changes made by schedulers in the calendar UI automatically propagate to documents
  • Reduced storage and maintenance complexity

Why S3 + CloudFront Instead of Direct Serving

S3 provides:

  • Durability: 99.999999999% object durability (11 9s)
  • Versioning: Retention of previous manifest versions, once versioning is enabled on the bucket
  • Access logs: Audit trail of document downloads for compliance, once server access logging is enabled