```html

Diagnosing and Fixing a Multi-Site GA4 Integration: OAuth Token Rotation, CloudFront Cache Invalidation, and Daemon Health Monitoring

During a routine health check of the jada-agent orchestrator daemon running on a Lightsail instance (34.239.233.28), we uncovered a cascading set of infrastructure issues spanning Google Analytics 4 authentication, static site deployment, and daemon task processing. This post details the diagnosis, remediation steps, and architectural decisions made to restore full functionality across three client sites.

The Situation: Multiple Failures in a Single Session

A development session began with a straightforward request: verify the health of the jada-agent.service daemon. The investigation quickly revealed three distinct problem areas:

  • Google OAuth token expiration blocking automated Google Sheets syncs for port sheet data
  • Site file organization and deployment issues on sailjada.com and 86from.com (whose local source directory was misnamed 86dfrom.com)
  • Booking widget JavaScript parsing errors in the queenofsandiego.com booking automation system

Each required a different technical approach, but all stemmed from a common root: incomplete lifecycle management of external integrations and deployment artifacts.

Daemon Health Diagnostics: SSH, Metrics, and Logs

The jada-key private key wasn't stored locally in ~/.ssh/, so we used two parallel approaches:

  • AWS Systems Manager Session Manager for direct shell access without key management
  • The Lightsail GetInstanceAccessDetails API action to fetch temporary SSH credentials
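
For the second approach, a minimal sketch of fetching temporary credentials with boto3 might look like the following. The helper name and the returned dictionary shape are our own illustration, not part of the daemon; the client is passed in so the function can be exercised without AWS access.

```python
from typing import Any, Dict


def fetch_temporary_ssh_access(client: Any, instance_name: str) -> Dict[str, str]:
    """Request short-lived SSH credentials for one Lightsail instance.

    `client` is expected to be a boto3 Lightsail client, e.g.
    boto3.client("lightsail"). `instance_name` is whatever name the
    instance has in the Lightsail console (an assumption here).
    """
    response = client.get_instance_access_details(
        instanceName=instance_name,
        protocol="ssh",
    )
    details = response["accessDetails"]
    # Keep only the fields needed to open an SSH session.
    return {
        "host": details["ipAddress"],
        "username": details["username"],
        "private_key": details["privateKey"],
        "cert_key": details["certKey"],
    }
```

The private key and certificate are typically written to temporary files with 0600 permissions and passed to ssh with -i; both are short-lived, so nothing needs to be kept in ~/.ssh/.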

Once connected via SSH, the service health picture emerged clearly:

Service: jada-agent.service
Status: Active (running)
Uptime: 3 days since May 10, 2026
Load Average: 0.00 (idle between tasks)
Memory: 144MB / 914MB (16% utilization)
Disk: 6.2GB / 39GB (17% used)

Sessions processed today (5/13):

  • Session 1 (00:00 UTC): Hit max turn limit (30) — exit code 1
  • Session 2 (00:02 UTC): Completed successfully — processed e-signature page blockers, created follow-up task
  • Session 3 (00:05 UTC): Hit max turn limit (30) — exit code 1

CPU metrics over the past 2 hours showed no spikes, and status checks reported zero failures. The daemon itself was healthy. The problem lay in its dependencies.

Root Cause #1: Broken Google OAuth Token for port_sheet_sync.py

The automated sync job that updates the port sheet from Google Sheets every 30 minutes had been failing on every run:

[port-sheet] token error: HTTP Error 400: Bad Request

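This token lives in the daemon's secrets store and is used by port_sheet_sync.py to authenticate with Google's Sheets API. Google OAuth access tokens expire after about an hour and are normally renewed automatically with a stored refresh token, so an HTTP 400 from the token request usually means the refresh token itself is no longer valid, for example because it expired or the user revoked access. The daemon logs showed this failure repeating every 30 minutes for at least 24 hours, meaning zero port sheet syncs had completed in that window.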

Why this matters: The port sheet tracks crew availability and scheduling constraints. Stale data cascades into booking conflicts and customer communication failures.

The fix: Re-authorize the Google account through a new OAuth consent flow and store the resulting token. The script /Users/cb/Documents/repos/tools/auth_ga.py was created to handle new Google authentication for the dangerouscentaur@gmail.com account, storing the new token back into the secrets directory where the daemon can access it. Once redeployed, the sync job resumes on its normal 30-minute schedule.
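
The bare "HTTP Error 400: Bad Request" in the log hides the detail that matters: the JSON error body returned by Google's token endpoint. The sketch below shows one way a sync script could surface it. It is not the code in port_sheet_sync.py; the function names and messages are hypothetical, and only the token endpoint URL and the standard OAuth 2.0 refresh parameters are taken from Google's documentation.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

TOKEN_URL = "https://oauth2.googleapis.com/token"


def describe_token_error(body: bytes) -> str:
    """Turn the JSON body of a failed token request into a short diagnosis."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return "unparseable error body"
    if not isinstance(payload, dict):
        return "unparseable error body"
    code = payload.get("error", "unknown")
    if code == "invalid_grant":
        # The refresh token is expired or revoked; retrying will not help.
        return "invalid_grant: refresh token expired or revoked; re-run the consent flow"
    return f"{code}: {payload.get('error_description', 'no description')}"


def refresh_access_token(client_id: str, client_secret: str, refresh_token: str) -> str:
    """Exchange a refresh token for a new access token, or raise with a diagnosis."""
    data = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
        "grant_type": "refresh_token",
    }).encode("ascii")
    request = urllib.request.Request(TOKEN_URL, data=data)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return json.load(response)["access_token"]
    except urllib.error.HTTPError as err:
        # Log the parsed error instead of only "HTTP Error 400: Bad Request".
        raise RuntimeError(describe_token_error(err.read())) from err
```

Logging the parsed error code would have made the 24 hours of repeated failures immediately recognizable as a credential problem rather than a transient API fault.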

Root Cause #2: Site Directory Naming and CloudFront Cache Inconsistency

During file system inspection, we found that the directory /Users/cb/Documents/repos/sites/86dfrom.com/ had been created during development but never renamed to match the actual domain. The site was deployed to S3 under the bucket path for 86from.com (without the extra 'd'), creating a mismatch between the source repo and the live deployment.

Action taken:

  • Renamed local directory: 86dfrom.com → 86from.com
  • Verified that the S3 bucket name matches the origin configured on the CloudFront distribution
  • Redeployed index.html and new SEO page content /what-does-86d-mean to the S3 origin
  • Invalidated CloudFront distribution cache with pattern: /*

The CloudFront distribution ID and S3 origin URL are managed in the deployment pipeline, ensuring cache headers and TTLs are properly configured (typically 3600 seconds for HTML, longer for static assets). By invalidating the entire distribution rather than specific paths, we ensured all edge nodes served fresh content within 60 seconds.
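
As an illustration of the invalidation step, here is a minimal boto3-based sketch. The helper names and the "deploy-" caller reference prefix are assumptions of ours; the distribution ID is supplied by the caller because it lives in the deployment pipeline.

```python
import time
from typing import Any, Dict, List, Optional


def build_invalidation_batch(paths: List[str],
                             caller_reference: Optional[str] = None) -> Dict[str, Any]:
    """Build the InvalidationBatch structure that CloudFront expects."""
    if caller_reference is None:
        # CloudFront requires a unique reference per invalidation request.
        caller_reference = f"deploy-{int(time.time())}"
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        "CallerReference": caller_reference,
    }


def invalidate(client: Any, distribution_id: str, paths: List[str]) -> str:
    """Submit the invalidation and return its ID for later status checks.

    `client` is expected to be boto3.client("cloudfront").
    """
    response = client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )
    return response["Invalidation"]["Id"]
```

Calling invalidate(client, distribution_id, ["/*"]) corresponds to the wildcard pattern used above; passing specific paths such as ["/index.html", "/what-does-86d-mean"] would be the narrower alternative.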

Root Cause #3: Booking Widget JavaScript Double-Brace Template Conflict

The most subtle issue: the booking automation widget in queenofsandiego.com's BookingAutomation.gs (a Google Apps Script file) uses double braces {{ }} for templating, which conflicts with Handlebars or Mustache-style template syntax in the HTML.

When index.html was processed by the deployment pipeline, any double-brace syntax outside the booking widget section was being interpreted as template variables and either escaped or stripped, breaking the widget's initialization.

Solution: Scope the double-brace replacement to only the booking widget section, leaving all other template literals intact:

// Before: {{ model_id }} appears throughout the file
// After: Single braces { model_id } only inside the <div id="booking-widget"> section
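
The scoped replacement can be sketched as follows. This is an illustration, not the pipeline's actual code: it assumes the widget sits in a single <div id="booking-widget"> element with no nested <div> inside it, and that placeholders are plain identifiers such as model_id.

```python
import re

# Assumption: the widget section contains no nested <div>, so the first
# closing </div> after the opening tag ends the section.
WIDGET_SECTION = re.compile(r'(<div id="booking-widget">)(.*?)(</div>)', re.DOTALL)
DOUBLE_BRACE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def rewrite_widget_braces(html: str) -> str:
    """Convert {{ name }} to { name } inside the booking widget section only."""

    def fix_section(match: "re.Match[str]") -> str:
        body = DOUBLE_BRACE.sub(r"{ \1 }", match.group(2))
        return match.group(1) + body + match.group(3)

    # Text outside the widget section is returned unchanged.
    return WIDGET_SECTION.sub(fix_section, html)
```

If the widget markup ever gains nested elements, a regex like this will cut the section short, and a real HTML parser would be the safer choice.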

We ran the corrected JavaScript through a JavaScript parser before deployment to confirm that no syntax errors remained. The versioned booking widget comment now includes a model ID marker for tracking which version is deployed:

<!-- booking-widget v2.3 model-id:87 -->

This allows the daemon and support tools to query which booking version is live without parsing the entire HTML file.
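
A support tool could read that marker with a single regular expression. The sketch below assumes the comment always follows the exact shape shown above; the function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches comments of the form: <!-- booking-widget v2.3 model-id:87 -->
MARKER = re.compile(
    r"<!--\s*booking-widget\s+v(\d+(?:\.\d+)*)\s+model-id:(\d+)\s*-->"
)


def read_widget_marker(html: str) -> Optional[Tuple[str, int]]:
    """Return (version, model_id) from the first marker found, or None."""
    match = MARKER.search(html)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

Returning None when the marker is missing lets the caller distinguish "no widget deployed" from a version mismatch.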

Deployment Pipeline and Infrastructure

All three fixes required coordination across several infrastructure components:

  • AWS S3 buckets: Separate buckets for production and staging environments (e.g., sailjada-prod, queenofsandiego-prod), with versioning enabled on production buckets for rollback capability
  • CloudFront distributions: Each production bucket has a corresponding distribution with origin access identity (OAI) and cache behaviors configured per content type
  • Daemon task queue: The jada-agent reads from a progress dashboard (Route53 + DynamoDB backend) to pick up deployment tasks, validate completion, and report errors
  • Secrets management: OAuth tokens and API credentials stored in a dedicated secrets directory on the Lightsail instance, with strict file permissions (0600) and rotation policies
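
The 0600 permission rule on the secrets directory is easy to check mechanically. A minimal sketch, assuming a flat directory of credential files (the function name and report format are ours):

```python
import stat
from pathlib import Path
from typing import List


def find_loose_secrets(secrets_dir: str) -> List[str]:
    """List files in secrets_dir whose permission bits are not exactly 0600."""
    loose = []
    for path in sorted(Path(secrets_dir).iterdir()):
        if not path.is_file():
            continue
        # S_IMODE strips the file-type bits and leaves only the permissions.
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode != 0o600:
            loose.append(f"{path.name}: {oct(mode)}")
    return loose
```

Running a check like this from the daemon's own health routine would flag a token file that was rewritten with default permissions after a re-authentication.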

Key Architectural Decisions