Diagnosing and Remediating the JADA Agent Daemon: OAuth Token Expiration and Turn Limit Analysis

```html

On May 13, 2026, a health check of the JADA orchestrator daemon running on AWS Lightsail instance 34.239.233.28 revealed a healthy service with one critical authentication failure and a recurring operational pattern worth documenting. This post covers the diagnostic approach, findings, and remediation path.

Service Health Status: Mostly Operational

The jada-agent.service systemd unit has been running continuously since May 10; the underlying instance has been up for 11 days. Key metrics:

CPU utilization: 0.65% average over the polling window, with no spike anomalies
Memory consumption: 144 MB of 914 MB
Disk usage: 6.2GB of 39GB (17%), leaving ample headroom for logs and task artifacts
Status checks: Zero failures in the prior 2-hour window
Load average: 0.00, indicating the daemon is idle between task executions

The instance is stable. The daemon's 60-second polling loop accounts for the minimal CPU footprint.

Session Activity and Turn Limit Behavior

Today's activity log shows three sessions started within the first five minutes of the UTC day:

Session 1 (00:00 UTC): Hit max turn limit (30 turns) and exited with code 1
Session 2 (00:02 UTC): Completed successfully; processed e-signature and crew page blockers, created a needs-you task
Session 3 (00:05 UTC): Hit max turn limit again; exited with code 1

After session 3, the daemon found no pending tasks in the progress dashboard and resumed normal idle polling. This pattern is important: the "max turns" exits are not daemon crashes or service failures. They are expected terminations when a Claude agent session reaches its 30-turn conversation limit. The daemon logs these as error-level exits (code 1) but continues normal operation on the next polling cycle.

Why this matters: If tasks are growing to the point that complex work needs more than 30 turns, there are two remediation paths: raise the per-session turn limit, or decompose larger tasks into smaller, sequential subtasks that each fit within a single session's budget.

Critical Issue: Broken Google OAuth Token in port_sheet_sync

The port_sheet_sync.py script, which syncs port sheet data to Google Sheets every 30 minutes, has been failing consistently since at least early afternoon UTC with the same error:

[port-sheet] token error: HTTP Error 400: Bad Request

This indicates the stored Google OAuth token for the service account or user account is expired or has been revoked. No port sheet syncs have executed since the failures began.

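Root cause: Google OAuth2 access tokens expire after 3600 seconds (1 hour) by default, so access-token expiry alone is routine and is normally handled by a refresh. An HTTP 400 at the token step suggests the refresh itself is being rejected: the refresh token is most likely expired or revoked (Google's token endpoint answers such requests with HTTP 400 and an invalid_grant error), or the credential file may have been corrupted during a deployment or manual intervention.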

Affected component: The token is stored in the jada agent's credential cache, likely in a JSON file referenced by the `port_sheet_sync.py` script during initialization.

Diagnostic Approach: SSH Access via Lightsail API

The private key for the jada-key SSH key pair was not available in the local ~/.ssh/ directory. Rather than recreate or redeploy the key, we used the AWS Lightsail API to generate temporary SSH credentials:

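# Fetch temporary SSH credentials from the Lightsail API
# (the /tmp file names are illustrative; parsing assumes jq is installed)
umask 077
aws lightsail get-instance-access-details \
  --instance-name jada-agent \
  --protocol ssh \
  --region us-east-1 > /tmp/jada_access.json

# The response holds a temporary private key and certificate, valid for 60 seconds
# (its expiresAt field is authoritative); write both to disk and SSH immediately
jq -r '.accessDetails.privateKey' /tmp/jada_access.json > /tmp/jada_temp_key.pem
jq -r '.accessDetails.certKey' /tmp/jada_access.json > /tmp/jada_temp_key.pem-cert.pub
ssh -i /tmp/jada_temp_key.pem \
  -o CertificateFile=/tmp/jada_temp_key.pem-cert.pub \
  -o StrictHostKeyChecking=accept-new \
  ubuntu@34.239.233.28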

This approach avoids long-term key storage on the development machine and limits credential exposure: temporary credentials are generated on demand and discarded after the session.

Data Collected via SSH

Once connected, we collected:

Service status: systemctl status jada-agent.service
Recent logs: journalctl -u jada-agent.service -n 100 --no-pager
Process info: ps aux | grep jada-agent
System metrics: free -h, df -h, uptime
Daemon logs with error context: Last 2 hours of stderr/stdout from the daemon's log file

All data was collected without modifying the system or interrupting service.

Remediation Path

To resolve the port_sheet_sync failure:

Identify the credential file path used by port_sheet_sync.py (likely in /home/ubuntu/.jada_creds/ or similar)
Run the Google OAuth re-authentication flow: python3 auth_ga.py --account [service-account-email]
Verify the new token is written and that the script can read it
Trigger a manual sync: python3 port_sheet_sync.py --force
Monitor the next three 30-minute cycles for errors in the daemon logs

For the turn limit behavior, review the task descriptions for sessions 1 and 3 to determine whether they can be split into smaller, sequentially dependent tasks, or whether the turn limit should be increased in the daemon's configuration.

Key Decisions

Why Lightsail API for SSH access: Eliminates the need to manage long-lived SSH keys on development machines; reduces attack surface.
Why we didn't restart the service: The service is healthy and performing its intended function. Restarting would interrupt any in-progress work.
Why we logged max-turn exits but didn't escalate: These are design-level signals, not errors. They tell us task scope is hitting session limits; the daemon recovers normally.

What's Next

The daemon is stable and ready for continued operation. Priority actions are to re-authenticate the Google OAuth token for port sheet syncs and to review the turn limit behavior in task design. No infrastructure changes are required at this time.

```
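Before re-running the consent flow for port_sheet_sync, it can help to confirm why the token step returns HTTP 400. The sketch below is not taken from port_sheet_sync.py. It assumes the script uses user OAuth credentials with a refresh token, and the function names and remediation strings are hypothetical. It sends a standard refresh-token grant to Google's token endpoint and classifies the error body, since the bare "400: Bad Request" in the log hides the `error` field that distinguishes a revoked refresh token from a bad client secret.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Google's OAuth 2.0 token endpoint.
TOKEN_URL = "https://oauth2.googleapis.com/token"


def classify_token_error(status: int, body: str) -> str:
    """Map a token-endpoint failure to a likely next step (hypothetical helper)."""
    try:
        error = json.loads(body).get("error", "")
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object.
        error = ""
    if status == 400 and error == "invalid_grant":
        return "refresh token expired or revoked: re-run the consent flow"
    if error == "invalid_client":
        return "client ID or secret rejected: check the credential file"
    return f"unclassified failure (HTTP {status}): inspect the response body"


def try_refresh(client_id: str, client_secret: str, refresh_token: str) -> str:
    """Attempt a refresh-token grant and report the outcome as a short string."""
    form = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
        "grant_type": "refresh_token",
    }).encode()
    request = urllib.request.Request(TOKEN_URL, data=form)  # data implies POST
    try:
        with urllib.request.urlopen(request, timeout=10):
            return "refresh succeeded: the stored refresh token is still valid"
    except urllib.error.HTTPError as exc:
        body = exc.read().decode("utf-8", "replace")
        return classify_token_error(exc.code, body)
```

Calling `try_refresh(...)` with the values from the stored credential file would show whether re-authentication is actually required or whether the file itself is the problem. If the sync authenticates as a service account instead, no refresh token is involved and this check does not apply.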