Fixing OAuth Token Expiration in Google Cloud: Moving from Testing Mode to Production
We've been stuck in a recurring nightmare: every seven days, our Google OAuth tokens expire and require manual re-authentication. The surface-level fix (re-running auth scripts) works temporarily, but the root cause has persisted through multiple attempted fixes. After diagnosing the issue across our Lambda functions, Apps Script deployments, and GCP configuration, we finally identified and resolved the fundamental problem: our Google Cloud OAuth application was locked in Testing mode, which enforces a hard seven-day refresh token expiration window.
The Problem: Testing Mode Token Expiration
Our booking automation and calendar sync systems authenticate with Google APIs through a single OAuth 2.0 application configured in Google Cloud. The application handles multiple scopes across several services:
- Google Calendar API (crew dispatch scheduling, booking confirmations)
- Gmail API (booking notifications, customer outreach)
- Google Sheets API (dashboard state management)
- Google Apps Script Execution API (remote trigger of setup functions)
When an OAuth application with the External user type is in Testing mode in Google Cloud Console, refresh tokens for anything beyond basic profile scopes are limited to a seven-day lifetime. Once that window closes, the token becomes invalid regardless of whether you have valid credentials or proper scopes. This forces users back through the OAuth consent screen repeatedly, breaking automated workflows.
Previous "fixes" in our codebase (located in /Users/cb/Documents/repos/tools/reauth_jada_all.py) were band-aids: they re-ran the full OAuth flow to issue new tokens, but the underlying seven-day clock kept ticking. We needed to move the application to Production mode, which removes the seven-day limit. Refresh tokens then remain valid until they are revoked or invalidated for another reason, such as going unused for six months.
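To make the seven-day clock concrete, here is a minimal sketch of the arithmetic, assuming the token's issue time was recorded somewhere. The function name and the example timestamps are hypothetical and not taken from our scripts.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a Testing-mode refresh token stops working seven days after issue.
TESTING_MODE_LIFETIME = timedelta(days=7)


def time_left_in_testing_mode(issued_at: datetime, now: datetime) -> timedelta:
    """Return the time remaining before a Testing-mode refresh token lapses.

    A negative result means the seven-day window has already closed.
    """
    return (issued_at + TESTING_MODE_LIFETIME) - now


# Hypothetical timestamps for illustration only.
issued = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
checked = datetime(2024, 1, 6, 9, 0, tzinfo=timezone.utc)
remaining = time_left_in_testing_mode(issued, checked)
print(remaining.days)  # 2
```

Re-running the auth flow only resets `issued_at`; it does nothing about the seven-day lifetime itself.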
Technical Diagnosis: Where Tokens Were Breaking
We traced token usage across three primary systems:
- AWS Lambda (calendar sync trigger): Token stored in Lambda environment variables; checked via `aws lambda get-function-configuration` for `GOOGLE_CALENDAR_TOKEN`
- Lightsail instance (crew dispatch cron jobs): Tokens synced to `/home/ubuntu/tokens/`; verified via direct SSH inspection
- Google Apps Script deployment (booking automation web app): Uses Script Properties for persistent token storage; deployed via `clasp push && clasp deploy`
Each system was independently re-authenticating when tokens expired, creating cascading sync failures across booking platforms (Boatsetter, Viator, Sailo). The root cause wasn't poor token distribution; it was that all tokens were being issued with the seven-day limit.
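When a refresh token has lapsed, the token endpoint normally answers with the OAuth 2.0 error code `invalid_grant`. The sketch below shows one way each system could tell that failure apart from other errors. The helper name and the sample body are illustrative assumptions, not code from our repositories.

```python
import json


def needs_reconsent(token_response_body: str) -> bool:
    """Return True when a token endpoint error means the refresh token is dead.

    OAuth 2.0 (RFC 6749) uses the error code "invalid_grant" for refresh
    tokens that are expired, revoked, or otherwise invalid.
    """
    try:
        payload = json.loads(token_response_body)
    except json.JSONDecodeError:
        return False  # Not a JSON error body; treat it as a different failure.
    return isinstance(payload, dict) and payload.get("error") == "invalid_grant"


# Illustrative error body of the kind returned for an expired or revoked token.
sample = '{"error": "invalid_grant", "error_description": "Token has been expired or revoked."}'
print(needs_reconsent(sample))  # True
```

A check like this lets a job log "re-consent required" instead of failing with a generic HTTP 400.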
Moving to Production: The GCP Configuration Change
Google Cloud's OAuth consent screen has two publication states:
- Testing: Limited to 100 test users; refresh tokens expire in 7 days; meant for development
- Production: Unlimited users; no seven-day limit on refresh tokens; may require Google verification, depending on the scopes requested
The fix requires navigating Google Cloud Console → APIs & Services → OAuth consent screen, then clicking the Publish app button to move the application to Production. This is a UI-only operation; we found no API endpoint to automate it. However, before publishing, the application must have:
- A valid application name and description (not "My Test App")
- Authorized domains configured (your domain where the web app lives)
- Privacy policy and terms of service URLs (even if placeholder URLs initially)
- Scopes limited to only what's actually needed (we scope to Calendar, Gmail, Sheets, and Apps Script Execution)
Our configuration already had these in place, so the transition was straightforward: one console click, then reissuing tokens with the new Production-mode application.
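Once a token has been reissued under the published application, one way to sanity-check it is to exchange the refresh token for an access token at Google's token endpoint. The following is a minimal sketch using only the standard library. The credential values are placeholders, the function name is hypothetical, and the network call is left commented out so the snippet runs without secrets.

```python
import json
import urllib.parse
import urllib.request

TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def build_refresh_request(client_id: str, client_secret: str,
                          refresh_token: str) -> urllib.request.Request:
    """Build the POST request that trades a refresh token for an access token."""
    form = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
        "grant_type": "refresh_token",
    }).encode("ascii")
    return urllib.request.Request(TOKEN_ENDPOINT, data=form, method="POST")


# Placeholder credentials; substitute real values from the OAuth client.
req = build_refresh_request("CLIENT_ID", "CLIENT_SECRET", "REFRESH_TOKEN")

# Uncomment to perform the exchange against the live endpoint:
# with urllib.request.urlopen(req) as resp:
#     access_token = json.load(resp)["access_token"]
```

A successful exchange well after the seventh day is the practical confirmation that the Testing-mode limit no longer applies.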
Token Reissue and Deployment Pipeline
Once the application was published to Production, we issued a fresh token using our existing OAuth flow (a local Python script that opens a browser for consent). The new token was then distributed across all three systems:
1. Lambda environment update:
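# Quote the Variables map so the shell does not brace-expand it.
# Note: --environment replaces the function's entire variable map.
aws lambda update-function-configuration \
  --function-name CalendarSync \
  --environment "Variables={GOOGLE_CALENDAR_TOKEN=$NEW_TOKEN,GOOGLE_CALENDAR_REFRESH_TOKEN=$NEW_REFRESH}"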
2. Lightsail token sync:
scp -i ~/.ssh/lightsail_key.pem \
~/tokens/token.pickle \
ubuntu@lightsail-host:/home/ubuntu/tokens/
3. Apps Script deployment:
Apps Script stores sensitive credentials in Script Properties (project-level configuration storage). We updated the token via clasp by running a setup function that stores the token, then deployed a new version:
clasp run calendarSyncSetup --params '["TOKEN_STRING"]'
clasp deploy --description "v99: Updated token for Production OAuth"
This ensures that if someone accesses the deployed web app URL (the custom domain serving the booking automation), the app uses the new Production-mode token for any Google API calls.
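One caveat with the Lambda step: updating a function's environment replaces the whole variable map, so any variable not listed in the update is dropped. A hedged sketch of a read-merge-write approach with boto3 follows. The helper names and the `LOG_LEVEL` variable are hypothetical, and boto3 is imported lazily so the merge logic can be run without AWS access.

```python
def merge_environment(existing: dict, updates: dict) -> dict:
    """Return a new variable map with updates applied on top of existing values."""
    merged = dict(existing)  # copy so the caller's dict is not mutated
    merged.update(updates)
    return merged


def update_lambda_tokens(function_name: str, updates: dict) -> None:
    """Read the current variables, merge in new tokens, and write the full map back."""
    import boto3  # lazy import: only needed when actually talking to AWS

    client = boto3.client("lambda")
    current = client.get_function_configuration(FunctionName=function_name)
    existing = current.get("Environment", {}).get("Variables", {})
    client.update_function_configuration(
        FunctionName=function_name,
        Environment={"Variables": merge_environment(existing, updates)},
    )


# Local illustration of the merge step (LOG_LEVEL is a made-up variable).
existing = {"LOG_LEVEL": "info", "GOOGLE_CALENDAR_TOKEN": "old"}
merged = merge_environment(existing, {"GOOGLE_CALENDAR_TOKEN": "new"})
print(merged)  # {'LOG_LEVEL': 'info', 'GOOGLE_CALENDAR_TOKEN': 'new'}
```

Calling `update_lambda_tokens("CalendarSync", {...})` would then refresh the token values while leaving unrelated variables in place.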
Why This Matters for Your Architecture
Your booking system spans multiple platforms (Boatsetter, Viator, Sailo, GetMyBoat) and pulls availability data from your JADA Internal calendar via iCal feeds. The calendar is the source of truth. When OAuth broke every seven days:
- The Lambda calendar sync stopped running (token invalid)
- iCal feeds became stale
- External platforms didn't reflect new bookings or cancellations
- Manual blocking (e.g., in Viator's admin console) had to substitute for automation
By moving to Production mode, the refresh token is now indefinitely valid. The automation layer no longer has an artificial expiration date. Your crews can confirm dates with confidence because calendar state is reliably synchronized.
Key Decision: Why Not Use Service Accounts?
Service accounts (which don't use refresh tokens) were considered but rejected because your workflows require user-level Gmail and Calendar access—specifically, your personal email inbox for booking notifications and your personal calendar for scheduling. Service accounts operate with a different identity and can't read your personal email or calendar without shared access, adding complexity and permission management overhead.
OAuth with user credentials is the right pattern here. Production mode simply removes the arbitrary seven-day expiration.
What's Next: Monitoring and Hardening
The token should now remain valid indefinitely, but we should add monitoring:
- Log token issue timestamps in Lambda to track lifespan
- Set CloudWatch alarms if calendar sync Lambda fails (indicating potential token revocation)
- Document the Production mode status in your infrastructure runbook so future engineers don't regress to Testing mode
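As a rough sketch of the alarm item above, the parameters for a CloudWatch alarm on the Lambda `Errors` metric could be assembled as shown here. The alarm name, threshold, period, and SNS topic ARN are assumptions chosen for illustration. The actual `put_metric_alarm` call is commented out so the snippet runs without AWS credentials.

```python
def sync_failure_alarm(function_name: str, sns_topic_arn: str) -> dict:
    """Build put_metric_alarm parameters that fire on any Lambda invocation error."""
    return {
        "AlarmName": f"{function_name}-errors",  # hypothetical naming scheme
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 300,  # evaluate in five-minute windows (assumed)
        "EvaluationPeriods": 1,
        "Threshold": 1,  # one or more errors in a window triggers the alarm
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no invocations is not a failure
        "AlarmActions": [sns_topic_arn],
    }


# The topic ARN below is a placeholder, not a real resource.
params = sync_failure_alarm(
    "CalendarSync", "arn:aws:sns:us-east-1:123456789012:oauth-alerts"
)
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**params)
```

An error alarm does not prove the token was revoked, but it shortens the time between a failure and someone checking.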
We should also harden the token storage itself. Currently tokens sit in Lambda environment variables, where anyone with permission to read the function configuration can see them, and in plaintext on the Lightsail instance. A longer-term improvement would be AWS Secrets Manager for encrypted storage with rotation policies, but that's a separate infrastructure refactor.
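For a sense of what that refactor might look like on the read side, here is a minimal sketch that loads tokens from Secrets Manager at runtime instead of from environment variables. It assumes the secret holds a JSON object. The secret layout and the function names are hypothetical.

```python
import json


def tokens_from_secret(response: dict) -> dict:
    """Parse the JSON payload out of a get_secret_value response."""
    return json.loads(response["SecretString"])


def load_google_tokens(secret_id: str) -> dict:
    """Fetch and decode the token secret; requires AWS credentials at call time."""
    import boto3  # lazy import so the parser above is usable offline

    client = boto3.client("secretsmanager")
    return tokens_from_secret(client.get_secret_value(SecretId=secret_id))


# Offline illustration with a stand-in response (values are placeholders).
fake_response = {"SecretString": '{"refresh_token": "REFRESH_TOKEN"}'}
print(tokens_from_secret(fake_response)["refresh_token"])  # REFRESH_TOKEN
```

The Lambda's execution role would also need permission to read the secret, which is part of why this belongs in a separate change.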
The seven-day curse is finally broken.