Automating Multi-Platform Google Analytics Audits and Orchestrating Operational Intelligence Reports
When you manage traffic and engagement across multiple digital properties—each with different hosting infrastructures, content management systems, and deployment patterns—ensuring comprehensive analytics instrumentation becomes a sprawling operational problem. This post covers how we automated a complete GA4 audit across all properties, integrated the findings into an orchestrator-driven intelligence system, and surfaced actionable insights on a real-time kanban dashboard.
What Was Done
We executed a three-part automated workflow:
- GA Code Audit: Swept all HTML files across multiple repositories to verify Google Analytics 4 measurement ID presence and correct implementation.
- Traffic Data Aggregation: Pulled last-30-days GA4 event data across all properties using the GA4 Data API with service account authentication.
- Orchestrator Report Generation: Fed audit results and traffic data into a background orchestrator that generated structured findings, identified gaps, checked email campaign status, and produced a live dashboard card with recommendations.
The entire workflow runs asynchronously; findings surface on the progress dashboard within minutes, not hours.
Technical Details: The Audit Pipeline
GA Code Detection Across Repositories
The first challenge: we maintain code across multiple repos—/Users/cb/Documents/repos/memory, /Users/cb/Documents/repos/tools, and content directories for each property. Rather than manually reviewing each HTML template, we built a scanner that:
- Recursively finds all *.html files in project directories
- Searches for the GA4 measurement ID pattern: gtag('config', 'G-XXXXXXXXXX')
- Maps each property to its GA4 property ID
- Flags files missing instrumentation or still using legacy Universal Analytics (UA-XXXXXXX-X) tracking codes
- Logs results to a structured JSON report for downstream processing
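The scanner steps above can be sketched roughly as follows. This is a minimal illustration, not the production scanner: the function names (`audit_html_file`, `audit_tree`), the status labels, and the report fields are assumptions made for the example, and the property-mapping step is left out.

```python
import json
import re
from pathlib import Path

# GA4 measurement IDs look like G-XXXXXXXXXX; legacy Universal Analytics
# IDs look like UA-12345-1.
GA4_CONFIG = re.compile(r"""gtag\(\s*['"]config['"]\s*,\s*['"](G-[A-Z0-9]+)['"]""")
UA_ID = re.compile(r"\bUA-\d+-\d+\b")


def audit_html_file(path: Path) -> dict:
    """Classify one HTML file as ok, legacy_ua, or missing (hypothetical labels)."""
    text = path.read_text(encoding="utf-8", errors="replace")
    ga4_ids = sorted(set(GA4_CONFIG.findall(text)))
    ua_ids = sorted(set(UA_ID.findall(text)))
    if ga4_ids:
        status = "ok"  # a GA4 config call is present (any UA IDs are still listed)
    elif ua_ids:
        status = "legacy_ua"  # only an old Universal Analytics ID was found
    else:
        status = "missing"  # no analytics instrumentation detected
    return {"file": str(path), "status": status, "ga4_ids": ga4_ids, "ua_ids": ua_ids}


def audit_tree(root: str) -> list[dict]:
    """Recursively audit every *.html file under root."""
    return [audit_html_file(p) for p in sorted(Path(root).rglob("*.html"))]


def to_json_report(results: list[dict]) -> str:
    """Serialize the findings for downstream processing."""
    return json.dumps(results, indent=2)
```

Calling `to_json_report(audit_tree("/path/to/repo"))` would produce the structured report; a regex check like this only confirms that the snippet is present in the file, not that the tag actually fires in the browser.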
The scanner checks critical entry points:
- /index.html landing pages
- Template headers in email campaign systems (to catch dynamic page generation)
- Dashboard templates (like progress.queenofsandiego.com)
- Checkout and conversion funnels (where missing GA is most costly)
Why this matters: A single untracked page in a conversion funnel means lost visibility into user drop-off. By automating detection, we catch gaps before they compound into missing revenue data.
GA4 Data API Integration with Service Account Auth
To pull programmatic traffic data, we needed to authenticate to the GA4 Data API. Rather than relying on user OAuth tokens (which expire and require manual reauthentication), we implemented service account authentication:
# Scopes required for GA4 Data API read-only access
SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
# Service account auth flow:
# 1. Load service account JSON key from secure storage
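# 2. Create a signed JWT whose audience is Google's OAuth 2.0 token endpoint
#    (the GA4 property ID is not part of the token; it goes in the API request)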
# 3. Exchange JWT for access token (valid for 1 hour)
# 4. Use token to call runReport() on Google Analytics Data API v1
We built /Users/cb/Documents/repos/tools/reauth_ga.py to handle token refresh and API calls. Key method:
def get_ga_traffic_last_30_days(property_id):
    """
    Pulls event count, sessions, and user count for last 30 days.

    Args:
        property_id: GA4 numeric property ID (not measurement ID)

    Returns:
        Dict with metrics_rows and date_ranges
    """
    # Fetch token, build request, execute runReport()
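To make the "build request, execute runReport()" step concrete, here is a minimal sketch using only the standard library against the Data API's REST endpoint (`properties/{id}:runReport` under `v1beta`). It is not the contents of reauth_ga.py: the helper names are hypothetical, the access token is assumed to have been obtained already through the service account flow, and the date range (`30daysAgo` through `yesterday`, i.e. 30 full days) is one reasonable reading of "last 30 days".

```python
import json
import urllib.request

API_ROOT = "https://analyticsdata.googleapis.com/v1beta"


def build_run_report_body() -> dict:
    """Request body: event count, sessions, and users over the last 30 full days."""
    return {
        "dateRanges": [{"startDate": "30daysAgo", "endDate": "yesterday"}],
        "metrics": [
            {"name": "eventCount"},
            {"name": "sessions"},
            {"name": "totalUsers"},
        ],
    }


def build_run_report_request(property_id: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for one GA4 property."""
    url = f"{API_ROOT}/properties/{property_id}:runReport"
    return urllib.request.Request(
        url,
        data=json.dumps(build_run_report_body()).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_report(property_id: str, access_token: str) -> dict:
    """Send the request and return the raw JSON response (needs a valid token)."""
    request = build_run_report_request(property_id, access_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request construction from execution keeps the part that can be checked offline (URL, metrics, headers) apart from the network call. This sketch returns the raw API response rather than the `metrics_rows` / `date_ranges` dict described in the docstring above.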
Critical detail: GA4 uses two different IDs. The G-XXXXXXXXXX measurement ID goes in HTML; the numeric property ID goes in API calls. Mixing them fails silently, which we discovered during implementation.
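One way to keep the two IDs from being confused is to validate the value before any API call is made. The guard below is a hypothetical helper (`require_property_id` is not part of the original tooling) and assumes measurement IDs always take the `G-` plus uppercase alphanumerics form shown above.

```python
import re

MEASUREMENT_ID = re.compile(r"G-[A-Z0-9]+")  # goes in HTML, e.g. G-XXXXXXXXXX
PROPERTY_ID = re.compile(r"\d+")             # goes in Data API calls


def require_property_id(value: str) -> str:
    """Return value if it is a numeric GA4 property ID, else raise ValueError."""
    if PROPERTY_ID.fullmatch(value):
        return value
    if MEASUREMENT_ID.fullmatch(value):
        raise ValueError(
            f"{value!r} is a measurement ID (for HTML); "
            "the Data API needs the numeric property ID"
        )
    raise ValueError(f"{value!r} is not a GA4 property ID")
```

Calling this at the top of the traffic-fetching function turns an ID mix-up into an immediate, descriptive error.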
Infrastructure: Dashboard and Orchestrator Integration
Real-Time Card Generation
Rather than dumping audit output to logs, we wanted live visibility on our kanban dashboard. The orchestrator workflow creates a card on progress.queenofsandiego.com with deep-link support:
# Card deep link format (hash-based navigation in dashboard JS)
https://progress.queenofsandiego.com/#card-t-31aa2593
# The dashboard router watches hash changes and renders the matching card.
# Reference note confirming that hash navigation is enabled in the dashboard:
# /Users/cb/Documents/repos/memory/feedback_dashboard_deep_links.md
The orchestrator generates a structured card object:
- Title: "GA Audit + Orchestrator Report"
- Sections: GA code gaps by site, traffic recommendations, email campaign status, API access gaps, operational excellence findings
- Status: Auto-set to "needs-you" if critical gaps are detected (e.g., zero GA Data API access, unapproved email blasts)
- Tags: automation, audit, infrastructure, reporting
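The card structure above could be modeled along these lines. This is a sketch under assumptions: the class and field names (`DashboardCard`, `critical_gaps`, `card_id`) and the non-critical status value `"ok"` are invented for illustration; only the title, tags, "needs-you" status, and deep-link format come from the description.

```python
from dataclasses import dataclass, field

DASHBOARD_URL = "https://progress.queenofsandiego.com/"


@dataclass
class DashboardCard:
    card_id: str                      # e.g. "t-31aa2593"
    title: str
    sections: dict[str, list[str]]    # section name -> list of findings
    critical_gaps: list[str] = field(default_factory=list)
    tags: list[str] = field(
        default_factory=lambda: ["automation", "audit", "infrastructure", "reporting"]
    )

    @property
    def status(self) -> str:
        # "needs-you" when anything critical was found; "ok" is a placeholder value.
        return "needs-you" if self.critical_gaps else "ok"

    @property
    def deep_link(self) -> str:
        # Hash-based link in the format the dashboard router expects.
        return f"{DASHBOARD_URL}#card-{self.card_id}"
```

Deriving `status` from `critical_gaps` rather than storing it separately means the card cannot report a clean status while still listing a critical gap.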
Constant Contact Campaign Status Check
We integrated Constant Contact export CSV parsing to surface scheduled campaigns. The orchestrator:
- Reads campaign log from S3: s3://[bucket-name]/campaign_logs/
- Detects sent vs. pending campaigns by checking contact dedup records
- Flags campaigns with past send dates but pending approval (red flag)
- Lists upcoming blast windows and requires explicit approval
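The flagging logic might look something like the sketch below. It simplifies the real workflow in two labeled ways: it takes CSV text directly instead of reading from S3, and it decides sent vs. pending from a `status` column rather than from contact dedup records. The column names (`name`, `send_date`, `status`) are assumptions, since the actual export layout is not shown here.

```python
import csv
import io
from datetime import date


def flag_campaigns(csv_text: str, today: date) -> dict[str, list[str]]:
    """Split pending campaigns into overdue (send date passed) and upcoming."""
    overdue: list[str] = []
    upcoming: list[str] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["status"].strip().lower() != "pending":
            continue  # already sent (or otherwise resolved); nothing to flag
        send_date = date.fromisoformat(row["send_date"].strip())
        if send_date < today:
            overdue.append(row["name"])   # red flag: missed its send window
        else:
            upcoming.append(row["name"])  # still needs explicit approval
    return {"overdue_pending": overdue, "upcoming_pending": upcoming}
```

Passing `today` in explicitly, rather than calling `date.today()` inside the function, keeps the check deterministic and easy to test.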
Why this integration: Email campaigns often sit in unapproved state, silently missing send windows. Surfacing them on the same operational dashboard as infrastructure audits ensures nothing gets missed in async handoffs.
Key Decisions and Trade-offs
Service Account vs. OAuth for Analytics
Decision: Use service account authentication for GA4 Data API.
Reasoning:
- Service accounts don't require an interactive user login, and token refresh happens automatically
- A single service account can be granted access to every GA4 property by adding its email address as a user in the GA Admin console (per property, or once at the account level)
- Tokens are scoped narrowly (analytics.readonly), reducing risk if exposed
- The account survives staff turnover—not tied to any individual user's Google account
Trade-off: Service account setup requires one-time GA Admin console configuration. But once configured, audits run hands-off daily.
Hash-Based Deep Links for Dashboard
Decision: Use hash-based URL routing for card deep links (e.g., #card-t-31aa2593).
Reasoning:
- Hash-based routing doesn't require server-side routing changes or CloudFront cache invalidation
- All navigation happens in the browser—dashboard JS already supports it
- Deep links work even if the card doesn't exist yet (useful during async orchestrator delays)
- Survives page refreshes and can be bookmarked
Implementation detail: Dashboard router in progress.queenofsandiego.com