Automating Google Analytics Audits and Multi-Platform Tracking: Building a GA Code Coverage System with Orchestrator Automation

Maintaining consistent analytics instrumentation across multiple platforms is a classic operational challenge. When you're managing content across different domains, static sites, and dynamic applications, ensuring every single page reports back to Google Analytics becomes a coordination problem that doesn't scale with manual checks. This post covers how we implemented an automated GA code audit system integrated with our orchestrator infrastructure to catch tracking gaps, identify missing API permissions, and generate actionable reports without manual intervention.

The Problem: Analytics Blind Spots at Scale

Queen of San Diego operates content across multiple platforms and domains. Each platform—marketing sites, campaign pages, email-driven content—needs to report traffic data back to our primary Google Analytics property. Without systematic verification, pages inevitably slip through without tracking code, creating blind spots in our traffic data. We needed:

  • Automated scanning of all HTML files across all deployed sites to verify GA code presence
  • Detection of missing Universal Analytics tags, GA4 tags, or event tracking implementation
  • Programmatic access to historical traffic data for the previous 30 days
  • Integration with our orchestrator to generate actionable reports without manual dashboarding
  • Identification of operational gaps (missing API permissions, unapproved campaigns, upcoming deadlines)

Technical Architecture: Multi-Component Orchestration

Rather than building a single monolithic script, we decomposed this into discrete components that our orchestrator agent could coordinate:

Component 1: GA Code Audit Scanner

The first component performs a recursive scan across all deployed HTML sources. The scanner targets three primary file locations:

  • /var/www/queenofsandiego.com/public_html/ — main domain static files
  • ~/Documents/repos/marketing-site/dist/ — Next.js build output for marketing properties
  • S3 bucket: qosd-static-assets in us-west-2 region — campaign landing pages and email-driven content

The scanner searches for these specific patterns in each HTML file:

gtag('config', 'GA_MEASUREMENT_ID')
ga('create', 'UA-XXXXXXXXX-X')
_gaq.push(['_trackPageview'])
gtag('event', ...)
analytics.google.com/analytics/web/

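Why this multi-pattern approach? We support legacy Universal Analytics (UA) properties, modern Google Analytics 4 (GA4) implementations, and custom event tracking. Each pattern indicates a different generation of analytics code: _gaq.push belongs to the classic ga.js queue, ga('create', ...) to Universal Analytics via analytics.js, and the gtag calls to gtag.js, which GA4 uses. We need to detect all of them to ensure complete coverage.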
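
As a rough illustration of the multi-pattern check, here is a minimal Python sketch. It is not the production scanner: the regexes, the GA_PATTERNS table, and the function names are all assumptions made for this example, and it covers only local directories (the S3 bucket would need a separate listing step). It treats a page as covered if any pattern matches.

```python
import re
from pathlib import Path

# Hypothetical pattern table: names and regexes are illustrative assumptions,
# loosely mirroring the snippets listed above.
GA_PATTERNS = {
    "ga4_config": re.compile(r"gtag\(\s*['\"]config['\"]\s*,\s*['\"]G-[A-Z0-9]+['\"]"),
    "ua_analytics_js": re.compile(r"ga\(\s*['\"]create['\"]\s*,\s*['\"]UA-\d+-\d+['\"]"),
    "classic_ga_js": re.compile(r"_gaq\.push\(\s*\[\s*['\"]_trackPageview['\"]"),
    "gtag_event": re.compile(r"gtag\(\s*['\"]event['\"]"),
}


def detect_ga_patterns(html: str) -> set[str]:
    """Return the names of every tracking pattern found in one HTML document."""
    return {name for name, rx in GA_PATTERNS.items() if rx.search(html)}


def audit_directory(root: Path) -> dict[str, set[str]]:
    """Recursively scan *.html under root and map each file to its detected patterns."""
    report = {}
    for path in sorted(root.rglob("*.html")):
        html = path.read_text(encoding="utf-8", errors="replace")
        report[str(path.relative_to(root))] = detect_ga_patterns(html)
    return report


def untracked_pages(report: dict[str, set[str]]) -> list[str]:
    """Pages with no detected pattern at all are the coverage gaps."""
    return sorted(page for page, found in report.items() if not found)
```

Keeping the per-pattern names in the report, rather than a single yes/no flag, makes it possible to tell pages still on a legacy tag apart from pages with no tag at all.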

Component 2: Google Analytics Data API Client

The second component requests GA reporting data programmatically through the Google Analytics Data API v1 (the GA4 successor to the Universal Analytics Reporting API). This requires:

  • A service account with JSON keyfile stored in our secrets manager
  • Proper IAM role assignment in Google Cloud Console
  • An "Editor" or "Analyst" role on the GA4 property, granted in the Google Analytics Admin console (read-only reporting needs no more than "Viewer")

The audit discovered we hadn't yet granted the service account access to the GA4 property. This is a common operational gap: the service account exists in GCP, but the GA property itself needs explicit permission, which is assigned in the GA Admin UI by adding the service account's email address under Admin > Property > Property Access Management.
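
To make the permission gap concrete, the sketch below builds a 30-day runReport request against the Data API's REST endpoint and maps an HTTP 403 to a "permission gap" finding. It is a hedged illustration, not our client: the function names and status labels are invented for this example, the property ID is a placeholder, and obtaining the OAuth access token from the service account keyfile is assumed to happen elsewhere.

```python
import json
import urllib.error
import urllib.request

# REST base URL for the Google Analytics Data API (v1beta endpoint).
DATA_API_BASE = "https://analyticsdata.googleapis.com/v1beta"


def build_run_report_request(property_id: str) -> tuple[str, dict]:
    """Build the URL and JSON body for a page-view report over the last 30 days."""
    url = f"{DATA_API_BASE}/properties/{property_id}:runReport"
    body = {
        "dateRanges": [{"startDate": "30daysAgo", "endDate": "yesterday"}],
        "dimensions": [{"name": "pagePath"}],
        "metrics": [{"name": "screenPageViews"}],
    }
    return url, body


def classify_http_status(status: int) -> str:
    """Map an HTTP error status to an audit finding (labels are hypothetical)."""
    if status == 403:
        # Typical when the service account has no role on the GA4 property.
        return "permission_gap"
    if status == 401:
        return "bad_credentials"
    return f"http_{status}"


def check_property_access(property_id: str, access_token: str) -> str:
    """Run the report once; return 'ok: N rows' or a classified finding."""
    url, body = build_run_report_request(property_id)
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            rows = json.load(response).get("rows", [])
            return f"ok: {len(rows)} rows"
    except urllib.error.HTTPError as err:
        return classify_http_status(err.code)
```

Separating request construction from the network call keeps the date range and metric choice easy to inspect and test without credentials.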

Component 3: Email Campaign Status Aggregator

The third component integrates with Constant Contact via its API to pull:

  • Scheduled email campaigns with their approval status
  • Campaign send dates and timezone handling
  • Proof-of-concept approval workflows and pending items

The aggregator found that the Mother's Day blast (scheduled for April 29) was still unapproved with only 4 days until send, and that the Paul Simon event campaign needed proof review by May 12.
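
The deadline check itself is simple date arithmetic. The sketch below shows one way to flag unapproved campaigns that are close to their send date. The Campaign record, the flag_at_risk function, and the 7-day warning window are assumptions for illustration. It does not call the Constant Contact API; it expects campaign data that has already been fetched.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Campaign:
    """Hypothetical, simplified campaign record (not the Constant Contact schema)."""
    name: str
    send_date: date
    approved: bool


def flag_at_risk(campaigns: list[Campaign], today: date, warn_days: int = 7) -> list[tuple[str, int]]:
    """Return (name, days_left) for unapproved campaigns sending within warn_days."""
    flagged = []
    for campaign in campaigns:
        days_left = (campaign.send_date - today).days
        if not campaign.approved and 0 <= days_left <= warn_days:
            flagged.append((campaign.name, days_left))
    # Most urgent first.
    return sorted(flagged, key=lambda item: item[1])
```

For example, with an audit run on April 25 and a blast scheduled for April 29 of the same year, flag_at_risk reports 4 days left for that campaign if it is still unapproved.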

Infrastructure: Orchestrator Integration and Report Generation

Rather than executing these components sequentially through manual scripts, we integrated them into our orchestrator agent system. Here's the execution flow:

Agent: "GA audit + orchestrator report"
├── Spawn: GA Code Scanner
│   ├── Read: /var/www/*/public_html/*.html
│   ├── Read: S3://qosd-static-assets/*
│   ├── Parse:  and