Unified GA4 Traffic Auditing and Multi-Platform Tracking Implementation
Over the past development session, we implemented a comprehensive Google Analytics 4 (GA4) auditing system across all Queen of San Diego platforms, established programmatic access to the GA4 Data API, and closed critical gaps in tracking instrumentation. This post details the technical approach, infrastructure changes, and decisions made to enable real-time traffic insights and campaign measurement.
What Was Done
We executed four parallel workstreams:
- Conducted a full HTML code audit across all site repositories to identify missing GA4 tracking codes
- Established OAuth2 service account authentication to the GA4 Data API for programmatic 30-day traffic pulls
- Created an orchestrator-driven reporting pipeline that generates actionable dashboard cards with traffic insights
- Mapped all GA4 property IDs to their respective platforms and created a single source of truth for tracking configuration
Technical Details: GA Code Audit Infrastructure
The audit process scanned three primary repository locations:
- /Users/cb/Documents/repos/sites/: Main site repositories (JADA, QOS main, sister properties)
- /Users/cb/Documents/repos/email/: Email template repositories
- /Users/cb/Documents/repos/tools/: Utility and tool codebases
For each repository, we parsed HTML files looking for the GA4 tracking pattern:
<script async src="https://www.googletagmanager.com/gtag/js?id=G-{PROPERTY_ID}"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-{PROPERTY_ID}');
</script>
The audit identified which pages were instrumented and which were not, then cross-referenced against the canonical GA property ID mapping we built. We discovered that while primary marketing pages had instrumentation, several secondary pages and utility tools were missing tracking entirely.
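The scan itself can be sketched in a few lines of Python. This is a minimal illustration, not the audit script we ran: the function names `find_measurement_ids`, `audit_repo`, and `missing_tracking` are hypothetical, and the regex assumes measurement IDs are uppercase alphanumerics after the `G-` prefix.

```python
import re
from pathlib import Path
from typing import Dict, List, Set

# Matches the measurement ID inside a gtag.js loader URL (assumed ID shape: G-XXXXXXXX).
GTAG_SRC = re.compile(r"googletagmanager\.com/gtag/js\?id=(G-[A-Z0-9]+)")


def find_measurement_ids(html: str) -> Set[str]:
    """Return every GA4 measurement ID loaded via gtag.js in the given HTML."""
    return set(GTAG_SRC.findall(html))


def audit_repo(root: Path) -> Dict[str, Set[str]]:
    """Map each HTML file under root (as a relative path) to the IDs it loads."""
    results: Dict[str, Set[str]] = {}
    for path in sorted(root.rglob("*.html")):
        html = path.read_text(encoding="utf-8", errors="replace")
        results[str(path.relative_to(root))] = find_measurement_ids(html)
    return results


def missing_tracking(results: Dict[str, Set[str]]) -> List[str]:
    """List the files in which no gtag.js loader was found."""
    return sorted(p for p, ids in results.items() if not ids)
```

Running `missing_tracking(audit_repo(Path("/Users/cb/Documents/repos/sites/")))` would list the uninstrumented pages for one of the locations above. Pages that inherit the tag from a shared template at build time would show up as false positives, so a scan like this is a starting point rather than a final verdict.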
GA4 Data API Access: OAuth Service Account Pattern
To enable programmatic access to GA4 traffic data, we implemented OAuth2 service account authentication rather than relying on manual exports or browser-based logins. This pattern is more reliable for scheduled reports and orchestrator integration.
The implementation lives in:
/Users/cb/Documents/repos/tools/reauth_ga.py
This script handles the complete OAuth flow:
- Reads a Google service account JSON credential file (provided separately through secure channels)
- Signs a JWT with the service account private key and exchanges it for a short-lived bearer token via Google's OAuth2 token endpoint
- Caches tokens in ~/.cache/google_auth/ to avoid repeated authentication requests
- Automatically refreshes expired tokens before making API calls
The service account must be granted access to each GA4 property through the Google Analytics Admin console. We granted "Editor"; the Viewer role is sufficient for read-only Data API queries and is the safer default. This is a one-time manual step: navigate to Admin → Property → Property Access Management, add the service account email address, and grant the role. No API keys or credentials live in code; the service account file is loaded from disk at runtime.
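The cache-then-refresh behaviour described above can be sketched as follows. This is an illustration under assumptions, not the contents of reauth_ga.py: the helper `get_token`, the cache file layout (`access_token` plus an `expires_at` epoch timestamp), and the 60-second refresh margin are all hypothetical. The `fetch` callable stands in for the real token exchange, which in practice could be done with the google-auth library (`service_account.Credentials.from_service_account_file` followed by `refresh`).

```python
import json
import time
from pathlib import Path
from typing import Callable, Optional, Tuple

REFRESH_MARGIN_SECONDS = 60  # assumed margin: refresh slightly before expiry


def get_token(
    cache_file: Path,
    fetch: Callable[[], Tuple[str, float]],
    now: Optional[float] = None,
) -> str:
    """Return a cached token if still valid, otherwise fetch and cache a new one.

    fetch() must return (access_token, expires_at_epoch_seconds).
    """
    now = time.time() if now is None else now
    if cache_file.exists():
        try:
            cached = json.loads(cache_file.read_text())
            if cached["expires_at"] - REFRESH_MARGIN_SECONDS > now:
                return cached["access_token"]
        except (ValueError, KeyError, TypeError):
            pass  # unreadable or malformed cache: fall through and fetch again
    token, expires_at = fetch()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps({"access_token": token, "expires_at": expires_at}))
    cache_file.chmod(0o600)  # the token is a credential: keep it owner-readable only
    return token
```

Passing `now` explicitly makes the expiry logic easy to test without waiting for a real token to age out.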
GA4 Data API Query Implementation
Once authenticated, we query the GA4 Data API (v1beta) using the google-analytics-data client library. The core query pattern pulls the last 30 days of traffic metrics:
from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import RunReportRequest

# With no arguments, the client falls back to Application Default Credentials.
client = BetaAnalyticsDataClient()

# PROPERTY_ID must be the numeric GA4 property ID, not the "G-..." measurement ID.
request = RunReportRequest(
    property=f"properties/{PROPERTY_ID}",
    date_ranges=[{"start_date": "30daysAgo", "end_date": "today"}],
    dimensions=[{"name": "pagePath"}, {"name": "deviceCategory"}],
    metrics=[
        {"name": "screenPageViews"},
        {"name": "totalUsers"},
        {"name": "averageSessionDuration"},
    ],
)
response = client.run_report(request)
We extract top pages, user acquisition sources, and engagement metrics, then feed this data into the orchestrator for analysis and card generation.
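As a rough sketch of that extraction step, the response rows can be flattened into plain dictionaries and then ranked. The helper names `rows_to_dicts` and `top_pages` are hypothetical; the attributes they read (`dimension_headers`, `metric_headers`, `rows`, `dimension_values`, `metric_values`, `.value`) are the fields of the Data API's report response, where metric values arrive as strings.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def rows_to_dicts(response) -> List[Dict[str, object]]:
    """Flatten a run_report response into one dict per row, keyed by header name."""
    dims = [h.name for h in response.dimension_headers]
    mets = [h.name for h in response.metric_headers]
    records = []
    for row in response.rows:
        record = {name: v.value for name, v in zip(dims, row.dimension_values)}
        # Metric values are returned as strings, so convert them to numbers here.
        record.update({name: float(v.value) for name, v in zip(mets, row.metric_values)})
        records.append(record)
    return records


def top_pages(records: List[Dict[str, object]], limit: int = 10) -> List[Tuple[str, float]]:
    """Sum page views per pagePath across device categories, highest first."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        totals[record["pagePath"]] += record["screenPageViews"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:limit]
```

Only page views are summed here. User counts are not additive across rows, since the same visitor can appear under several pages or devices, so totalUsers should be read per row or requested in a separate report.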
GA Property ID Mapping and Multi-Platform Consolidation
A critical discovery: property IDs were scattered across configuration files with no single source of truth. We created a mapping file at:
/Users/cb/.claude/projects/-Users-cb-Documents-repos/memory/ga_property_mapping.md
This consolidates every GA4 property ID and its associated platform:
- Queen of San Diego main site: G-{MAIN_PROPERTY}
- JADA sister property: G-{JADA_PROPERTY}
- Secondary marketing sites: mapped individually
This single source of truth prevents drift and enables the orchestrator to pull data from all properties in a single run without manual intervention.
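To show how such a file can drive the cross-reference, here is a small sketch that parses a mapping and flags IDs found in code that the mapping does not know about. The line format (`- Platform name: G-XXXX`, optionally in backticks) is an assumption based on the list above, not a description of the real file, and `parse_mapping` and `unknown_ids` are hypothetical names.

```python
import re
from typing import Dict, Iterable, Set

# Assumed line shape: "- <platform>: G-XXXX" with optional backticks around the ID.
MAPPING_LINE = re.compile(r"^\s*-\s*(?P<platform>.+?):\s*`?(?P<mid>G-[A-Z0-9]+)`?\s*$")


def parse_mapping(markdown: str) -> Dict[str, str]:
    """Return {platform name: measurement ID} for every line that matches."""
    mapping: Dict[str, str] = {}
    for line in markdown.splitlines():
        match = MAPPING_LINE.match(line)
        if match:
            mapping[match.group("platform").strip()] = match.group("mid")
    return mapping


def unknown_ids(found: Iterable[str], mapping: Dict[str, str]) -> Set[str]:
    """IDs seen in page source that do not appear in the canonical mapping."""
    return set(found) - set(mapping.values())
```

Because Data API requests address a property by its numeric ID rather than its `G-` measurement ID, a mapping used for both the code audit and the traffic pulls would likely need to record both values per platform.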
Orchestrator Integration and Dashboard Reporting
The orchestrator receives a full brief that includes:
- List of all GA property IDs to audit
- Date range (last 30 days)
- List of all site repositories to scan for missing tracking code
- Constant Contact campaign status queries
- Request to generate actionable recommendations for traffic growth and operational excellence
The orchestrator spawns as an autonomous agent and produces a dashboard card (ID: t-31aa2593) at:
https://progress.queenofsandiego.com/#card-t-31aa2593
The card includes five sections: GA code coverage by site, traffic metrics and trends, top pages by device type, campaign performance, and ranked recommendations for traffic growth and operational improvements.
Key Infrastructure Decisions
Why Service Account Auth vs. Browser OAuth: Service accounts are stateless, don't require user interaction, and can be rotated without code changes. They're ideal for long-running orchestrator jobs and scheduled reports. Browser OAuth would require manual token refresh or a complex session manager.
Why GA4 Data API v1beta: The Data API is the reporting API for GA4 properties; the older Analytics Reporting API (v4) serves only Universal Analytics. It handles multi-dimensional breakdowns (page path + device category) in a single request, which reduces API quota consumption.
Why Dashboard Cards as the Reporting Layer: Dashboard cards provide a persistent, discoverable home for reports. They're linked via hash navigation, shareable with stakeholders, and searchable. Reports buried in logs or email get lost; cards stay visible on the board.
Why GA Property ID Mapping: Scattering property IDs across repos creates silent failures when properties are updated or deprecated. A canonical mapping file gives us a single, version-controlled source of truth.
What's Next
Three immediate priorities emerged from the audit:
- Instrumentation Gaps: Deploy missing GA4 codes to secondary pages identified in the audit report. Template updates will be rolled out via CI/CD so that all new pages inherit tracking automatically.
- Traffic Trend Alerting: Build a weekly orchestrator job that compares current week traffic to historical baseline. If traffic dips below 90% of baseline