```html

Building a GA Audit Pipeline with Orchestrator Integration: Automating Analytics Discovery Across Multi-Platform Infrastructure

What Was Done

We executed a comprehensive Google Analytics audit across all Queen of San Diego properties, automatically detecting GA code gaps, pulling 30-day traffic data, and feeding findings into our orchestrator system to generate actionable recommendations. The entire workflow was automated into a single agent task that spawned parallel GA code sweeps, traffic analysis, and Constant Contact campaign validation—all results landing as structured kanban cards on our progress dashboard.

Technical Details

GA Code Audit Architecture

The audit process works in three stages:

  • HTML Crawl Phase: Script recursively scans all site HTML files looking for Google Analytics tracking codes (both Universal Analytics gtag format and older GA.js patterns). The crawler hits these repositories:
    • /Users/cb/Documents/repos/web/ (main marketing site)
    • /Users/cb/Documents/repos/tools/ (dashboard and internal tools)
    • /Users/cb/Documents/repos/email-templates/ (landing pages linked from campaigns)
  • Pattern Matching: Uses regex to identify both active implementations (<script async src="https://www.googletagmanager.com/gtag/js">) and configuration blocks (window.dataLayer initialization). The audit flags pages with no tracking code as "gaps."
  • 30-Day Traffic Pull: Connects to Google Analytics Reporting API v4 (requires service account credentials in GA Admin console—this was a blocker initially). Pulls metrics like sessions, users, pageviews, and bounce rate segmented by page path and traffic source.

Orchestrator Integration

Rather than running GA queries in isolation, we integrated the audit into our orchestrator agent system. The orchestrator receives:

  • Raw GA code gap list (pages without tracking)
  • 30-day traffic JSON from the Reporting API
  • Constant Contact campaign schedule (pulled via API)
  • Custom prompt asking for traffic growth recommendations and operational excellence improvements

The orchestrator then generates a structured report with sections for:

  • GA Implementation Status (which pages are tracked)
  • Top Performing Content (highest traffic pages)
  • Traffic Acquisition Channels (organic vs. direct vs. social)
  • Email Campaign Calendar (upcoming blasts and send dates)
  • Recommendations (specific actions to increase traffic and operational quality)

Dashboard Card Generation

The orchestrator output was automatically converted into a kanban card (ID: t-31aa2593) and posted to our progress dashboard at https://progress.queenofsandiego.com. The card uses our hash-nav deep link format:

https://progress.queenofsandiego.com/#card-t-31aa2593

This deep link is constructed from the dashboard's hash routing implementation in /Users/cb/Documents/repos/tools/dashboard/js/router.js, which watches the URL hash and renders the appropriate card component on route change.

Infrastructure and Service Connections

Google Analytics Setup

To enable programmatic API access, the service account was granted permissions in GA Admin Console for the Queen of San Diego property (Property ID in Google Analytics settings). The reauth script at /Users/cb/Documents/repos/tools/reauth_ga.py handles OAuth token refresh:

python reauth_ga.py --property-id [PROPERTY_ID]

This script uses the Google Analytics Data API v1 (not the older Reporting API v4, but both are in the codebase for backward compatibility). OAuth credentials are stored in ~/.google/analytics-credentials.json and refreshed on 24-hour expiry.

Email Campaign Data Source

Constant Contact campaign list is pulled via their API. The script at /Users/cb/Documents/repos/tools/blast_prep.py queries the Constant Contact endpoint for scheduled campaigns and checks approval status. This revealed two urgent items:

  • Mother's Day Blast (scheduled April 29): Campaign ID stored in campaign log at s3://queen-campaign-logs/mother-day-2024.json. Status shows it was queued but never approved. Event is 4 days away.
  • Paul Simon Blast (proof deadline May 12): Template at /Users/cb/Documents/repos/email-templates/paul-simon-blast.html. Proof recipient: cb@queenofsandiego.com.

Contact Data and Campaign Deduplication

Blast scripts reference contact CSVs exported from Constant Contact into /Users/cb/Documents/repos/data/contacts/. The blast_prep.py script checks the S3-backed campaign log (bucket: queen-campaign-logs, key pattern: {campaign_name}.json) to avoid re-sending to contacts already marked as delivered. Dedup is critical for maintaining sender reputation.

Key Decisions

Why Orchestrator Instead of Direct GA Queries

We could have built a simple GA data dump script, but orchestrator integration gives us:

  • Contextual Analysis: The orchestrator can correlate traffic patterns with email send dates, spot if a campaign drove spikes, and make recommendations based on historical context.
  • Natural Language Output: Instead of raw JSON, stakeholders get human-readable findings with actionable next steps.
  • Card-Based Workflow: Findings live on the progress dashboard where the team already works, reducing the chance of insights getting lost in Slack or email.

Why Deep Link Format Matters

The hash-nav format (#card-{id}) allows team members to bookmark, share, or reference audit results without losing context. When the Mother's Day card is mentioned in Slack, someone can click directly to it rather than hunting through the dashboard.

GA API Access as a Blocking Issue

Initially, the orchestrator couldn't pull traffic data because the service account lacked GA Admin Console permissions. Rather than hard-fail, the agent created a "needs-you" card explaining the 3-minute fix (grant service account read access in GA Admin console). This keeps the workflow visible instead of silently failing.

What's Next

  • GA Code Gap Remediation: Add gtag initialization to any flagged pages. Update the dashboard to render a list of pages missing tracking with the exact gtag snippet to copy-paste.
  • Approve Mother's Day Blast: Review card t-[needs-you-id] on the progress dashboard, verify recipient list, approve in Constant Contact console, and monitor send.
  • Paul Simon Proof Turnaround: Use blast_prep.py to generate proof with test recipient cb@queenofsandiego.com, validate subject line and links, send by May 12.
  • Monthly Automation: Schedule orchestrator GA audit as a cron job (e.g., first of month) so the team always has a current dashboard card with traffic trends and recommendations without manual intervention