```html

Auditing and Fixing Google Analytics Cross-Domain Tracking at Scale: A Multi-Property Reconciliation

Over the past development session, we conducted a comprehensive audit of Google Analytics implementations across queenofsandiego.com and sailjada.com — two closely related properties serving a shared booking funnel. This post documents the audit methodology, the issues discovered, and the infrastructure-level fixes applied.

The Audit Challenge: Two Domains, One GA4 Property

Both sites share a single GA4 property (G-N6HKL4KLKT), which is architecturally correct for cross-domain conversion tracking. However, the implementation had drifted over time. Our audit discovered:

  • ~60 pages on queenofsandiego.com had Google Analytics tags but were missing the cross-domain linker configuration
  • 3 pages entirely missing GA: charters/index.html, charters/nerissa/index.html, and thank-you/index.html
  • sailjada.com was clean — all 22 public pages had both GA and the linker
  • Event tracking was partially implemented: conversion events (purchase, begin_checkout, generate_lead, cta_click) were tracked, but view_item events on charter detail pages and video engagement on heroes were missing

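The cross-domain linker gap was the critical issue. When a user clicked from queenofsandiego.com to sailjada.com without the linker, no _gl parameter was appended to the link, so sailjada.com could not pick up the visitor's existing client ID and started a new session. Conversions were then attributed to "direct" traffic on sailjada.com instead of the source that originally brought the visitor to queenofsandiego.com.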

Audit Methodology and Infrastructure Reconnaissance

The audit was conducted across three layers: tag presence, tag configuration, and deployment infrastructure.

Layer 1: Tag Presence Scanning

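We used grep-based searches across the S3 bucket hosting queenofsandiego.com's static assets, working from a local copy of the files:

aws s3 ls s3://queenofsandiego-com-www/ --recursive | grep -E "\.html$" | wc -l

This counted the HTML files in the bucket. We then checked each downloaded file for the GA measurement ID, searching recursively so that nested pages such as charters/index.html were included:

grep -rl --include="*.html" "G-N6HKL4KLKT" /tmp/qos_files/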

Files were catalogued as: GA present / GA missing / GA present but linker missing.
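
The three-way catalogue can be sketched in Python as follows. This is a minimal illustration, not the audit's actual tooling: the function names classify and catalogue are hypothetical, and the linker check is a simple text heuristic that assumes the linker is set inline in the page rather than in an external script.

```python
import re
from pathlib import Path

GA_ID = "G-N6HKL4KLKT"  # measurement ID from this post

# Heuristic: a `linker` key (quoted or bare) followed by a colon.
LINKER_RE = re.compile(r"""['"]?linker['"]?\s*:""")


def classify(html: str) -> str:
    """Return 'ga_missing', 'linker_missing', or 'ok' for one page."""
    if GA_ID not in html:
        return "ga_missing"
    if not LINKER_RE.search(html):
        return "linker_missing"
    return "ok"


def catalogue(root: str) -> dict:
    """Walk a local copy of the site and group relative paths by status."""
    buckets = {"ga_missing": [], "linker_missing": [], "ok": []}
    for path in sorted(Path(root).rglob("*.html")):
        html = path.read_text(encoding="utf-8", errors="replace")
        buckets[classify(html)].append(str(path.relative_to(root)))
    return buckets
```

Calling catalogue("/tmp/qos_files") on a synced copy of the bucket would return the three lists used for manual review.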

Layer 2: Configuration Validation

For files with GA tags, we extracted the exact gtag('config') call patterns to check linker setup:

grep -r --include="*.html" -A 5 "gtag('config'" /tmp/qos_files/ | grep -i "linker\|cross"

The correct pattern includes the cross-domain linker object:

gtag('config', 'G-N6HKL4KLKT', {
  'linker': {
    'domains': ['queenofsandiego.com', 'sailjada.com']
  }
});

Pages without this configuration were flagged as broken.
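
Beyond checking that a linker exists, it can help to confirm that it lists every required domain. The sketch below does this with regular expressions. It is a heuristic under stated assumptions, not a JavaScript parser: the helper names linker_domains and missing_domains are hypothetical, and the code assumes the domains array is written inline as quoted string literals.

```python
import re

# Capture the contents of the first `domains: [ ... ]` array in the page.
DOMAINS_RE = re.compile(r"""['"]?domains['"]?\s*:\s*\[(.*?)\]""", re.DOTALL)
# Pull each quoted string literal out of the captured array body.
QUOTED_RE = re.compile(r"""['"]([^'"]+)['"]""")


def linker_domains(html: str) -> list:
    """Return the domains listed in the page's linker config, or []."""
    match = DOMAINS_RE.search(html)
    if match is None:
        return []
    return QUOTED_RE.findall(match.group(1))


def missing_domains(html, required=("queenofsandiego.com", "sailjada.com")):
    """Return the required domains that the linker config does not list."""
    configured = linker_domains(html)
    return [d for d in required if d not in configured]
```

A page whose missing_domains result is non-empty would be flagged even though a linker key is present.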

Layer 3: Deployment Infrastructure Mapping

We mapped the full CDN and DNS infrastructure:

  • S3 buckets: queenofsandiego-com-www, queenofsandiego-com-root, sailjada.com equivalents
  • CloudFront distributions: Identified via list-distributions and cross-referenced by alias. The primary www distribution for QOS has ID E2MLBZVR48STAG
  • Route53 records: Verified CNAME / A records pointed to correct CF distributions
  • Subdomains: Rady Shell event pages deployed to rady.queenofsandiego.com and charters.queenofsandiego.com, each with dedicated CF distributions

This mapping was critical because fixes needed to be applied across multiple S3 buckets and verified through cache invalidation.
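
The cross-reference by alias can be sketched as a small pure function over the JSON printed by aws cloudfront list-distributions. The function name alias_to_distribution is hypothetical, and the code assumes the standard response shape (DistributionList.Items, each with Id and Aliases.Items).

```python
import json


def alias_to_distribution(list_distributions_json: str) -> dict:
    """Map each CloudFront alias (CNAME) to its distribution ID."""
    data = json.loads(list_distributions_json)
    # `Items` is absent when the account has no distributions.
    items = data.get("DistributionList", {}).get("Items") or []
    mapping = {}
    for dist in items:
        # `Aliases.Items` is absent when a distribution has no aliases.
        for alias in dist.get("Aliases", {}).get("Items") or []:
            mapping[alias] = dist["Id"]
    return mapping
```

Looking up a hostname in the returned dictionary gives the distribution that needs invalidating after a fix to the corresponding bucket.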

The Fix: Systematic GA Linker Injection

We wrote a Python script (/tmp/fix_ga.py) to:

  1. List all HTML files in the target S3 bucket
  2. Download each file
  3. Check for GA tag presence
  4. If GA is present but linker is missing, inject the linker config
  5. If GA is missing entirely, inject both the full snippet and linker
  6. Upload the corrected file back to S3

The script used a two-pass approach for safety: first listing all files needing changes (for manual review), then applying fixes with rollback capabilities.

# Pseudocode structure
for html_file in s3_bucket_files:
  html = download(html_file)
  if not contains_ga(html):
    html = inject_full_ga_snippet_with_linker(html)
  elif not contains_linker(html):
    html = inject_linker_into_gtag_config(html)
  else:
    continue  # already correct: skip the upload
  upload_to_s3_with_metadata_preservation(html_file, html)
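
The two injection branches of the loop above might look like the following. This is a sketch under assumptions, not the contents of /tmp/fix_ga.py: fix_page and its helpers are hypothetical names, the existing config call is assumed to be a bare gtag('config', 'G-N6HKL4KLKT') with no options object, and pages without a closing head tag are left untouched for manual review.

```python
import re

GA_ID = "G-N6HKL4KLKT"
LINKER_DOMAINS = ["queenofsandiego.com", "sailjada.com"]

# Matches a bare config call with no third argument, in either quote style.
BARE_CONFIG_RE = re.compile(
    r"""gtag\(\s*(['"])config\1\s*,\s*(['"])""" + re.escape(GA_ID) + r"""\2\s*\)"""
)


def linker_config_call() -> str:
    """Build the config call that includes the cross-domain linker."""
    domains = ", ".join(f"'{d}'" for d in LINKER_DOMAINS)
    return f"gtag('config', '{GA_ID}', {{'linker': {{'domains': [{domains}]}}}})"


def full_snippet() -> str:
    """Standard gtag.js loader plus the linker-enabled config call."""
    return (
        f'<script async src="https://www.googletagmanager.com/gtag/js?id={GA_ID}"></script>\n'
        "<script>\n"
        "  window.dataLayer = window.dataLayer || [];\n"
        "  function gtag(){dataLayer.push(arguments);}\n"
        "  gtag('js', new Date());\n"
        f"  {linker_config_call()};\n"
        "</script>\n"
    )


def fix_page(html: str):
    """Return (new_html, changed) for one page."""
    if GA_ID not in html:
        idx = html.lower().find("</head>")
        if idx == -1:
            return html, False  # no head element: leave for manual review
        return html[:idx] + full_snippet() + html[idx:], True
    if "linker" in html:
        return html, False  # already configured
    new_html, count = BARE_CONFIG_RE.subn(lambda m: linker_config_call(), html)
    return new_html, count > 0
```

Returning a changed flag lets the first pass list affected files without writing anything, and makes a second run over already-fixed pages a no-op.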

Key implementation details:

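  • Metadata preservation: uploading a modified local copy replaces the object's headers, and aws s3 cp --metadata-directive COPY only takes effect on S3-to-S3 copies, so the original Content-Type, Cache-Control, and other headers have to be read from the existing object and passed explicitly on upload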
  • Atomic updates: Each file was downloaded, modified in-memory, then uploaded — no intermediate temporary files left in S3
  • Linker scope: The linker was configured for both apex and www domains: 'domains': ['queenofsandiego.com', 'www.queenofsandiego.com', 'sailjada.com', 'www.sailjada.com']
  • Event tracking additions: While fixing, we added view_item event tracking to charters/*/index.html pages (charter detail pages)
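
One way to carry headers across a re-upload is to read them from the existing object and pass them back explicitly. The sketch below builds the keyword arguments for a boto3 put_object call from a head_object response. The function name put_kwargs_from_head is hypothetical, and the choice of which headers to preserve is an assumption.

```python
# Response fields from head_object that put_object accepts under the same name.
PRESERVED_HEADERS = (
    "ContentType",
    "CacheControl",
    "ContentEncoding",
    "ContentDisposition",
    "ContentLanguage",
)


def put_kwargs_from_head(head: dict, bucket: str, key: str, body: bytes) -> dict:
    """Build put_object kwargs that keep the original object's headers."""
    kwargs = {"Bucket": bucket, "Key": key, "Body": body}
    for name in PRESERVED_HEADERS:
        if head.get(name):
            kwargs[name] = head[name]
    if head.get("Metadata"):
        # User-defined x-amz-meta-* values.
        kwargs["Metadata"] = dict(head["Metadata"])
    return kwargs


# Intended use with boto3 (not executed here):
#   s3 = boto3.client("s3")
#   head = s3.head_object(Bucket=bucket, Key=key)
#   s3.put_object(**put_kwargs_from_head(head, bucket, key, new_html_bytes))
```

Keeping the header logic in a pure function makes it easy to check without touching S3.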

CloudFront Invalidation and Cache Busting

After uploading fixes to S3, we invalidated CloudFront caches to ensure users received the corrected files immediately:

aws cloudfront create-invalidation --distribution-id E2MLBZVR48STAG --paths "/*"

We repeated this for each affected distribution:

  • Main www distribution: E2MLBZVR48STAG
  • Rady Shell subdomain distribution
  • Charters subdomain distribution
  • sailjada.com main distribution

Invalidation took ~30 seconds per distribution to propagate globally.
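
When the same step is scripted rather than run from the CLI, the invalidation request body can be built as below. This is a sketch: invalidation_batch and paths_for_keys are hypothetical helpers, and the default caller reference format is an assumption. The returned dictionary matches the InvalidationBatch argument of boto3's create_invalidation.

```python
import time


def paths_for_keys(keys):
    """Turn S3 object keys into CloudFront invalidation paths."""
    return ["/" + key.lstrip("/") for key in keys]


def invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch structure for a list of paths."""
    unique = sorted(set(paths))
    if not unique:
        raise ValueError("at least one path is required")
    for path in unique:
        if not path.startswith("/"):
            raise ValueError(f"invalidation path must start with '/': {path!r}")
    return {
        "Paths": {"Quantity": len(unique), "Items": unique},
        # Must be unique per request; a timestamp is one simple choice.
        "CallerReference": caller_reference or f"ga-fix-{int(time.time())}",
    }


# Intended use with boto3 (not executed here):
#   cf = boto3.client("cloudfront")
#   cf.create_invalidation(
#       DistributionId="E2MLBZVR48STAG",
#       InvalidationBatch=invalidation_batch(["/*"]),
#   )
```

Passing only the changed keys through paths_for_keys is an alternative to the "/*" wildcard when a fix touches a small number of pages.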

Verification and Reporting

Post-fix verification included:

  • Spot-checking 10 random pages from each category (detail pages, landing pages, checkout pages)
  • Verifying that the injected GA snippet was well-formed HTML (every script tag closed, and no markup broken or exposed to injection by the inserted text)
  • Testing the linker in a browser debugger by inspecting _gl query parameters on cross-domain navigation
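
The _gl check in the last item can also be applied to a landing URL copied from the browser's address bar. A minimal sketch, with linker_param as a hypothetical helper name:

```python
from urllib.parse import parse_qs, urlsplit


def linker_param(url: str):
    """Return the value of the _gl query parameter, or None if absent."""
    values = parse_qs(urlsplit(url).query).get("_gl")
    return values[0] if values else None
```

A None result on a URL reached by clicking through from the other domain suggests the linker did not decorate that link.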