Auditing and Fixing Google Analytics Cross-Domain Tracking at Scale: A Multi-Property Reconciliation
Over the past development session, we conducted a comprehensive audit of Google Analytics implementations across queenofsandiego.com and sailjada.com — two closely related properties serving a shared booking funnel. This post documents the audit methodology, the issues discovered, and the infrastructure-level fixes applied.
The Audit Challenge: Two Domains, One GA4 Property
Both sites share a single GA4 property (G-N6HKL4KLKT), which is architecturally correct for cross-domain conversion tracking. However, the implementation had drifted over time. Our audit discovered:
- ~60 pages on queenofsandiego.com had Google Analytics tags but were missing the cross-domain linker configuration
- 3 pages entirely missing GA:
charters/index.html,charters/nerissa/index.html, andthank-you/index.html - sailjada.com was clean — all 22 public pages had both GA and the linker
- Event tracking was partially implemented: conversion events (purchase, begin_checkout, generate_lead, cta_click) were tracked, but view_item events on charter detail pages and video engagement on heroes were missing
The cross-domain linker gap was the critical issue. When a user clicked from queenofsandiego.com → sailjada.com without the linker, the session broke, causing conversion attribution to shift to "direct" traffic on sailjada.com instead of properly crediting the referring domain.
Audit Methodology and Infrastructure Reconnaissance
The audit was conducted across three layers: tag presence, tag configuration, and deployment infrastructure.
Layer 1: Tag Presence Scanning
We used grep-based searches across the S3 bucket hosting queenofsandiego.com's static assets:
aws s3 ls s3://queenofsandiego-com-www/ --recursive | grep -E "\.html$" | wc -l
This identified all HTML files. We then checked each for the GA tracking snippet pattern:
grep -l "G-N6HKL4KLKT" /tmp/qos_files/*.html
Files were catalogued as: GA present / GA missing / GA present but linker missing.
Layer 2: Configuration Validation
For files with GA tags, we extracted the exact gtag('config') call patterns to check linker setup:
grep -A 5 "gtag('config'" /tmp/qos_files/*.html | grep -i "linker\|cross"
The correct pattern includes the cross-domain linker object:
gtag('config', 'G-N6HKL4KLKT', {
'linker': {
'domains': ['queenofsandiego.com', 'sailjada.com']
}
});
Pages without this configuration were flagged as broken.
Layer 3: Deployment Infrastructure Mapping
We mapped the full CDN and DNS infrastructure:
- S3 buckets:
queenofsandiego-com-www,queenofsandiego-com-root, sailjada.com equivalents - CloudFront distributions: Identified via list-distributions and cross-referenced by alias. The primary www distribution for QOS has ID
E2MLBZVR48STAG - Route53 records: Verified CNAME / A records pointed to correct CF distributions
- Subdomains: Rady Shell event pages deployed to
rady.queenofsandiego.comandcharters.queenofsandiego.com, each with dedicated CF distributions
This mapping was critical because fixes needed to be applied across multiple S3 buckets and verified through cache invalidation.
The Fix: Systematic GA Linker Injection
We wrote a Python script (/tmp/fix_ga.py) to:
- List all HTML files in the target S3 bucket
- Download each file
- Check for GA tag presence
- If GA is present but linker is missing, inject the linker config
- If GA is missing entirely, inject both the full snippet and linker
- Upload the corrected file back to S3
The script used a two-pass approach for safety: first listing all files needing changes (for manual review), then applying fixes with rollback capabilities.
# Pseudocode structure
for html_file in s3_bucket_files:
if not contains_ga():
inject_full_ga_snippet_with_linker()
elif contains_ga() and not contains_linker():
inject_linker_into_gtag_config()
upload_to_s3_with_metadata_preservation()
Key implementation details:
- Metadata preservation: We used
aws s3 cp --metadata-directive COPYto retain original Content-Type, Cache-Control, and other headers - Atomic updates: Each file was downloaded, modified in-memory, then uploaded — no intermediate temporary files left in S3
- Linker scope: The linker was configured for both apex and www domains:
'domains': ['queenofsandiego.com', 'www.queenofsandiego.com', 'sailjada.com', 'www.sailjada.com'] - Event tracking additions: While fixing, we added view_item event tracking to
charters/*/index.htmlpages (charter detail pages)
CloudFront Invalidation and Cache Busting
After uploading fixes to S3, we invalidated CloudFront caches to ensure users received the corrected files immediately:
aws cloudfront create-invalidation --distribution-id E2MLBZVR48STAG --paths "/*"
We repeated this for each affected distribution:
- Main www distribution:
E2MLBZVR48STAG - Rady Shell subdomain distribution
- Charters subdomain distribution
- sailjada.com main distribution
Invalidation took ~30 seconds per distribution to propagate globally.
Verification and Reporting
Post-fix verification included:
- Spot-checking 10 random pages from each category (detail pages, landing pages, checkout pages)
- Verifying the GA snippet was properly escaped in HTML (checking for script tag closure, no injection vulnerabilities)
- Testing the linker in a browser debugger by inspecting
_glquery parameters on cross-domain navigation