Injecting Structured Data and Analytics into Event Pages: A Multi-Site JSON-LD Deployment Strategy
This post documents the technical process of identifying missing structured data across 12 event pages, building an automated injection system, and deploying updates to multiple S3 buckets with CloudFront invalidation.
The Problem: Invisible Events in Search Results
Concert pages were being served to browsers without any structured data markup. This meant:
- Google couldn't understand event dates, ticket availability, or venue information
- Rich snippets weren't appearing in search results
- Calendar applications couldn't parse event details
- Search engines treated pages as generic content rather than time-sensitive event listings
The audit found no Event or LocalBusiness JSON-LD blocks on any active concert page in the /Users/cb/Documents/repos/sites/queenofsandiego.com/rady-shell-events/ directory structure.
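An audit like this can be approximated with a short script. The sketch below is not the tool used for the original audit; the function names `json_ld_types` and `audit` are hypothetical, and it assumes each JSON-LD block is a plain object or list with a top-level `@type` (it does not walk `@graph` containers).

```python
import json
import re
from pathlib import Path

# Matches each <script type="application/ld+json"> element and captures its body.
JSON_LD_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)


def json_ld_types(html: str) -> list[str]:
    """Return the @type values of every parseable JSON-LD block in html."""
    types: list[str] = []
    for body in JSON_LD_RE.findall(html):
        try:
            data = json.loads(body)
        except json.JSONDecodeError:
            continue  # a malformed block is treated as missing
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and "@type" in item:
                value = item["@type"]
                types.extend(value if isinstance(value, list) else [value])
    return types


def audit(root: Path, wanted=("Event", "LocalBusiness")) -> dict[str, list[str]]:
    """Map each HTML file under root to the wanted types it lacks."""
    report: dict[str, list[str]] = {}
    for path in sorted(root.rglob("*.html")):
        found = set(json_ld_types(path.read_text(encoding="utf-8")))
        missing = [t for t in wanted if t not in found]
        if missing:
            report[str(path)] = missing
    return report
```

Calling `audit(Path(...))` on the events directory would list every page that lacks one of the two types, which is the kind of result the audit above describes.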
Technical Solution: Automated Structured Data Injection
Script Architecture
Created /Users/cb/Documents/repos/tools/inject_structured_data.py to:
- Parse existing HTML files for event metadata (page title, date references, performer names)
- Generate valid Event schema JSON-LD blocks with required properties
- Insert markup into the document <head>, before the closing tag
- Preserve all existing content and formatting
- Track which pages were modified for deployment verification
The script uses this insertion pattern:
import json
# Find the closing head tag
head_close_index = html_content.find('</head>')
if head_close_index == -1:
    raise ValueError("No closing </head> tag found")
# Generate JSON-LD block
event_schema = {
    "@context": "https://schema.org",
    "@type": "Event",
    "name": extracted_event_name,
    "startDate": parsed_event_date,
    "location": {
        "@type": "Place",
        "name": venue_name,
        "address": venue_address
    }
}
# Insert before closing head tag (Python strings are immutable, so rebuild by slicing)
json_ld_script = f'<script type="application/ld+json">{json.dumps(event_schema)}</script>'
html_content = html_content[:head_close_index] + json_ld_script + html_content[head_close_index:]
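A more defensive variant of the insertion step might look like the following. This is a sketch rather than the script's actual code: the function name `inject_json_ld` is hypothetical, and the two additions (skipping pages that already carry a JSON-LD block, and escaping `<` inside the payload) are assumptions about what a re-runnable injector would want.

```python
import json
import re

# Assumption: any existing JSON-LD block means the page was already processed.
MARKER = "application/ld+json"
HEAD_CLOSE_RE = re.compile(r"</head\s*>", re.IGNORECASE)


def inject_json_ld(html: str, schema: dict) -> tuple[str, bool]:
    """Insert schema as JSON-LD before </head>. Returns (html, changed)."""
    if MARKER in html:
        return html, False  # re-running the script leaves the page untouched
    match = HEAD_CLOSE_RE.search(html)
    if match is None:
        raise ValueError("no closing </head> tag found")
    # Escape "<" so a value containing "</script>" cannot end the element early.
    # "\u003c" is still valid JSON and decodes back to "<".
    payload = json.dumps(schema, ensure_ascii=False).replace("<", "\\u003c")
    block = f'<script type="application/ld+json">{payload}</script>\n'
    idx = match.start()
    return html[:idx] + block + html[idx:], True
```

The `changed` flag gives the caller a simple way to record which files were actually modified.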
Why This Approach
Rather than manually editing each file:
- Scalability: 12 pages updated in one execution; future pages can use the same pipeline
- Consistency: Every page gets the same JSON-LD structure, so markup that passes Google's validator on one page should pass on all of them
- Traceability: Script logs which files were modified, making rollback possible if needed
- Maintainability: Event schema updates (new required fields, format changes) happen in one place
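The maintainability point can be illustrated by keeping schema construction in one function. The sketch below is an assumption about how that might be organized, not the script's actual layout; `build_event_schema` is a hypothetical name, and the timezone check is added here because an ISO 8601 `startDate` with a UTC offset is unambiguous.

```python
from datetime import datetime, timedelta, timezone


def build_event_schema(name: str, start: datetime,
                       venue_name: str, venue_address: str) -> dict:
    """Build the Event JSON-LD dict in one place so format changes reach every page."""
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware so startDate carries a UTC offset")
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "startDate": start.isoformat(),  # ISO 8601 with offset
        "location": {
            "@type": "Place",
            "name": venue_name,
            "address": venue_address,
        },
    }


# Placeholder values for illustration only, not real event data.
example = build_event_schema(
    "Example Concert",
    datetime(2025, 8, 1, 19, 30, tzinfo=timezone(timedelta(hours=-7))),
    "Example Venue",
    "123 Example St",
)
```

Adding a new property later means editing this one function and re-running the pipeline.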
Pages Updated
The script processed event pages across multiple Rady Shell event subdomains. Each subdomain maps to a separate S3 bucket and CloudFront distribution:
- paulsimonradyshell.com → S3 bucket: paulsimonradyshell.com
- steelypanradyshell.com → S3 bucket: steelypanradyshell.com
- Additional event subdomains (8+ total event sites)
Each bucket has corresponding CloudFront distribution IDs for cache invalidation post-deployment.
Deployment Process
Step 1: S3 Sync
For each event subdomain bucket:
aws s3 sync /Users/cb/Documents/repos/sites/sailjada.queenofsandiego.com/ \
s3://paulsimonradyshell.com/ \
--exclude "*" \
--include "*.html" \
--region us-west-2
Only HTML files were synced (excluding assets already in place). This minimizes sync time and prevents unnecessary object version proliferation in S3.
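When the same sync runs against several buckets, the command can be assembled programmatically. The following is a minimal sketch, assuming the AWS CLI is on the PATH; `build_sync_command` is a hypothetical helper, and the optional `--dryrun` flag (which the AWS CLI supports for `s3 sync`) previews the upload without changing the bucket.

```python
import subprocess


def build_sync_command(local_dir: str, bucket: str,
                       region: str = "us-west-2", dry_run: bool = False) -> list[str]:
    """Return the argv for an HTML-only `aws s3 sync` into the given bucket."""
    cmd = [
        "aws", "s3", "sync", local_dir, f"s3://{bucket}/",
        "--exclude", "*",        # drop everything first...
        "--include", "*.html",   # ...then re-include only HTML files
        "--region", region,
    ]
    if dry_run:
        cmd.append("--dryrun")
    return cmd


def sync(local_dir: str, bucket: str, dry_run: bool = False) -> None:
    """Run the sync and raise CalledProcessError if the CLI exits non-zero."""
    subprocess.run(build_sync_command(local_dir, bucket, dry_run=dry_run), check=True)
```

Running with `dry_run=True` first is a cheap way to confirm that only the expected HTML files would be uploaded.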
Step 2: CloudFront Cache Invalidation
After S3 sync completes, invalidate edge locations to serve fresh content:
aws cloudfront create-invalidation \
--distribution-id [PAULSIMON_DIST_ID] \
--paths "/*"
The /* path pattern invalidates all objects, ensuring no stale cached versions remain. While this is broader than invalidating specific pages, it's appropriate here because:
- Event page updates are infrequent (not continuous traffic-heavy operations)
- Ensures Google's crawler receives the updated structured data on its next fetch instead of a stale edge-cached copy
- Event schema changes are typically coordinated deployments, not incremental updates
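If a narrower invalidation is ever preferred, the same request can be made through boto3 with an explicit path list. This is a sketch under assumptions: boto3 is installed and credentials are configured, and the helper names are hypothetical. The dictionary shape is the `InvalidationBatch` structure that CloudFront's `create_invalidation` call expects.

```python
import time
from typing import Optional


def build_invalidation_batch(paths: list[str],
                             caller_reference: Optional[str] = None) -> dict:
    """Build the InvalidationBatch payload for CloudFront create_invalidation."""
    if not paths:
        raise ValueError("at least one path is required")
    for path in paths:
        if not path.startswith("/"):
            raise ValueError(f"invalidation paths must start with '/': {path!r}")
    # CallerReference must be unique per request; a timestamp is one simple choice.
    reference = caller_reference or f"deploy-{time.time_ns()}"
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        "CallerReference": reference,
    }


def invalidate(distribution_id: str, paths: list[str]) -> dict:
    """Submit the invalidation. boto3 is imported lazily so the builder has no dependency."""
    import boto3  # assumption: boto3 is available in the deployment environment

    client = boto3.client("cloudfront")
    return client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )
```

Passing `["/*"]` reproduces the wildcard behavior of the CLI command above, while a list such as `["/index.html"]` limits the invalidation to specific pages.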
Additional Work: Analytics Injection
While auditing pages, discovered multiple top-level and event pages missing Google Analytics tags. Updated:
- /Users/cb/Documents/repos/sites/queenofsandiego.com/ (QOS main pages)
- /Users/cb/Documents/repos/sites/quickdumpnow.com/ (QDN service area pages)
- /Users/cb/Documents/repos/sites/sailjada.com/ (Ranch & Coast redirect page)
Modified template generators to inject GA tags at build time:
- render_event_sites.py (Rady Shell event page builder)
- generate_service_area_pages.py (QuickDumpNow service area template renderer)
This ensures all future generated pages include analytics without manual intervention.
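Build-time injection of the analytics tag could follow the same pattern as the JSON-LD insertion. The sketch below assumes the standard Google tag (gtag.js) snippet is what the generators emit, which the post does not state; `ga_snippet` and `inject_ga` are hypothetical names, and `G-XXXXXXXXXX` is a placeholder measurement ID.

```python
import re

HEAD_CLOSE_RE = re.compile(r"</head\s*>", re.IGNORECASE)


def ga_snippet(measurement_id: str) -> str:
    """Return the standard gtag.js snippet for the given measurement ID."""
    return (
        f'<script async src="https://www.googletagmanager.com/gtag/js?id={measurement_id}"></script>\n'
        "<script>\n"
        "window.dataLayer = window.dataLayer || [];\n"
        "function gtag(){dataLayer.push(arguments);}\n"
        "gtag('js', new Date());\n"
        f"gtag('config', '{measurement_id}');\n"
        "</script>\n"
    )


def inject_ga(html: str, measurement_id: str) -> str:
    """Insert the GA snippet before </head> unless a gtag loader is already present."""
    if "googletagmanager.com/gtag/js" in html:
        return html  # already tagged; avoid double-counting page views
    match = HEAD_CLOSE_RE.search(html)
    if match is None:
        raise ValueError("no closing </head> tag found")
    idx = match.start()
    return html[:idx] + ga_snippet(measurement_id) + html[idx:]
```

A template generator would call `inject_ga` on each rendered page just before writing it to disk.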
Infrastructure Patterns
Multi-Bucket Architecture
Each event subdomain has:
- Dedicated S3 bucket (no shared hosting)
- Dedicated CloudFront distribution (separate cache, separate invalidation control)
- Separate Route53 DNS records pointing to the distribution's domain name
This pattern provides:
- Isolation: One event's traffic spike doesn't affect another's CloudFront cache hit ratio
- Cost tracking: S3 bandwidth and CloudFront usage attributable to specific events
- Cache strategy flexibility: Different TTLs for different events based on update frequency
Script-Based Deployment
Rather than manual file uploads, using AWS CLI with bash orchestration enables:
- Audit trails (CLI commands logged in shell history or scripts)
- Parallel deployments across multiple buckets
- Automated invalidation chaining (sync completion triggers invalidation)
- Easy rollback (revert HTML files locally, redeploy)
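The chaining and parallelism described above can be sketched in Python as an alternative to bash orchestration. Everything here is illustrative: the `EventSite` record, the function names, and the directory, bucket, and distribution values used with it are hypothetical, and the command runner is injectable so the flow can be exercised without touching AWS.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EventSite:
    local_dir: str
    bucket: str
    distribution_id: str
    region: str = "us-west-2"


def run_checked(cmd: Sequence[str]) -> None:
    """Run a CLI command and raise if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def deploy_site(site: EventSite, run: Callable[[Sequence[str]], None]) -> str:
    """Sync HTML, then invalidate. A failed sync raises, so invalidation is skipped."""
    run(["aws", "s3", "sync", site.local_dir, f"s3://{site.bucket}/",
         "--exclude", "*", "--include", "*.html", "--region", site.region])
    run(["aws", "cloudfront", "create-invalidation",
         "--distribution-id", site.distribution_id, "--paths", "/*"])
    return site.bucket


def deploy_all(sites: Sequence[EventSite],
               run: Callable[[Sequence[str]], None] = run_checked,
               max_workers: int = 4) -> list[str]:
    """Deploy sites in parallel; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda site: deploy_site(site, run), sites))
```

Because each site's sync and invalidation run sequentially inside one worker, the ordering guarantee holds per bucket even while different buckets deploy concurrently.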
Key Decisions
Decision: Insert JSON-LD in <head> vs. near closing <body> tag
Placed structured data in document head because: