Injecting Structured Data Into Concert Event Pages: Automating JSON-LD Deployment Across Multi-Subdomain Infrastructure
When you're managing 12+ event subdomains across multiple S3 buckets and CloudFront distributions, manual structured data insertion becomes untenable. This post walks through how we automated Event and LocalBusiness schema injection into concert pages, how we deployed the changes globally, and why this infrastructure pattern matters for SEO and discoverability.
The Problem: Zero Structured Data on 12 Concert Pages
An audit revealed that despite beautifully formatted concert event pages across the Rady Shell Events subdomain network, none of them included JSON-LD structured data. Without it, search engines have nothing to drive rich snippet generation, leaving SEO performance on the table.
- 12 active concert pages with no Event or LocalBusiness schema
- Pages distributed across 6+ event subdomains (each with separate S3 buckets)
- Manual editing would require touching each file individually
- CloudFront cache invalidation needed for all changes to propagate
The solution required three components: a Python script to inject schema, infrastructure discovery to map buckets/distributions, and a batch deployment strategy.
Architecture: Script-Driven Infrastructure Changes
File Structure and Injection Points
The Rady Shell Events pages live in this structure:
```
/Users/cb/Documents/repos/sites/queenofsandiego.com/rady-shell-events/
├── tools/
│   └── render_event_sites.py
├── events/
│   ├── [event-name]/
│   │   └── index.html
│   └── [more events]/
└── README
```
Each event subdomain (like paulsimonradyshell.queenofsandiego.com) has its own S3 bucket and CloudFront distribution. The challenge: identifying which pages needed injection, then coordinating deployment across all buckets without collision.
Structured Data Injection Script
We created /Users/cb/Documents/repos/tools/inject_structured_data.py to handle the injection logic. The script:
- Scans all event HTML files for existing schema tags
- Generates Event and LocalBusiness JSON-LD blocks based on page metadata (title, date, venue)
- Injects the schema into the `<head>` section, immediately before the closing tag
- Logs which files were modified for audit purposes
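As a rough sketch of the generation step, here is how an Event JSON-LD block might be assembled from page metadata. The function name and field set are illustrative assumptions, not the actual code in `inject_structured_data.py`:

```python
import json


def build_event_schema(name: str, start_date: str, venue: str, url: str) -> str:
    """Assemble an Event JSON-LD <script> block from page metadata.

    Field values are illustrative; the real script derives them from each
    page's title, date, and venue markup.
    """
    schema = {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "startDate": start_date,  # ISO 8601, e.g. "2025-06-14T19:30:00-07:00"
        "location": {
            "@type": "Place",
            "name": venue,
            "address": {
                "@type": "PostalAddress",
                "addressLocality": "San Diego",
                "addressRegion": "CA",
            },
        },
        "url": url,
    }
    return f'<script type="application/ld+json">{json.dumps(schema, indent=2)}</script>'
```

A LocalBusiness block follows the same pattern with a different `@type` and its own required fields.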
The script identifies injection points by looking for the first closing </head> tag and inserts the JSON-LD block immediately before it. This ensures the schema is parsed before page rendering without interfering with existing meta tags or link elements.
```python
# Example structured data injection pattern:
# Scan for: </head>
# Insert before it: <script type="application/ld+json">{...}</script>
```
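That pattern can be sketched as a small, idempotent function. This is a minimal illustration, not the actual `inject_structured_data.py` implementation:

```python
def inject_jsonld(html: str, jsonld_block: str) -> str:
    """Insert a JSON-LD <script> block immediately before the first </head>.

    Idempotent: if the head already contains a JSON-LD block, the document
    is returned unchanged, so re-running the tool is safe.
    """
    head_close = html.find("</head>")
    if head_close == -1:
        raise ValueError("no </head> tag found; skipping file")
    if "application/ld+json" in html[:head_close]:
        return html  # schema already present in the head
    return html[:head_close] + jsonld_block + "\n" + html[head_close:]
```

Because the insertion point is located by string search rather than a full HTML parse, existing meta tags and link elements are left byte-for-byte untouched.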
Infrastructure: Multi-Bucket Deployment Strategy
S3 Bucket Inventory
Each event subdomain has a dedicated S3 bucket. For example:
- `paulsimonradyshell-queenofsandiego-com` — Paul Simon event pages
- `sailjada-queenofsandiego-com` — General JADA events
- Plus 4 additional event-specific buckets
We mapped these buckets by querying the Route53 records for each subdomain and cross-referencing with the CloudFront distribution origins. This is critical: deploying to the wrong bucket breaks the site.
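One way to build that cross-reference with boto3 is to walk the account's CloudFront distributions and map each CNAME alias to its distribution ID and origin domains. This is a sketch under the assumption that each subdomain is an alias on exactly one distribution; the function names are our own:

```python
def origin_map(dist_summary: dict) -> dict:
    """Pure helper: map each alias of one CloudFront distribution summary
    to its distribution ID and S3 origin domains (testable without AWS)."""
    origins = [o["DomainName"] for o in dist_summary["Origins"]["Items"]]
    return {
        alias: {"distribution_id": dist_summary["Id"], "origins": origins}
        for alias in dist_summary["Aliases"].get("Items", [])
    }


def map_subdomains_to_buckets() -> dict:
    """List every distribution in the account and build the alias -> origin map."""
    import boto3  # lazy import keeps origin_map usable without the AWS SDK

    cf = boto3.client("cloudfront")
    mapping = {}
    for page in cf.get_paginator("list_distributions").paginate():
        for dist in page["DistributionList"].get("Items", []):
            mapping.update(origin_map(dist))
    return mapping
```

Checking each alias against the Route53 record it resolves to is a worthwhile second verification step before any upload.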
CloudFront Cache Invalidation Strategy
After uploading updated HTML files to each S3 bucket, we needed to invalidate CloudFront caches. Each subdomain has its own CloudFront distribution ID:
- `paulsimonradyshell.queenofsandiego.com` — Distribution ID: `E[XXXX]`
- `sailjada.queenofsandiego.com` — Distribution ID: `E[YYYY]`
- Additional distributions for other event subdomains
Rather than invalidate everything (/*), we invalidated only the modified pages:
```shell
aws cloudfront create-invalidation \
  --distribution-id E[DIST_ID] \
  --paths "/paulsimonradyshell/index.html" "/events/event-name/index.html"
```
This approach is faster and reduces CloudFront API costs. Full invalidations can take 10-15 minutes; targeted invalidations typically complete in 2-5 minutes.
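The same targeted invalidation can be scripted per distribution with boto3. The batch-building helper is pure so it can be checked offline; the function names here are illustrative, not part of the deployed tooling:

```python
import time


def invalidation_batch(paths: list) -> dict:
    """Build the CreateInvalidation request body for a targeted (non-/*) purge."""
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        # CallerReference must be unique per request; a timestamp is a simple choice
        "CallerReference": f"schema-inject-{int(time.time())}",
    }


def invalidate_paths(distribution_id: str, paths: list) -> str:
    """Submit a targeted invalidation and return its ID for status polling."""
    import boto3  # lazy import so invalidation_batch stays testable offline

    cf = boto3.client("cloudfront")
    resp = cf.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=invalidation_batch(paths),
    )
    return resp["Invalidation"]["Id"]
```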
Key Decisions and Trade-offs
Why Automate with a Python Script Instead of Hardcoding?
A one-off manual process would work once. But this event portfolio grows: new concerts get added monthly. A reusable Python script in /repos/tools/ ensures future events automatically get schema injection without engineer intervention. It's the same reason we maintain render_event_sites.py and generate_service_area_pages.py — tooling compounds over time.
JSON-LD vs. Microdata vs. RDFa
We chose JSON-LD because:
- No HTML markup pollution — schema lives entirely in a `<script>` block
- Google recommends it for Event schema
- Easier to parse and validate than embedded microdata
- Doesn't interfere with CSS or JavaScript that might depend on HTML structure
Injection Into Head vs. Body
We inject into the `<head>` section so the schema is available to parsers as early as possible, and because it's semantically appropriate: structured data describing the page's content belongs with the rest of its metadata, not interspersed with content elements.
Deployment Execution
The actual deployment followed this sequence:
1. Audit Phase: Checked all 12 event pages; confirmed zero existing schema
2. Script Execution: Ran inject_structured_data.py against the event directory
3. Validation: Spot-checked 3 files to verify schema syntax and head tag injection
4. Upload Phase: Synced modified files to each S3 bucket using the AWS CLI
5. Cache Invalidation: Created CloudFront invalidations for each distribution
6. Verification: Checked live URLs with Google's Structured Data Testing Tool
Total affected files: 12 concert pages. Total deployment time: ~20 minutes (including invalidation wait times).
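The upload and invalidation phases above can be tied together per subdomain. This is a hypothetical driver (bucket names and helper names are our own, and the real deployment used the AWS CLI rather than boto3 for the sync step):

```python
import time
from pathlib import Path


def changed_keys(root: str, filenames: list):
    """Pure helper: turn modified file paths into S3 keys and CloudFront paths."""
    keys = [str(Path(f).relative_to(root)).replace("\\", "/") for f in filenames]
    return keys, ["/" + k for k in keys]


def deploy(root: str, bucket: str, distribution_id: str, filenames: list) -> None:
    """Upload modified HTML files to one bucket, then invalidate only those paths."""
    import boto3  # lazy import keeps changed_keys testable offline

    s3 = boto3.client("s3")
    keys, cf_paths = changed_keys(root, filenames)
    for fname, key in zip(filenames, keys):
        s3.put_object(Bucket=bucket, Key=key,
                      Body=Path(fname).read_bytes(), ContentType="text/html")
    boto3.client("cloudfront").create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch={
            "Paths": {"Quantity": len(cf_paths), "Items": cf_paths},
            "CallerReference": f"deploy-{bucket}-{int(time.time())}",
        },
    )
```

Setting `ContentType="text/html"` explicitly matters: `put_object` does not infer MIME types, and a wrong content type causes browsers to download pages instead of rendering them.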
What's Next
This pattern now extends to other site families. We've identified similar schema-injection opportunities on:
- Main QOS pages (`queenofsandiego.com`)
- Quick Dump Now service area pages (generated by `generate_service_area_pages.py`)