
Injecting Structured Data into Concert Event Pages: A Multi-Site JSON-LD Deployment Strategy

Over the past development session, we identified a critical gap in our event marketing infrastructure: 12 concert pages across multiple subdomains were serving rich, well-designed content to users and search engines alike, but without any structured data markup. This meant Google couldn't reliably extract event details, pricing, dates, or venue information for Knowledge Graph enrichment or rich snippet display. This post details how we solved this systematically across a distributed multi-site architecture.

The Problem: Invisible Event Data

Our event ecosystem spans three sites, each an S3 bucket served through its own CloudFront distribution:

  • paulsimonradyshell.com (6 event pages)
  • sailjada.queenofsandiego.com (4 event pages)
  • dangerouscentaur.com (2 event pages)

Each site is independently deployed, cached via CloudFront, and served from a dedicated S3 bucket. The pages themselves contained all the information Google needed (event title, date, time, venue, ticket URL), but it was buried in presentational HTML with no machine-readable format. Search engines had to infer the structure, so Knowledge Graph panels didn't appear and the pages weren't recognized as events.

Solution Architecture: Programmatic JSON-LD Injection

Rather than manually editing 12 HTML files, we built a reusable Python script: /Users/cb/Documents/repos/tools/inject_structured_data.py

This script:

  • Parses event HTML files and extracts key fields (title, date, time, venue, price, URL) using regex and DOM traversal
  • Generates valid JSON-LD using two primary schema types:
    • Event schema with startDate, endDate, location, offers, and url properties
    • LocalBusiness schema for venue context (address, phone, rating)
  • Injects the script tag into the document <head>, ahead of any stylesheets, so the markup sits at a predictable place in the initial HTML
  • Leaves the rest of the HTML untouched and is idempotent: re-running it does not add a duplicate block
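
The injection step can be sketched as follows. This is a minimal illustration, not the contents of inject_structured_data.py: the function name inject_json_ld and the data-injected marker attribute are assumptions introduced here to make re-runs detectable.

```python
import json
import re

# Hypothetical marker attribute used to recognise a block from an earlier run.
MARKER = 'data-injected="event-jsonld"'


def inject_json_ld(html: str, data: dict) -> str:
    """Return html with exactly one marked JSON-LD block right after <head>."""
    # Escape "</" so the payload can never close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    tag = f'<script type="application/ld+json" {MARKER}>\n{payload}\n</script>'

    # Idempotency: remove a block left by a previous run before inserting.
    previous = re.compile(
        r'\s*<script type="application/ld\+json" '
        + re.escape(MARKER)
        + r">.*?</script>",
        re.DOTALL,
    )
    html = previous.sub("", html)

    # \b keeps the pattern from matching <header>.
    head_open = re.search(r"<head\b[^>]*>", html, re.IGNORECASE)
    if head_open is None:
        raise ValueError("no <head> element found")
    pos = head_open.end()
    # Inserting directly after the opening tag places it before any stylesheet.
    return html[:pos] + "\n" + tag + html[pos:]
```

Because the earlier block is stripped before the new one is written, running the function twice on the same page yields the same output as running it once.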

The generated JSON-LD for a typical event looks like:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Event",
  "name": "Paul Simon Concert",
  "startDate": "2024-06-15T19:00:00-07:00",
  "endDate": "2024-06-15T22:00:00-07:00",
  "location": {
    "@type": "Place",
    "name": "The Rady Shell",
    "address": {
      "@type": "PostalAddress",
      "streetAddress": "100 S Marina Way",
      "addressLocality": "San Diego",
      "addressRegion": "CA",
      "postalCode": "92101"
    }
  },
  "offers": {
    "@type": "Offer",
    "price": "75.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock",
    "url": "https://paulsimonradyshell.com/tickets"
  },
  "url": "https://paulsimonradyshell.com/event-page"
}
</script>
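
For illustration, a block like the one above could be assembled from the extracted fields roughly as shown here. The function name build_event_json_ld and its parameters are hypothetical; the real script's interface may differ.

```python
import json
from datetime import datetime, timedelta, timezone


def build_event_json_ld(name, start, end, venue, address,
                        price, ticket_url, page_url, currency="USD"):
    """Build a schema.org Event dict; start and end must be timezone-aware."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end need a UTC offset")
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        # isoformat() on an aware datetime includes the UTC offset.
        "startDate": start.isoformat(),
        "endDate": end.isoformat(),
        "location": {
            "@type": "Place",
            "name": venue,
            "address": {"@type": "PostalAddress", **address},
        },
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": "https://schema.org/InStock",
            "url": ticket_url,
        },
        "url": page_url,
    }


if __name__ == "__main__":
    pdt = timezone(timedelta(hours=-7))
    event = build_event_json_ld(
        name="Paul Simon Concert",
        start=datetime(2024, 6, 15, 19, 0, tzinfo=pdt),
        end=datetime(2024, 6, 15, 22, 0, tzinfo=pdt),
        venue="The Rady Shell",
        address={"addressLocality": "San Diego", "addressRegion": "CA"},
        price=75,
        ticket_url="https://paulsimonradyshell.com/tickets",
        page_url="https://paulsimonradyshell.com/event-page",
    )
    print(json.dumps(event, indent=2))
```

Rejecting naive datetimes up front avoids emitting a startDate without an offset, which would leave the event's local time ambiguous.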

Deployment Pipeline: S3 → CloudFront Invalidation

After script verification on local copies, we deployed across three separate S3 buckets:

  • s3://paulsimonradyshell-prod — 6 updated event pages
  • s3://sailjada-queenofsandiego-prod — 4 updated event pages
  • s3://dangerouscentaur-prod — 2 updated event pages

Command pattern for sync:

aws s3 sync ./local-pages/ s3://paulsimonradyshell-prod/ --exclude "*" --include "*.html"

Following each sync, we invalidated the CloudFront distribution cache to force edge nodes to refetch the updated pages:

aws cloudfront create-invalidation --distribution-id E1ABC2DEF3GHI --paths "/*"

Each distribution has its own ID (stored securely in our infrastructure config, not here; the ID above is a placeholder), ensuring independent cache control across sites.
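
Repeating the sync and invalidation for three sites is easy to script. The sketch below only builds and prints the two commands per site so they can be reviewed before running. The bucket names are the ones listed above; the distribution IDs and the per-site local directory layout are placeholders assumed for this example.

```python
import shlex

# Bucket name -> CloudFront distribution ID. The IDs are placeholders.
SITES = {
    "paulsimonradyshell-prod": "DISTRIBUTION_ID_1",
    "sailjada-queenofsandiego-prod": "DISTRIBUTION_ID_2",
    "dangerouscentaur-prod": "DISTRIBUTION_ID_3",
}


def deploy_commands(local_dir, bucket, distribution_id, paths=("/*",)):
    """Return the sync and invalidation commands as argument lists."""
    sync = [
        "aws", "s3", "sync", local_dir, f"s3://{bucket}/",
        "--exclude", "*", "--include", "*.html",
    ]
    invalidate = [
        "aws", "cloudfront", "create-invalidation",
        "--distribution-id", distribution_id,
        "--paths", *paths,
    ]
    return [sync, invalidate]


if __name__ == "__main__":
    for bucket, distribution_id in SITES.items():
        # Assumed layout: one local directory per bucket.
        local_dir = f"./local-pages/{bucket}/"
        for command in deploy_commands(local_dir, bucket, distribution_id):
            # Dry run: print a shell-safe rendering of each command.
            print(shlex.join(command))
```

To execute rather than print, each argument list can be passed to subprocess.run(command, check=True). Passing specific page paths instead of "/*" would limit the invalidation to the files that actually changed.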

Key Decisions and Trade-offs

Why JSON-LD over Microdata? JSON-LD lives in a single script block instead of attributes scattered through the markup, so it doesn't clutter the HTML and can be injected without rewriting existing elements. JSON-LD is a W3C Recommendation and the format Google recommends for structured data.

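Why inject into <head>? JSON-LD is valid anywhere in the document, and Google reads it from either <head> or <body>. We chose <head> so the block sits at a predictable location in the initial HTML response, which keeps the injection and re-run logic simple. We have no evidence that placement changes how quickly Google processes the markup.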

Why CloudFront invalidation over wait-and-refresh? Event pages have real-time booking value, and waiting up to 24 hours for the TTL to expire is unacceptable. Invalidation makes edge locations serve the updated markup within minutes of deployment, so it is already in place the next time a crawler fetches the page.

Why not update the page generators? We have two dynamic page generators in the codebase:

  • /Users/cb/Documents/repos/sites/queenofsandiego.com/rady-shell-events/tools/render_event_sites.py
  • /Users/cb/Documents/repos/sites/quickdumpnow.com/tools/generate_service_area_pages.py

For an immediate win, injecting into the already-generated static HTML was faster. However, we've flagged both generators for schema injection in the next sprint so that newly generated pages don't ship without markup.

Verification and Monitoring

Post-deployment, we validated using:

  • Google Rich Results Test: confirmed the Event and LocalBusiness schemas parse without errors
  • Search Console structured data reports: monitored for new "Event" detections across properties
  • CloudFront access logs: confirmed edge locations were serving the updated pages within 5 minutes of invalidation (an initial miss per edge, then cache hits)
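
A local pre-flight check can catch missing markup before the external tools are involved. The sketch below is an assumption-laden illustration using only the standard library: it extracts JSON-LD blocks from a page and reports which of name, startDate, and location are absent from the first Event block. That field list reflects our reading of Google's required Event properties and should be checked against the current documentation; the helper names are hypothetical.

```python
import json
from html.parser import HTMLParser

# Assumed minimum set of Event properties to check for.
REQUIRED_EVENT_FIELDS = ("name", "startDate", "location")


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of application/ld+json script tags."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_json_ld = False


def missing_event_fields(html: str) -> list:
    """Return required fields missing from the first Event block in html."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    events = [b for b in parser.blocks
              if isinstance(b, dict) and b.get("@type") == "Event"]
    if not events:
        # No Event block at all: everything is missing.
        return list(REQUIRED_EVENT_FIELDS)
    return [f for f in REQUIRED_EVENT_FIELDS if f not in events[0]]
```

An empty list means the page carries an Event block with all three fields. This complements the Rich Results Test, which remains the authority on what Google actually accepts.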

What's Next

This deployment was step one. Remaining work:

  • Update page generators: Inject JSON-LD schema into render_event_sites.py and generate_service_area_pages.py templates to bake this into future builds
  • Add Product schema: Concert merchandise pages need Product schema for e-commerce enrichment
  • Monitor Search Console: Track whether Event detection improves click-through rates from SERPs
  • Expand to non-event pages: Product pages, service area pages, and local business landing pages should all carry appropriate schema