```html

Injecting Structured Data into Concert Event Pages: A Multi-Site JSON-LD Strategy

When you have 12 concert event pages spread across multiple subdomains with zero structured data markup, search engines have to guess what you're advertising. This post walks through how we automated JSON-LD injection across the JADA concert subdomain infrastructure, deployed the updated pages to S3, and invalidated the CloudFront cache so the changes went live.

The Problem: Dark Content for Search Engines

Before this work, pages like sailjada.queenofsandiego.com/events/[event-name].html contained beautiful HTML with event details, artist bios, and ticket information, but no machine-readable metadata. Google's crawler could extract some signals from the DOM, but we were leaving search visibility on the table:

  • No Event schema for rich snippets in search results
  • No LocalBusiness schema linking to venue information
  • No Organization schema establishing brand authority
  • Rich results eligibility: zero

The decision to inject structured data came from a CMO-level visibility audit that identified distribution as the blocker. Search visibility is distribution. Structured data is a force multiplier for organic reach.

Solution Architecture: Automated JSON-LD Injection Pipeline

Rather than manually editing 12 HTML files, we built a Python script that:

  1. Reads each event HTML file from the local repository
  2. Parses event metadata (title, date, artist, venue, price)
  3. Generates schema.org-compliant JSON-LD for Event + LocalBusiness
  4. Injects the script tag into the <head>, immediately before the closing </head> tag
  5. Writes the updated file back to disk

This approach ensures consistency across all concert pages and makes future schema updates trivial: update the script and re-run.

Technical Implementation

The Injection Script

File: /Users/cb/Documents/repos/tools/inject_structured_data.py

This script targets all concert event pages across JADA subdomains. The key logic:


# Pseudo-code structure
for event_file in event_pages:
    html_content = read(event_file)
    event_data = extract_metadata(html_content)
    
    json_ld_event = {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": event_data['title'],
        "startDate": event_data['date'],
        "endDate": event_data['date'],
        "location": {
            "@type": "Place",
            "name": event_data['venue'],
            "address": event_data['venue_address']
        },
        "offers": {
            "@type": "Offer",
            "url": event_data['ticket_url'],
            "price": event_data['price'],
            "priceCurrency": "USD"
        },
        "organizer": {
            "@type": "Organization",
            "name": "JADA",
            "url": "https://sailjada.queenofsandiego.com"
        }
    }
    
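    # Strings are immutable, so keep the returned HTML instead of
    # writing the unmodified original back to disk.
    updated_html = inject_into_head(html_content, json_ld_event)
    write_updated_file(event_file, updated_html)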

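The script searches for the closing </head> tag and inserts a <script type="application/ld+json"> block immediately before it. JSON-LD is data, not executable script, so it has no effect on rendering, and Google reads it from either the <head> or the <body>. Placing it at the end of the <head> simply gives every page one predictable location for the markup.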
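The injection step described above can be sketched as follows. This is a minimal illustration, not the contents of inject_structured_data.py: the data-injected marker attribute and the decision to replace a previously injected block on re-run are assumptions added here so that running the script twice does not duplicate the markup.

```python
import json
import re

# Hypothetical marker attribute so a re-run can find an earlier injection.
MARKER = 'data-injected="jada-structured-data"'

_HEAD_CLOSE = re.compile(r"</head\s*>", re.IGNORECASE)
_OLD_BLOCK = re.compile(
    r'[ \t]*<script type="application/ld\+json" '
    + re.escape(MARKER)
    + r">.*?</script>\n?",
    re.DOTALL,
)


def inject_into_head(html_content, json_ld):
    """Return html_content with json_ld inserted just before </head>."""
    # Remove any block left by a previous run so the operation is idempotent.
    html_content = _OLD_BLOCK.sub("", html_content)

    match = _HEAD_CLOSE.search(html_content)
    if match is None:
        raise ValueError("no closing </head> tag found")

    # Escape "</" so a string value cannot close the script element early.
    payload = json.dumps(json_ld, indent=2).replace("</", "<\\/")
    block = (
        f'<script type="application/ld+json" {MARKER}>\n'
        f"{payload}\n"
        "</script>\n"
    )
    return html_content[: match.start()] + block + html_content[match.start():]
```

Because the function returns a new string, the caller has to write the returned value to disk; the original html_content is never modified in place.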

Event Pages Processed

We identified and updated 12 pages across the JADA concert subdomain structure:

  • /Users/cb/Documents/repos/sites/sailjada.queenofsandiego.com/ (primary concert domain)
  • Files follow pattern: [concert-name].html with embedded metadata in data attributes or comments
  • All pages hosted on S3 bucket: sailjada-qos-events-prod (inferred from deployment commands)
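For pages that carry their metadata in data attributes, the extraction step might look like the sketch below. The attribute names (data-event-title and so on) are hypothetical, since the post does not list the real ones, and metadata stored in comments would need a separate handler.

```python
from html.parser import HTMLParser

# Hypothetical attribute names mapped to the keys used in the JSON-LD step.
FIELDS = {
    "data-event-title": "title",
    "data-event-date": "date",
    "data-event-venue": "venue",
    "data-event-venue-address": "venue_address",
    "data-event-price": "price",
    "data-event-ticket-url": "ticket_url",
}


class _MetadataParser(HTMLParser):
    """Collects the first value seen for each known data attribute."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            key = FIELDS.get(name)
            # Skip unknown attributes, valueless attributes, and repeats.
            if key and value is not None and key not in self.found:
                self.found[key] = value.strip()


def extract_metadata(html_content):
    """Return a dict of event fields, or raise if any field is missing."""
    parser = _MetadataParser()
    parser.feed(html_content)
    parser.close()
    missing = sorted(set(FIELDS.values()) - set(parser.found))
    if missing:
        raise ValueError("missing event fields: " + ", ".join(missing))
    return parser.found
```

Failing loudly on a missing field is a deliberate choice in this sketch: a page with incomplete metadata is skipped with an error rather than published with partial markup.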

Deployment Strategy: S3 + CloudFront Invalidation

S3 Bucket Sync

After local injection, we synced all updated concert pages to S3:


# Example sync (no credentials shown)
aws s3 sync /Users/cb/Documents/repos/sites/sailjada.queenofsandiego.com/ \
  s3://sailjada-qos-events-prod/ \
  --exclude "*" \
  --include "*.html" \
  --acl public-read

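The --exclude "*" filter followed by --include "*.html" restricts the sync to HTML files. Filters are applied in order, so everything is excluded first and the HTML pages are then added back in, which avoids unnecessary uploads of CSS, JS, or other assets.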

CloudFront Cache Invalidation

S3 alone isn't enough. The JADA concert subdomain is fronted by CloudFront for edge caching and HTTPS termination. We identified the distribution ID and invalidated the cache for all modified pages:


aws cloudfront create-invalidation \
  --distribution-id [DISTRIBUTION_ID] \
  --paths "/events/*.html" "/*.html"

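The invalidation pattern /events/*.html is meant to cover all concert event pages so that they serve fresh content with the new structured data. One caveat: CloudFront's documentation describes the * wildcard as valid only as the last character of an invalidation path, so a mid-path pattern such as /events/*.html may not match as intended. /events/* or an explicit list of page paths is the safer form. Invalidations usually complete within seconds to a few minutes across edge locations rather than instantly.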
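One way to avoid depending on wildcard matching at all is to invalidate exactly the files the script rewrote. The sketch below derives the paths from local file names and builds a dict in the shape of CloudFront's InvalidationBatch structure (Paths with Quantity and Items, plus a CallerReference). The helper names are hypothetical, the paths assume a POSIX-style checkout, and the actual API call (for example boto3's create_invalidation with DistributionId and InvalidationBatch arguments) is left out so the block runs without credentials.

```python
import time
from pathlib import Path
from urllib.parse import quote


def invalidation_paths(site_root, changed_files):
    """Map local file paths to URL-encoded CloudFront paths, deduplicated."""
    root = Path(site_root)
    paths = set()
    for name in changed_files:
        relative = Path(name).relative_to(root).as_posix()
        # CloudFront expects a leading slash and URL-encoded characters.
        paths.add("/" + quote(relative))
    return sorted(paths)


def build_invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch dict for a create-invalidation request."""
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        # CallerReference must be unique per request; a timestamp is one option.
        "CallerReference": caller_reference or f"jsonld-{int(time.time())}",
    }


changed = ["/site/events/jazz night.html", "/site/index.html"]
batch = build_invalidation_batch(invalidation_paths("/site", changed), "example-ref")
print(batch["Paths"]["Items"])
```

With 12 pages the explicit list stays short, and static assets are never touched because they never appear in the list.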

Why This Architecture?

1. Separation of Concerns

The injection script lives in /tools/, not the site repository. This keeps the script reusable across multiple projects (QOS, QDN, DC) if needed and makes it easy to version control independently.

2. Automated Over Manual

Manually editing 12 HTML files introduces human error and doesn't scale. If we need to add Event schema attributes (e.g., eventStatus, eventAttendanceMode) later, running the script takes seconds. Doing it by hand takes hours and risks inconsistency.
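As an illustration of how small such a change is in code, the sketch below adds the two attributes mentioned above to an existing Event dict. The function name and the default values are assumptions. EventScheduled and OfflineEventAttendanceMode are schema.org enumeration members, but whether they are the right defaults for every JADA event is not something the post states.

```python
SCHEMA = "https://schema.org/"


def with_status_fields(event, status="EventScheduled",
                       attendance="OfflineEventAttendanceMode"):
    """Return a copy of an Event dict with status and attendance mode set."""
    updated = dict(event)  # shallow copy; the caller's dict is left untouched
    # setdefault keeps any value that was already set explicitly.
    updated.setdefault("eventStatus", SCHEMA + status)
    updated.setdefault("eventAttendanceMode", SCHEMA + attendance)
    return updated


event = {"@context": "https://schema.org", "@type": "Event", "name": "Example Concert"}
print(with_status_fields(event)["eventStatus"])
# prints https://schema.org/EventScheduled
```

Applying this inside the per-page loop and re-running the script would update all 12 pages in one pass.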

3. CloudFront Distribution ID Strategy

By invalidating specific paths rather than the entire distribution, we minimize blast radius and reduce cache churn. Event pages get fresh structured data immediately, while static assets (logo, CSS, JS) remain cached.

4. Schema.org Compliance

We used Event + LocalBusiness + Organization schema because:

  • Event schema: Tells Google this is a ticketed event with a date, venue, and price. Enables event rich results in search.
  • LocalBusiness schema: Connects the venue to local signals (address, phone, reviews). This can help with local pack visibility.
  • Organization schema: Establishes JADA's identity across pages. This can support knowledge panel eligibility, though neither outcome is guaranteed by markup alone.
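One possible way to express the three types together is a single JSON-LD document with an @graph array, where the Event points at the venue and the organizer by @id. This is a sketch of that layout, not the structure the script is confirmed to emit. The #venue and #organization identifiers and the build_graph name are hypothetical.

```python
import json

SITE = "https://sailjada.queenofsandiego.com"


def build_graph(event_data):
    """Build one JSON-LD document holding Organization, venue, and Event nodes."""
    org_id = SITE + "/#organization"  # hypothetical identifier
    venue_id = SITE + "/#venue"       # hypothetical identifier
    return {
        "@context": "https://schema.org",
        "@graph": [
            {"@type": "Organization", "@id": org_id, "name": "JADA", "url": SITE},
            {
                "@type": "LocalBusiness",
                "@id": venue_id,
                "name": event_data["venue"],
                "address": event_data["venue_address"],
            },
            {
                "@type": "Event",
                "name": event_data["title"],
                "startDate": event_data["date"],
                # References by @id avoid repeating the full node inline.
                "location": {"@id": venue_id},
                "organizer": {"@id": org_id},
            },
        ],
    }


sample = {
    "title": "Example Concert",
    "date": "2025-06-01T19:00:00-07:00",
    "venue": "Example Venue",
    "venue_address": "1 Example Street, San Diego, CA",
}
print(json.dumps(build_graph(sample), indent=2))
```

The sample values are placeholders. Running the output through a structured data validator before deploying is the quickest way to confirm that the references resolve as intended.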

Key Decisions & Trade-offs

  • Inline JSON-LD vs. External File: We chose inline <script> tags in the <head> for simplicity and zero additional HTTP requests. External files would require an extra roundtrip.
  • Injection Point (<head> vs. <body>): The <head> ensures parsing before the page renders. Google