Building a Real-Time Technical Blog Pipeline: Auto-Generated Session Transcripts to CloudFront
Over this session, I built a complete technical blog infrastructure for four domain properties that auto-generates granular technical posts from Claude Code session transcripts. The system captures every file modification, command execution, and decision made during development work—then publishes it immediately to dedicated tech blog subdomains visible from the main site navigation.
The Problem We Solved
Stakeholders needed visibility into what work was being performed across multiple ship-related web properties. High-level summaries don't cut it—we needed detailed, technical documentation that shows exact file paths, AWS resource changes, infrastructure decisions, and command sequences. The previous approach would have required manual documentation after the fact. Instead, we built an automated pipeline that generates posts from the actual session data.
Architecture Overview
The system has four main components:
- Session Capture: Claude Code's session data (JSONL transcript files) stored in ~/.claude/projects/
- Generator: Python script that parses transcripts and creates HTML blog posts
- Deployment: AWS S3 buckets with CloudFront distributions and DNS routing
- Trigger: Stop hook that runs when a Claude Code session ends, generating and publishing the post
Infrastructure Provisioning
We created four parallel tech blog deployments:
- tech.queenofsandiego.com — CloudFront + Route53, wildcard cert *.queenofsandiego.com
- tech.sailjada.com — CloudFront + Route53, wildcard cert *.sailjada.com
- tech.dangerouscentaur.com — CloudFront + Namecheap DNS, using the existing wildcard CF distribution on the dc-sites S3 bucket
- tech.burialsatseasandiego.com — CloudFront + GoDaddy DNS, new S3 bucket and distribution
Each deployment follows the same pattern:
S3 Bucket (tech-blogs-[domain])
↓
CloudFront Distribution (CNAME: tech.[domain])
↓
DNS Provider (Route53 or external)
↓
Public HTTPS endpoint
For Route53-managed domains, we used the AWS CLI to create alias records pointing to CloudFront distributions. For external DNS providers (Namecheap, GoDaddy), we configured CNAME records. ACM certificates were validated via DNS CNAME records placed in each provider's control panel.
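To make the Route53 step concrete, here is a minimal sketch of the change batch an alias record requires. The distribution domain is a placeholder; Z2FDTNDATAQYW2 is CloudFront's fixed hosted zone ID for alias targets, and the rest of the field names follow the Route53 change-resource-record-sets API.

```python
import json

# Placeholder value -- the real domain comes from the actual CloudFront distribution.
DISTRIBUTION_DOMAIN = "d1234abcd.cloudfront.net"  # hypothetical
CLOUDFRONT_ALIAS_ZONE = "Z2FDTNDATAQYW2"  # fixed hosted zone ID for all CloudFront aliases

def alias_change_batch(record_name: str, distribution_domain: str) -> dict:
    """Build the change batch passed to `aws route53 change-resource-record-sets`."""
    return {
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "A",
                "AliasTarget": {
                    "HostedZoneId": CLOUDFRONT_ALIAS_ZONE,
                    "DNSName": distribution_domain,
                    "EvaluateTargetHealth": False,
                },
            },
        }]
    }

if __name__ == "__main__":
    batch = alias_change_batch("tech.queenofsandiego.com", DISTRIBUTION_DOMAIN)
    # Written to a file and supplied via --change-batch file://alias.json
    print(json.dumps(batch, indent=2))
```

Alias records are preferable to plain CNAMEs here because Route53 serves them as A records at no per-query charge, and they work at the zone apex.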
The Generator: Converting Transcripts to Posts
The core logic lives in /Users/cb/Documents/repos/tools/tech_blog_generator.py. This script:
- Reads JSONL-formatted Claude Code session transcripts from ~/.claude/projects/[project-name]/memory/sessions/
- Extracts file modifications (reads from write and edit events)
- Extracts command executions (reads from command events)
- Extracts tool usage (reads from tool_use events, filtering out sensitive data)
- Generates structured HTML with <h2>, <h3>, and <code> tags
- Strips all credentials, API keys, tokens, and secrets using regex patterns
The generator identifies the source domain based on the session context (stored in project memory files) and publishes to the appropriate S3 bucket and CloudFront distribution.
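The parsing stage described above can be sketched as follows. The event field names ("type", and by extension "path" or "command") are illustrative assumptions; the real schema is whatever Claude Code writes to its JSONL files.

```python
import json
from collections import defaultdict
from pathlib import Path

# Event types of interest, matching the list above; the field name "type"
# is an assumption about the transcript schema.
INTERESTING = {"write", "edit", "command", "tool_use"}

def parse_transcript(path: Path) -> dict:
    """Group a JSONL session transcript's events by type, skipping malformed lines."""
    events = defaultdict(list)
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partial lines from interrupted sessions
        if event.get("type") in INTERESTING:
            events[event["type"]].append(event)
    return events
```

Because JSONL is one JSON object per line, a half-written final line (e.g. from a killed session) corrupts only itself, which is why the parser skips rather than aborts on decode errors.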
The Stop Hook: Automation on Session End
A critical piece is the Stop hook at /Users/cb/.claude/hooks/tech_blog_stop.sh. This bash script:
- Runs automatically when a Claude Code session ends
- Calls the blog generator with the current session transcript
- Uploads generated HTML to the appropriate tech blog S3 bucket
- Invalidates the CloudFront cache to ensure immediate visibility
- Logs all activity to ~/.claude/logs/tech_blog_stop.log
The hook was registered in /Users/cb/.claude/settings.json under the hooks.session_stop configuration. This ensures every session automatically generates a post without manual intervention.
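Given the path and the hooks.session_stop key named above, the registration plausibly looks like the fragment below; the exact value shape (a plain command string here, rather than an object with options) is an assumption.

```json
{
  "hooks": {
    "session_stop": "/Users/cb/.claude/hooks/tech_blog_stop.sh"
  }
}
```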
Navigation Integration
We updated the Ship's Papers navigation menu in /Users/cb/Documents/repos/sites/queenofsandiego.com/index.html to include a "Tech Blog" link that routes to https://tech.queenofsandiego.com/. This makes technical documentation discoverable to stakeholders checking the main site.
Key Technical Decisions
Why CloudFront + S3? We chose this pattern for low-latency global delivery, automatic HTTPS termination, and built-in caching. CloudFront also lets us invalidate the cache on demand, so updated posts become visible within moments rather than waiting for TTLs to expire.
Why separate blogs per domain? Each property (Sail Jada, Queen of San Diego, Dangerous Centaur, Burials at Sea San Diego) has different stakeholders and contexts. Separate tech blogs prevent noise and keep each audience focused on relevant work.
Why JSONL transcripts? Claude Code stores session data in line-delimited JSON, which is machine-readable, timestamped, and already structured with event types (file writes, commands, tool uses). Parsing this gives us a single source of truth.
Credential Stripping: We implemented regex-based filtering to remove AWS keys, API tokens, passwords, and sensitive configuration data before publication. This allows Sergio to see exactly what work was done without exposing secrets in a public blog.
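A minimal sketch of the regex-based filtering, assuming the generator applies an ordered list of (pattern, replacement) rules; the two patterns shown are illustrative examples, not the full production rule set.

```python
import re

# Illustrative redaction rules; the real generator's set is presumably broader.
REDACTIONS = [
    # AWS access key IDs: "AKIA" followed by 16 uppercase alphanumerics.
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED_AWS_KEY]"),
    # key=value / key: value style secrets; keep the key, redact the value.
    (re.compile(r"(?i)(api[_-]?key|token|password|secret)(\s*[=:]\s*)\S+"),
     r"\1\2[REDACTED]"),
]

def strip_credentials(text: str) -> str:
    """Apply each redaction pattern in turn before a post is published."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text
```

Redacting the value while keeping the key name preserves the technical narrative (readers can still see that a token was configured) without exposing the secret itself.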
Additional Fixes
While building this system, we identified and created tracking items for:
- Incorrect images on the Burials at Sea San Diego fleet page—marked for image replacement
- Minimum guest requirements that needed clarification across multiple email templates
- GA4 audit recommendations (separate analysis in progress)
What's Next
The system is now live. Every session that touches any of these four domains will automatically generate a granular technical blog post within seconds of the session ending. Sergio can visit any tech blog subdomain, see a timeline of exactly what was modified, which AWS resources changed, and why those decisions were made.
Future enhancements could include RSS feeds for each tech blog, cross-domain aggregation views, and automated alerting for certain types of infrastructure changes. For now, the system provides real-time transparency into development work across all four properties.