Building a Real-Time Technical Blog Pipeline: Auto-Generated Session Transcripts to CloudFront
Over this session, I built a complete technical blog infrastructure for four domain properties that auto-generates granular technical posts from Claude Code session transcripts. The system captures every file modification, command execution, and decision made during development work—then publishes it immediately to dedicated tech blog subdomains visible from the main site navigation.
The Problem We Solved
Stakeholders needed visibility into what work was being performed across multiple ship-related web properties. High-level summaries don't cut it—we needed detailed, technical documentation that shows exact file paths, AWS resource changes, infrastructure decisions, and command sequences. The previous approach would have required manual documentation after the fact. Instead, we built an automated pipeline that generates posts from the actual session data.
Architecture Overview
The system has four main components:
- Session Capture: Claude Code's session data (JSONL transcript files) stored in ~/.claude/projects/
- Generator: Python script that parses transcripts and creates HTML blog posts
- Deployment: AWS S3 buckets with CloudFront distributions and DNS routing
- Trigger: Stop hook that runs when a Claude Code session ends, generating and publishing the post
Infrastructure Provisioning
We created four parallel tech blog deployments:
- tech.queenofsandiego.com — CloudFront + Route53, wildcard cert *.queenofsandiego.com
- tech.sailjada.com — CloudFront + Route53, wildcard cert *.sailjada.com
- tech.dangerouscentaur.com — CloudFront + Namecheap DNS, using the existing wildcard CF distribution on the dc-sites S3 bucket
- tech.burialsatseasandiego.com — CloudFront + GoDaddy DNS, new S3 bucket and distribution
Each deployment follows the same pattern:
S3 Bucket (tech-blogs-[domain])
↓
CloudFront Distribution (CNAME: tech.[domain])
↓
DNS Provider (Route53 or external)
↓
Public HTTPS endpoint
For Route53-managed domains, we used the AWS CLI to create alias records pointing to CloudFront distributions. For external DNS providers (Namecheap, GoDaddy), we configured CNAME records. ACM certificates were validated via DNS CNAME records placed in each provider's control panel.
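To make the Route53 step concrete, here is a minimal sketch of the change batch an alias record requires. The distribution domain is a placeholder; Z2FDTNDATAQYW2 is CloudFront's fixed hosted zone ID for alias targets, and the rest of the field names follow the Route53 change-resource-record-sets API.

```python
import json

# Placeholder value -- the real domain comes from the actual CloudFront distribution.
DISTRIBUTION_DOMAIN = "d1234abcd.cloudfront.net"  # hypothetical
CLOUDFRONT_ALIAS_ZONE = "Z2FDTNDATAQYW2"  # fixed hosted zone ID for all CloudFront aliases

def alias_change_batch(record_name: str, distribution_domain: str) -> dict:
    """Build the change batch passed to `aws route53 change-resource-record-sets`."""
    return {
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "A",
                "AliasTarget": {
                    "HostedZoneId": CLOUDFRONT_ALIAS_ZONE,
                    "DNSName": distribution_domain,
                    "EvaluateTargetHealth": False,
                },
            },
        }]
    }

if __name__ == "__main__":
    batch = alias_change_batch("tech.queenofsandiego.com", DISTRIBUTION_DOMAIN)
    # Written to a file and supplied via --change-batch file://alias.json
    print(json.dumps(batch, indent=2))
```

Alias records are preferable to plain CNAMEs here because Route53 serves them as A records at no per-query charge, and they work at the zone apex.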
The Generator: Converting Transcripts to Posts
The core logic lives in /Users/cb/Documents/repos/tools/tech_blog_generator.py. This script:
- Reads JSONL-formatted Claude Code session transcripts from ~/.claude/projects/[project-name]/memory/sessions/
- Extracts file modifications (reads from write and edit events)
- Extracts command executions (reads from command events)
- Extracts tool usage (reads from tool_use events, filtering out sensitive data)
- Generates structured HTML with <h2>, <h3>, and <code> tags
- Strips all credentials, API keys, tokens, and secrets using regex patterns
The generator identifies the source domain based on the session context (stored in project memory files) and publishes to the appropriate S3 bucket and CloudFront distribution.
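The parsing stage described above can be sketched as follows. The event field names ("type", and by extension "path" or "command") are illustrative assumptions; the real schema is whatever Claude Code writes to its JSONL files.

```python
import json
from collections import defaultdict
from pathlib import Path

# Event types of interest, matching the list above; the field name "type"
# is an assumption about the transcript schema.
INTERESTING = {"write", "edit", "command", "tool_use"}

def parse_transcript(path: Path) -> dict:
    """Group a JSONL session transcript's events by type, skipping malformed lines."""
    events = defaultdict(list)
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partial lines from interrupted sessions
        if event.get("type") in INTERESTING:
            events[event["type"]].append(event)
    return events
```

Because JSONL is one JSON object per line, a half-written final line (e.g. from a killed session) corrupts only itself, which is why the parser skips rather than aborts on decode errors.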
The Stop Hook: Automation on Session End
A critical piece is the Stop hook at /Users/cb/.claude/hooks/tech_blog_stop.sh. This bash script:
- Runs automatically when a Claude Code session ends
- Calls the blog generator with the current session transcript
- Uploads generated HTML to the appropriate tech blog S3 bucket
- Invalidates the CloudFront cache to ensure immediate visibility
- Logs all activity to ~/.claude/logs/tech_blog_stop.log
The hook was registered in /Users/cb/.claude/settings.json under the hooks.session_stop configuration. This ensures every session automatically generates a post without manual intervention.
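Given the path and the hooks.session_stop key named above, the registration plausibly looks like the fragment below; the exact value shape (a plain command string here, rather than an object with options) is an assumption.

```json
{
  "hooks": {
    "session_stop": "/Users/cb/.claude/hooks/tech_blog_stop.sh"
  }
}
```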
Navigation Integration
We updated the Ship's Papers navigation menu in /Users/cb/Documents/repos/sites/queenofsandiego.com/index.html to include a "Tech Blog" link that routes to https://tech.queenofsandiego.com/. This makes technical documentation discoverable to stakeholders checking the main site.
Key Technical Decisions
Why CloudFront + S3? We chose this pattern for low-latency global delivery, automatic HTTPS termination, and built-in caching. CloudFront also lets us invalidate the cache on demand, so updated posts become visible within moments rather than waiting for TTLs to expire.
Why separate blogs per domain? Each property (Sail Jada, Queen of San Diego, Dangerous Centaur, Burials at Sea San Diego) has different stakeholders and contexts. Separate tech blogs prevent noise and keep each audience focused on relevant work.
Why JSONL transcripts? Claude Code stores session data in line-delimited JSON, which is machine-readable, timestamped, and already structured with event types (file writes, commands, tool uses). Parsing this gives us a single source of truth.
Credential Stripping: We implemented regex-based filtering to remove AWS keys, API tokens, passwords, and sensitive configuration data before publication. This allows Sergio to see exactly what work was done without exposing secrets in a public blog.
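A minimal sketch of the regex-based filtering, assuming the generator applies an ordered list of (pattern, replacement) rules; the two patterns shown are illustrative examples, not the full production rule set.

```python
import re

# Illustrative redaction rules; the real generator's set is presumably broader.
REDACTIONS = [
    # AWS access key IDs: "AKIA" followed by 16 uppercase alphanumerics.
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED_AWS_KEY]"),
    # key=value / key: value style secrets; keep the key, redact the value.
    (re.compile(r"(?i)(api[_-]?key|token|password|secret)(\s*[=:]\s*)\S+"),
     r"\1\2[REDACTED]"),
]

def strip_credentials(text: str) -> str:
    """Apply each redaction pattern in turn before a post is published."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text
```

Redacting the value while keeping the key name preserves the technical narrative (readers can still see that a token was configured) without exposing the secret itself.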
Additional Fixes
While building this system, we identified and created tracking items for:
- Incorrect images on the Burials at Sea San Diego fleet page—marked for image replacement
- Minimum guest requirements that needed clarification across multiple email templates
- GA4 audit recommendations (separate analysis in progress)
What's Next
The system is now live. Every session that touches any of these four domains will automatically generate a granular technical blog post within seconds of the session ending. Sergio can visit any tech blog subdomain, see a timeline of exactly what was modified, which AWS resources changed, and why those decisions were made.
Future enhancements could include RSS feeds for each tech blog, cross-domain aggregation views, and automated alerting for certain types of infrastructure changes. For now, the system provides real-time transparency into development work across all four properties.