Building an Auto-Generated Technical Blog Pipeline for Four Domain Properties
What Was Done
Built an auto-generation system that captures granular technical work across four domain properties (queenofsandiego.com, sailjada.com, dangerouscentaur.com, and burialsatseasandiego.com) and publishes detailed technical blog posts to the matching tech.[domain] subdomain for each property. The system parses Claude Code session transcripts, extracts tool use and file modifications, and generates structured HTML blog posts that are linked from the Ship's Papers navigation menus.
Technical Details
Core Infrastructure Created
- S3 Buckets: Created qos-sites, jada-sites, and dc-sites buckets for serving static blog content
- CloudFront Distributions: Set up separate CloudFront distributions for each tech blog subdomain with custom SSL certificates
- DNS Configuration: Configured Route53 CNAME records for tech.queenofsandiego.com and tech.sailjada.com; set up Namecheap CNAME for tech.dangerouscentaur.com; configured GoDaddy DNS records for tech.burialsatseasandiego.com
- SSL Certificates: Leveraged existing wildcard certificates (*.queenofsandiego.com and *.sailjada.com); created new ACM certificate for burialsatseasandiego.com with DNS validation via GoDaddy API
Python Tool Implementation
Created three core Python utilities:
/Users/cb/Documents/repos/tools/tech_blog_init.py is the infrastructure provisioning script. It:
- Creates S3 buckets with public-read ACL for static blog hosting
- Provisions CloudFront distributions with custom CNAME aliases
- Configures Route53 DNS records (where applicable)
- Handles third-party DNS providers via API (GoDaddy for burialsatseasandiego.com)
- Validates ACM certificates and manages DNS validation records
- Saves infrastructure config to memory/infrastructure_config.json for later reference
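The last step, persisting the provisioned resources for later reference, could look roughly like the sketch below. It is not the actual tech_blog_init.py code: the function names, the config keys, and the placeholder distribution ID are all hypothetical, and the write-then-rename step is simply one common way to avoid leaving a half-written file behind.

```python
import json
from pathlib import Path


def save_infrastructure_config(config: dict, path: Path) -> Path:
    """Write the config as pretty-printed JSON, creating parent directories."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a sibling temp file and rename it into place, so an
    # interrupted run never leaves a partially written config behind.
    tmp_path = path.with_suffix(path.suffix + ".tmp")
    tmp_path.write_text(
        json.dumps(config, indent=2, sort_keys=True) + "\n", encoding="utf-8"
    )
    tmp_path.replace(path)
    return path


def load_infrastructure_config(path: Path) -> dict:
    """Read the config back for later tools such as the generator."""
    return json.loads(path.read_text(encoding="utf-8"))


# Hypothetical shape of one entry; the keys are illustrative only.
EXAMPLE_CONFIG = {
    "tech.queenofsandiego.com": {
        "bucket": "qos-sites",
        "distribution_id": "PLACEHOLDER_DISTRIBUTION_ID",
        "dns_provider": "route53",
    }
}
```

Calling save_infrastructure_config(EXAMPLE_CONFIG, Path("memory/infrastructure_config.json")) would write the file relative to the current working directory.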
/Users/cb/Documents/repos/tools/tech_blog_generator.py handles session transcript parsing and blog generation. It:
- Reads JSONL-formatted Claude Code session transcripts
- Extracts tool_use blocks and file modification events
- Parses command execution logs and user-supplied context
- Generates structured HTML blog post with granular technical details
- Sanitizes output to remove credentials, API keys, and sensitive data
- Routes posts to correct S3 bucket and CloudFront distribution based on context detection
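The transcript-reading and tool_use extraction steps can be sketched as follows. This is a minimal illustration, not the generator's real code. It assumes each JSONL line is an object whose message.content is a list of content blocks, that tool calls appear as blocks with type "tool_use", and that file-writing tools are named Edit, Write, or MultiEdit and carry a file_path in their input. Those record shapes and tool names are assumptions about the transcript format, and the function names are hypothetical.

```python
import json
from typing import Iterable, Iterator, List

# Assumed names of tools that modify files; adjust to the real transcript.
FILE_WRITING_TOOLS = {"Edit", "Write", "MultiEdit"}


def iter_tool_uses(lines: Iterable[str]) -> Iterator[dict]:
    """Yield every tool_use content block found in JSONL transcript lines."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip truncated or partially written lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # plain-string content carries no tool_use blocks
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                yield block


def modified_files(tool_uses: Iterable[dict]) -> List[str]:
    """Collect unique file paths touched by file-writing tools, in order."""
    seen: List[str] = []
    for block in tool_uses:
        if block.get("name") not in FILE_WRITING_TOOLS:
            continue
        tool_input = block.get("input")
        path = tool_input.get("file_path") if isinstance(tool_input, dict) else None
        if isinstance(path, str) and path not in seen:
            seen.append(path)
    return seen
```

Because iter_tool_uses accepts any iterable of strings, an open file handle can be passed directly, which keeps memory use flat for long sessions.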
/Users/cb/Documents/repos/tools/jada_blast.py is the email notification system (enhanced). It:
- Sends email notifications to stakeholders when new tech blog posts are published
- Includes post title, excerpt, and direct link to tech blog
- Integrates with email scheduling system
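As an illustration of the notification content described above (title, excerpt, direct link), here is a minimal sketch using only the standard library. It builds the message without sending it. The function name, subject wording, and body layout are hypothetical and are not taken from jada_blast.py.

```python
from email.message import EmailMessage
from typing import Sequence


def build_post_notification(
    sender: str,
    recipients: Sequence[str],
    title: str,
    excerpt: str,
    url: str,
) -> EmailMessage:
    """Build a plain-text notification for a newly published post."""
    # Collapse whitespace so a multi-line title cannot break the header.
    clean_title = " ".join(title.split())
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"New tech blog post: {clean_title}"
    msg.set_content(f"{clean_title}\n\n{excerpt}\n\nRead the full post: {url}\n")
    return msg
```

The returned EmailMessage can be handed to whatever delivery or scheduling layer is already in place, for example smtplib.SMTP.send_message.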
Claude Code Stop Hook
Created executable bash script at /Users/cb/.claude/hooks/tech_blog_stop.sh that:
- Triggers automatically when a Claude Code session ends
- Extracts session transcript from Claude's session log directory
- Invokes tech_blog_generator.py to process the transcript
- Uploads generated HTML to appropriate S3 bucket
- Invalidates CloudFront cache to ensure fresh content delivery
- Logs execution details to /Users/cb/.claude/logs/tech_blog_hook.log
The hook is registered in the Claude Code settings file at /Users/cb/.claude/settings.json under the Stop hook event.
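One way the hook could locate the session transcript is to read the JSON payload that Claude Code passes to hook commands on standard input. The sketch below assumes that payload contains a transcript_path field. That field name, and the idea that the script uses the payload rather than scanning the log directory, are assumptions and not something the hook described above is confirmed to do. The function name is hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def transcript_path_from_payload(payload_text: str) -> Optional[Path]:
    """Return the transcript path named in a hook payload, or None."""
    try:
        payload = json.loads(payload_text)
    except json.JSONDecodeError:
        return None  # malformed or empty payload: nothing to process
    if not isinstance(payload, dict):
        return None
    raw_path = payload.get("transcript_path")  # assumed field name
    if not isinstance(raw_path, str) or not raw_path:
        return None
    # Expand "~" so the path works when passed on to other tools.
    return Path(raw_path).expanduser()
```

A wrapper script would call this with sys.stdin.read() and exit quietly when it returns None, so that a malformed payload never blocks the session from ending.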
Navigation Integration
Updated the Ship's Papers dropdown menu in queenofsandiego.com/index.html to include a "Technical Blog" link pointing to tech.queenofsandiego.com. Similar navigation updates were applied to the other domain properties so that each main site links to its tech blog.
Key Architecture Decisions
Why Stop Hook vs. Scheduled Jobs
Chose a session-end Stop hook over cron-based jobs because work happens in discrete, bounded sessions. The hook captures the exact context of what was accomplished without needing background monitoring or polling mechanisms. This ensures blog posts are generated immediately after work completes, not hours later.
Why Multiple S3 Buckets
Each domain property has its own S3 bucket and CloudFront distribution to ensure:
- Blast radius containment — issues with one property don't affect others
- Separate access control and audit trails
- Independent scaling and performance optimization
- Clear cost attribution per property
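Keeping one bucket per property means the generator has to decide where each post belongs. A minimal sketch of that routing step follows. The pairing of bucket names to domains is inferred from the names and is an assumption. burialsatseasandiego.com is left out because no bucket name is given for it. The function names and the "most mentions wins" heuristic are hypothetical stand-ins for whatever context detection the generator really uses.

```python
from typing import Dict, Optional

# Assumed pairing of property domains to the bucket names listed earlier.
PROPERTY_BUCKETS: Dict[str, str] = {
    "queenofsandiego.com": "qos-sites",
    "sailjada.com": "jada-sites",
    "dangerouscentaur.com": "dc-sites",
}


def detect_property(text: str) -> Optional[str]:
    """Return the domain mentioned most often in the text, or None."""
    lowered = text.lower()
    counts = {domain: lowered.count(domain) for domain in PROPERTY_BUCKETS}
    # On a tie, max() keeps the first domain in dictionary order.
    best = max(counts, key=lambda domain: counts[domain])
    return best if counts[best] > 0 else None


def route_post(text: str) -> Optional[Dict[str, str]]:
    """Map session text to the target site and bucket, if any match."""
    domain = detect_property(text)
    if domain is None:
        return None
    return {
        "domain": domain,
        "site": f"tech.{domain}",
        "bucket": PROPERTY_BUCKETS[domain],
    }
```

Returning None for unmatched sessions lets the caller skip publishing instead of guessing a destination.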
Why Transcript-Based Generation
Claude Code session transcripts provide authoritative, chronological records of everything that happened — file modifications, commands executed, tools invoked. This is more reliable than trying to infer work from git logs or audit logs, which may not capture all activities. The JSONL format is machine-parseable and includes rich context.
Why Sanitization Layers
Blog content is public-facing. The generator includes multiple sanitization passes:
- Regex-based redaction of common credential patterns (API keys, tokens, passwords)
- Explicit removal of sensitive file paths (credentials files, private keys)
- Masking of AWS account IDs and personal data in command examples
- Manual review checkpoints before publication
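The regex-based redaction pass could be sketched like this. The patterns are illustrative assumptions, not the generator's actual rules: one for AWS access key IDs, one for key/value pairs whose name contains a credential-like word, and one for bare 12-digit numbers treated as AWS account IDs. They deliberately over-redact, since a false positive in a public post is cheaper than a leaked secret.

```python
import re
from typing import List, Pattern, Tuple

REDACTION_PATTERNS: List[Tuple[Pattern[str], str]] = [
    # AWS access key IDs: "AKIA" or "ASIA" followed by 16 uppercase alphanumerics.
    (re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"), "[REDACTED_AWS_KEY]"),
    # NAME=value or NAME: value where NAME contains a credential-like word.
    (
        re.compile(
            r"(?i)([\w-]*(?:api[_-]?key|token|secret|password)[\w-]*)(\s*[=:]\s*)\S+"
        ),
        r"\1\2[REDACTED]",
    ),
    # Bare 12-digit numbers, treated here as AWS account IDs.
    (re.compile(r"\b\d{12}\b"), "[REDACTED_ACCOUNT_ID]"),
]


def sanitize(text: str) -> str:
    """Apply each redaction pattern in order and return the cleaned text."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Pattern matching only catches known shapes, which is why the manual review checkpoint listed above still matters.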
Blog Post Content Structure
Generated posts follow this template:
<h2>Specific Work Title</h2>
<h3>What Was Done</h3>
<h3>Technical Details</h3>
<h4>Subsections with exact file paths, function names, resource IDs</h4>
<h3>Infrastructure Changes</h3>
<ul>Exact AWS resource names, S3 bucket paths, CloudFront dist IDs</ul>
<h3>Key Decisions</h3>
<h3>Commands Used</h3>
<pre><code>Sanitized example commands</code></pre>
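Filling part of this template from extracted data might look like the sketch below. It covers only three of the sections and assumes the inputs have already been sanitized. The function name and parameters are hypothetical. The one point it is meant to show is that every interpolated value is HTML-escaped, so a command containing < or & cannot break the page markup.

```python
from html import escape
from typing import Sequence


def render_post(title: str, summary: str, commands: Sequence[str]) -> str:
    """Render a partial post body following the template headings."""
    command_text = "\n".join(commands)
    return "\n".join(
        [
            f"<h2>{escape(title)}</h2>",
            "<h3>What Was Done</h3>",
            f"<p>{escape(summary)}</p>",
            "<h3>Commands Used</h3>",
            # Escape commands too: shell redirects and pipes contain < > &.
            f"<pre><code>{escape(command_text)}</code></pre>",
        ]
    )
```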
Current Deployment Status
- tech.queenofsandiego.com — Live on CloudFront (