Building a Multi-Site Technical Blog System with Auto-Generated Session Transcripts
What Was Done
Implemented an automated technical blog generation system that captures granular development work across four domains (queenofsandiego.com, sailjada.com, dangerouscentaur.com, and burialsatseasandiego.com) and publishes timestamped posts to dedicated tech subdomains. The system hooks into Claude Code's session management and extracts session data, file modifications, and command execution records to generate technical posts without exposing credentials or other sensitive data.
Technical Architecture
Core Components
- Session Capture: Claude Code's Stop hook (/Users/cb/.claude/hooks/tech_blog_stop.sh) executes at session end, triggering the blog generation pipeline
- Blog Generator: /Users/cb/Documents/repos/tools/tech_blog_generator.py parses session JSONL transcripts, extracts tool use and file modifications, and generates HTML articles
- Infrastructure Init: /Users/cb/Documents/repos/tools/tech_blog_init.py provisions cloud resources for each tech blog domain
- Credential Sanitizer: Regex-based content filtering removes AWS credentials, API keys, tokens, and personal data before publication
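The hand-off from the Stop hook to the generator might look roughly like the sketch below. It assumes the hook receives a JSON payload on stdin that includes a `transcript_path` field; that field name and the `parse_hook_payload` helper are illustrative assumptions, not taken from the actual hook script.

```python
import json
from pathlib import Path
from typing import Optional


def parse_hook_payload(raw: str) -> Optional[Path]:
    """Return the transcript path named in a hook payload, or None if unusable.

    Assumption: the payload is a JSON object with a "transcript_path" string.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None  # malformed payload: do nothing rather than fail the hook
    if not isinstance(payload, dict):
        return None
    transcript = payload.get("transcript_path")
    if not isinstance(transcript, str) or not transcript:
        return None
    return Path(transcript).expanduser()
```

Returning None on any malformed input keeps a broken payload from interrupting the end of a session; the caller can simply skip blog generation in that case.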
Cloud Infrastructure
Each tech blog runs on identical infrastructure:
- S3 Buckets: Named tech-{domain} (e.g., tech-queenofsandiego-com, tech-sailjada-com, tech-dangerouscentaur-com, tech-burialsatseasandiego-com)
- CloudFront Distributions: Cache-enabled with 1-hour TTL for HTML, 1-year for static assets
- DNS: Route53 for queenofsandiego.com and sailjada.com subdomains; Namecheap CNAME for dangerouscentaur; GoDaddy CNAME for burialsatseasandiego
- SSL/TLS: Wildcard ACM certificates (*.queenofsandiego.com, *.sailjada.com) with automated DNS validation
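The bucket naming convention listed above can be expressed as a one-line mapping. The helper name `tech_bucket_name` is hypothetical; only the resulting names come from the list.

```python
def tech_bucket_name(domain: str) -> str:
    """Map a site domain to its tech blog bucket name, e.g. a.com -> tech-a-com."""
    return "tech-" + domain.strip().lower().replace(".", "-")


DOMAINS = [
    "queenofsandiego.com",
    "sailjada.com",
    "dangerouscentaur.com",
    "burialsatseasandiego.com",
]
BUCKETS = {domain: tech_bucket_name(domain) for domain in DOMAINS}
```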
Implementation Details
Session Transcript Parsing
The blog generator reads Claude Code's session transcript files (stored as JSONL in ~/.claude/sessions) and extracts:
- User request context and goals
- Tool invocations with command names and arguments
- File write/edit operations with exact paths
- Commands executed in the local shell
- Agent reasoning and decision logs
Example extraction from transcript:
Files modified/created:
- Write: /Users/cb/Documents/repos/tools/tech_blog_generator.py
- Edit: /Users/cb/.claude/settings.json
- Write: /Users/cb/.claude/hooks/tech_blog_stop.sh
Commands run:
- Create S3, CloudFront, and DNS for qos, jada, and dc tech blogs
- Test the blog generator on the current session transcript
- Check CloudFront deployment status and DNS propagation
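A minimal sketch of this extraction step follows. The record layout is an assumption made for illustration: each JSONL line is taken to be an object whose `message.content` list holds `tool_use` blocks with a `name` and an `input` dict. The real transcript schema and the generator's own function names may differ.

```python
import json
from typing import Dict, Iterable, List


def summarize_transcript(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect file operations and shell command descriptions from JSONL lines."""
    files: List[str] = []
    commands: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines instead of aborting
        message = record.get("message") if isinstance(record, dict) else None
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue
        for block in content:
            if not isinstance(block, dict) or block.get("type") != "tool_use":
                continue
            name = block.get("name")
            args = block.get("input")
            if not isinstance(args, dict):
                continue
            if name in ("Write", "Edit") and "file_path" in args:
                files.append(f"{name}: {args['file_path']}")
            elif name == "Bash":
                # Prefer the human-readable description over the raw command.
                commands.append(args.get("description") or args.get("command", ""))
    return {"files": files, "commands": commands}
```

Preferring the description over the raw command matches the example output above, where commands appear as short summaries rather than literal shell lines.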
Credential Sanitization
The generator applies a multi-layered sanitization pipeline:
- Redacts AWS access keys, secret keys, and session tokens (40+ character alphanumeric patterns)
- Masks API keys, Bearer tokens, and authorization headers
- Removes PII: phone numbers, email addresses (except generic examples)
- Filters environment variable values while preserving variable names for context
- Strips database connection strings and private key material
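The layers above could be approximated with an ordered list of regex rules, as in the sketch below. The specific patterns, the `[REDACTED]` placeholders, and the `example.com` allowlist are assumptions for illustration; the generator's actual rules are not shown in this post.

```python
import re

REDACTED = "[REDACTED]"


def _mask_email(match):
    """Keep generic example.com addresses; mask everything else."""
    address = match.group(0)
    return address if address.lower().endswith("@example.com") else "[REDACTED_EMAIL]"


# Order matters: specific patterns run before the generic long-token rule.
RULES = [
    (re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
                re.DOTALL), REDACTED),
    (re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE), "Bearer " + REDACTED),
    (re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"), REDACTED),  # AWS access key IDs
    # Environment assignments: keep the variable name, drop the value.
    (re.compile(r"^([A-Z][A-Z0-9_]*)=(.+)$", re.MULTILINE), r"\1=" + REDACTED),
    # Generic 40+ character tokens (secret keys, session tokens).
    (re.compile(r"(?<![A-Za-z0-9/+=])[A-Za-z0-9/+=]{40,}(?![A-Za-z0-9/+=])"), REDACTED),
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), _mask_email),
]


def sanitize(text: str) -> str:
    """Apply every redaction rule in order and return the cleaned text."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

The generic long-token rule is deliberately broad, so it can also redact harmless strings such as very long slash-separated paths. For published transcripts that trade-off favors over-redaction.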
HTML Generation
The generator creates semantically structured HTML articles:
- Metadata: Publication timestamp, session ID, domain association
- Structured Sections: What Was Done, Technical Details, Infrastructure Changes, Key Decisions
- Code Examples: Command-line examples with syntax highlighting via inline <pre><code> blocks
- Navigation: Breadcrumbs to parent site, previous/next post links
- Search Optimization: Semantic HTML5, descriptive headings, alt text on diagrams
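As an illustration of the article structure described above, the following sketch renders metadata and sections into escaped HTML. The function name `render_article`, its parameters, and the exact markup are hypothetical; the point is that all transcript-derived text passes through `html.escape` before it reaches the page.

```python
from datetime import datetime, timezone
from html import escape
from typing import List, Optional, Tuple


def render_article(title: str, session_id: str,
                   sections: List[Tuple[str, str]],
                   published: Optional[datetime] = None) -> str:
    """Render a post as an <article> with one <section> per (heading, body) pair."""
    published = published or datetime.now(timezone.utc)
    stamp = published.isoformat()
    parts = [
        "<article>",
        f"  <h1>{escape(title)}</h1>",
        f'  <time datetime="{stamp}">{stamp}</time>',
        f'  <p class="session">Session {escape(session_id)}</p>',
    ]
    for heading, body in sections:
        parts.append("  <section>")
        parts.append(f"    <h2>{escape(heading)}</h2>")
        # Bodies are treated as preformatted text such as command output.
        parts.append(f"    <pre><code>{escape(body)}</code></pre>")
        parts.append("  </section>")
    parts.append("</article>")
    return "\n".join(parts)
```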
Infrastructure Provisioning
S3 Bucket Configuration
Each bucket is created with:
- Static website hosting enabled (index.html as root)
- Public read access via bucket policy for CloudFront origin
- Versioning enabled for recovery of previous posts
- Block public access disabled only for CloudFront OAI (Origin Access Identity)
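For reference, a bucket policy that grants read access to a CloudFront Origin Access Identity generally takes the shape sketched below. The post does not show the policy that was actually applied, so the statement ID, the helper name `oai_read_policy`, and the placeholder OAI ID are assumptions.

```python
import json


def oai_read_policy(bucket: str, oai_id: str) -> str:
    """Build a JSON bucket policy letting one CloudFront OAI read all objects."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontRead",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": {
                    "AWS": "arn:aws:iam::cloudfront:user/"
                           f"CloudFront Origin Access Identity {oai_id}"
                },
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# EXAMPLEOAIID is a placeholder, not a real identity.
print(oai_read_policy("tech-sailjada-com", "EXAMPLEOAIID"))
```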
CloudFront Distribution Setup
Distributions are configured with:
- Origin: S3 bucket as origin domain
- Caching: 3600 seconds (1 hour) TTL for HTML; 31536000 (1 year) for images/assets
- SSL/TLS: HTTPS only, matching wildcard certificate for parent domain
- Compression: Automatic gzip/brotli compression for text-based content
- Error Pages: Custom 404 page pointing to blog index
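One way to realize the two TTL tiers is to set a Cache-Control header per object at upload time, chosen by file extension. This is a sketch under that assumption; the distributions may instead set TTLs in their cache behaviors, and the suffix list here is illustrative.

```python
from pathlib import PurePosixPath

HTML_TTL = 3600        # 1 hour, as listed above
ASSET_TTL = 31536000   # 1 year, as listed above

# Assumed set of "static asset" extensions.
ASSET_SUFFIXES = {".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp", ".woff2"}


def cache_control_for(key: str) -> str:
    """Return a Cache-Control value for an object key based on its extension."""
    suffix = PurePosixPath(key).suffix.lower()
    ttl = ASSET_TTL if suffix in ASSET_SUFFIXES else HTML_TTL
    return f"public, max-age={ttl}"
```

Defaulting unknown extensions to the short TTL means a misclassified file is refreshed within the hour rather than pinned for a year.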
DNS Configuration
Domain-specific DNS setups:
- queenofsandiego.com & sailjada.com: Route53 alias records pointing to CloudFront distribution domains
- dangerouscentaur.com: Namecheap CNAME records (dangerouscentaur uses wildcard CloudFront distribution E2Q4UU71SRNTMB with dc-sites S3 bucket)
- burialsatseasandiego.com: GoDaddy DNS CNAME records with automated ACM DNS validation
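For the two Route53-hosted domains, an alias record pointing at a CloudFront distribution can be described with a change batch like the one sketched below. The helper name and the example distribution domain are hypothetical; `Z2FDTNDATAQYW2` is the fixed hosted zone ID that Route53 uses for CloudFront alias targets.

```python
CLOUDFRONT_HOSTED_ZONE_ID = "Z2FDTNDATAQYW2"  # constant for all CloudFront alias targets


def alias_change_batch(record_name: str, distribution_domain: str) -> dict:
    """Build a Route53 change batch that upserts an A alias to a distribution."""
    return {
        "Comment": f"Alias {record_name} to CloudFront",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": CLOUDFRONT_HOSTED_ZONE_ID,
                        "DNSName": distribution_domain,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ],
    }


# d111111abcdef8.cloudfront.net is a placeholder distribution domain.
batch = alias_change_batch("tech.sailjada.com.", "d111111abcdef8.cloudfront.net.")
```

A dict of this shape is what boto3's Route53 `change_resource_record_sets` call accepts as its `ChangeBatch` argument, alongside the `HostedZoneId` of the parent domain's zone.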
Integration with Site Navigation
The "Ship's Papers" dropdown menu in each site's index.html now includes a "Technical Blog" link pointing to the site's tech subdomain (for example, tech.queenofsandiego.com). This gives Sergio and other stakeholders direct visibility into development progress and architectural decisions.
Key Decisions
Why Auto-Generation from Session Transcripts?
Capturing work at the session level means development activity is recorded as it happens rather than reconstructed afterward. Manual blog posts would depend on the discipline to write them and would inevitably have gaps. The session-based approach is automatic and granular: every file change, executed command, and decision recorded in the transcript is documented.
Why Separate Tech Blogs per Domain?
Each property (Queen of San Diego, Sail Jada, Dangerous Centaur, Burials at Sea) has independent operations, revenue models, and stakeholder groups. Separate technical blogs allow:
- Domain-specific audience (Sergio sees QOS work, other operators see their own site work)