Building a Multi-Site Technical Blog Infrastructure with Automated Session Capture
Overview
This session involved architecting and implementing an automated technical documentation system across four independent sites: queenofsandiego.com, sailjada.com, dangerouscentaur.com, and burialsatseasandiego.com. The goal was to create granular, real-time technical blog posts that capture every development session, infrastructure change, and code modification, giving stakeholders like Sergio complete transparency into the detailed technical work without exposing credentials or sensitive data.
Architecture: The Four-Part System
The solution consists of four integrated components:
- Session Hook: A Claude Code stop hook that triggers at the end of each development session
- Blog Generator: A Python script that parses session transcripts and generates HTML blog posts
- Infrastructure Init: Automation to provision S3, CloudFront, Route53, and ACM resources for each tech blog domain
- Navigation Integration: Links added to the Ship's Papers menu on each main site
Infrastructure Provisioning
Each of the four tech blogs required identical infrastructure patterns:
- S3 Bucket: Named tech-[site]-content (e.g., tech-qos-content, tech-jada-content) to store generated HTML posts, with block public access disabled and versioning enabled
- CloudFront Distribution: Pointing to each S3 bucket as origin, with index.html as the default root object and caching rules for /posts/*.html
- ACM Certificates: Leveraged existing wildcard certificates (*.queenofsandiego.com, *.sailjada.com) for TLS encryption without provisioning additional certificates
- DNS Records: CNAME entries created at each domain's registrar (Route53 for queenofsandiego.com and sailjada.com; Namecheap for dangerouscentaur.com; GoDaddy for burialsatseasandiego.com)
The infrastructure script /Users/cb/Documents/repos/tools/tech_blog_init.py automates this process, creating CloudFormation-equivalent resources via AWS SDK calls for Route53-managed zones and direct AWS API calls for S3/CloudFront.
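As a rough sketch of the S3 portion of that provisioning, the logic might look like the following. The function names and the site-slug convention are illustrative, not the actual contents of tech_blog_init.py; boto3 and AWS credentials are required for the provisioning call itself.

```python
def bucket_name(site_slug: str) -> str:
    """Derive the bucket name from a short site slug, e.g. "qos" -> "tech-qos-content"."""
    return f"tech-{site_slug}-content"


def provision_bucket(site_slug: str, region: str = "us-east-1") -> str:
    """Create a tech-blog bucket with versioning enabled and public access allowed.

    Hypothetical sketch of one step in tech_blog_init.py; requires boto3
    and valid AWS credentials, so it is not exercised in tests.
    """
    import boto3  # imported lazily so the pure helper above has no AWS dependency

    s3 = boto3.client("s3", region_name=region)
    name = bucket_name(site_slug)

    kwargs = {"Bucket": name}
    if region != "us-east-1":  # us-east-1 rejects an explicit LocationConstraint
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    s3.create_bucket(**kwargs)

    s3.put_bucket_versioning(
        Bucket=name, VersioningConfiguration={"Status": "Enabled"}
    )
    s3.put_public_access_block(
        Bucket=name,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": False,
            "IgnorePublicAcls": False,
            "BlockPublicPolicy": False,
            "RestrictPublicBuckets": False,
        },
    )
    return name
```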
The Session Hook Mechanism
Claude Code's settings.json was modified to register a stop hook at /Users/cb/.claude/hooks/tech_blog_stop.sh. This shell script executes whenever a development session ends, triggering:
#!/bin/bash
python3 /Users/cb/Documents/repos/tools/tech_blog_generator.py \
  --session-transcript "$CLAUDE_SESSION_TRANSCRIPT" \
  --output-dir "/Users/cb/Documents/repos/sites/tech-blogs" \
  --site "$CLAUDE_SITE_CONTEXT"
The hook passes the session transcript (available as a JSONL file in Claude's session directory) to the blog generator, which parses tool use entries, file modifications, and commands executed during the session.
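A minimal transcript parser along these lines could look like this. The field names ("type", "timestamp", "tool", "input") are an assumed schema for illustration only; the actual JSONL layout of Claude Code transcripts may differ.

```python
import json


def extract_events(transcript_lines):
    """Pull (timestamp, tool, input) tuples out of a JSONL transcript.

    Assumes each line is a JSON object with a "type" field, where
    tool invocations are tagged "tool_use" -- a hypothetical schema.
    """
    events = []
    for raw in transcript_lines:
        raw = raw.strip()
        if not raw:  # skip blank lines between records
            continue
        entry = json.loads(raw)
        if entry.get("type") == "tool_use":
            events.append(
                (entry.get("timestamp"), entry.get("tool"), entry.get("input", {}))
            )
    return events
```

Because JSONL is line-delimited, the generator can stream a large transcript one event at a time rather than loading it whole.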
Blog Generator: From Transcript to HTML
The blog generator (tech_blog_generator.py) performs several key operations:
- Transcript Parsing: Reads JSONL-formatted Claude session transcripts, extracting metadata (timestamp, tool use calls, text outputs)
- Credential Filtering: Strips passwords, API keys, tokens, AWS secret keys, and sensitive personal data using regex patterns and a blocklist
- File Change Tracking: Identifies modified/created files and organizes them by category (infrastructure, application code, configuration)
- Command Extraction: Captures shell commands executed and their high-level purpose
- HTML Generation: Constructs semantic HTML with <h2>, <h3>, <ul>, <code>, and <pre> tags for readability
- S3 Upload: Pushes generated HTML to the appropriate tech blog S3 bucket with Content-Type: text/html
- CloudFront Invalidation: Issues cache invalidations for /posts/* so fresh content is served immediately
The generator maintains a posts/index.html that chronologically lists all generated posts with links and summaries.
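The upload and invalidation steps can be sketched as follows. The post_key naming convention and function names are hypothetical; the call shapes are the standard boto3 S3 and CloudFront client APIs.

```python
import time


def post_key(date_str: str, slug: str) -> str:
    """Build the S3 object key for a post, e.g. "posts/2024-05-01-session.html".

    The date-plus-slug convention is illustrative, not the generator's
    documented scheme.
    """
    return f"posts/{date_str}-{slug}.html"


def publish(html: str, bucket: str, key: str, distribution_id: str) -> None:
    """Upload a rendered post and invalidate the CloudFront /posts/* path.

    Requires boto3 and AWS credentials, so it is not exercised in tests.
    """
    import boto3  # lazy import: the pure helper above needs no AWS dependency

    boto3.client("s3").put_object(
        Bucket=bucket,
        Key=key,
        Body=html.encode("utf-8"),
        ContentType="text/html",  # so browsers render rather than download
    )
    boto3.client("cloudfront").create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch={
            "Paths": {"Quantity": 1, "Items": ["/posts/*"]},
            "CallerReference": str(time.time()),  # must be unique per request
        },
    )
```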
Domain-Specific Configurations
Each site required slightly different DNS handling:
- queenofsandiego.com & sailjada.com: Managed via Route53 hosted zones in the primary AWS account; CNAME records created directly via boto3
- dangerouscentaur.com: Registered at Namecheap; CNAME manually configured to point to the CloudFront distribution endpoint
- burialsatseasandiego.com: Registered at GoDaddy; ACM certificate DNS validation record added via GoDaddy API; CNAME configured for tech blog subdomain
All four distributions use identical cache behaviors: 3600-second TTL for HTML, 86400-second TTL for assets, and gzip compression enabled.
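For the Route53-managed domains, the boto3 CNAME upsert can be sketched like this. The zone ID, record name, and CloudFront target below are placeholders, and the helper names are hypothetical; the change-batch shape follows the standard Route53 change_resource_record_sets API.

```python
def cname_change_batch(record_name: str, target: str, ttl: int = 300) -> dict:
    """Build a Route53 change batch that UPSERTs a single CNAME record."""
    return {
        "Changes": [
            {
                "Action": "UPSERT",  # create the record, or overwrite if it exists
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": target}],
                },
            }
        ]
    }


def upsert_cname(zone_id: str, record_name: str, target: str) -> None:
    """Apply the change batch (requires boto3 and AWS credentials)."""
    import boto3  # lazy import keeps the batch builder dependency-free

    boto3.client("route53").change_resource_record_sets(
        HostedZoneId=zone_id,
        ChangeBatch=cname_change_batch(record_name, target),
    )
```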
Navigation Integration
The Ship's Papers menu on each main site (queenofsandiego.com/index.html, etc.) was updated to include a "Technical Blog" link under a new "Development" submenu. This allows stakeholders like Sergio to easily access the technical documentation without navigating to a separate domain.
Security & Credential Sanitization
A critical requirement was ensuring no credentials, API keys, or sensitive data appeared in published posts. The generator implements multi-layer filtering:
- Regex patterns to detect AWS keys, API tokens, passwords in command examples
- Redaction of file paths containing .env, .claude/credentials, or similar sensitive directories
- Manual review triggers if certain patterns are detected (GoDaddy API references, AWS credentials in environment variables)
- Replacement of actual credentials with placeholder text like [REDACTED_API_KEY] or [AWS_SECRET_KEY]
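A minimal version of this filtering might look like the following. The patterns and placeholder strings are illustrative, not the generator's actual regex set or blocklist.

```python
import re

# Illustrative patterns only -- a real blocklist would cover many more formats.
PATTERNS = [
    # AWS access key IDs have a fixed "AKIA" prefix and 16 uppercase/digit chars.
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED_AWS_ACCESS_KEY]"),
    # key=value / key: value assignments for common secret-bearing names.
    (
        re.compile(r"(?i)\b(aws_secret_access_key|api[_-]?key|password|token)\b(\s*[=:]\s*)\S+"),
        r"\1\2[REDACTED_API_KEY]",
    ),
]


def sanitize(text: str) -> str:
    """Apply every redaction pattern to a chunk of transcript text."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Running every pattern over every chunk is cheap at blog-post scale, and keeping the patterns in one list makes the blocklist easy to extend.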
Key Decisions & Rationale
Why CloudFront instead of direct S3 hosting? CloudFront provides geographic distribution, automatic gzip compression, and HTTP/2 support. More importantly, it allows using wildcard certificates without managing individual domain certs for each tech blog subdomain.
Why a Stop hook instead of continuous monitoring? Development sessions are natural atomic units of work. Capturing at session boundaries ensures posts are coherent, well-structured narratives rather than fragmented real-time logs. This also reduces noise and makes posts more readable for technical review.
Why separate S3 buckets per site? This maintains clear separation of concerns, allows per-site access controls, and enables independent scaling if one tech blog grows significantly in traffic.
Why JSONL session transcripts? Claude Code stores sessions in JSONL format natively, so the generator can consume them directly without an export step. Each line is a self-contained JSON event, which makes streaming parsing, per-entry credential filtering, and incremental processing straightforward.